Who knew that the file format we use daily to store our images is not just a file format, but also a state-of-the-art compression technique. Every multimedia enthusiast has noticed that saving the same PNG image in JPEG format results in a smaller file, i.e. the image gets compressed. The degree of compression is adjustable, allowing a selectable trade-off between storage size and image quality; JPEG typically achieves 10:1 compression with little perceptible loss in image quality. The term "JPEG" is an acronym for the Joint Photographic Experts Group, which created the standard. It is the most common format for storing and transmitting photographic images on the World Wide Web.
Image compression techniques are of 2 types :-
(1). Lossy
(2). Loss-less.
It is the Lossy technique that is more often preferred, as it gives better compression ratios for very little loss in clarity.
The compression is achieved by discarding less important data and focusing on the important parameters, like :-
(1). Color-space transformation :-
The image is first converted from RGB into the Y'CbCr color space: a luma channel Y' representing brightness, and two chroma channels Cb and Cr representing color. This kind of color-space conversion enables greater compression without any perceptual change in image quality. The compression is more efficient because the brightness information, which is more important to the eventual perceptual quality of the image, is confined to a single channel. This more closely corresponds to the perception of color in the human visual system. The color transformation also improves compression by statistical de-correlation.
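As a quick sketch of what that conversion looks like, here is the per-pixel RGB to Y'CbCr mapping in Python (the coefficients are the standard BT.601 full-range ones used by JFIF; the function name is just for illustration):

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to Y'CbCr (JFIF / BT.601 full-range)."""
    # Y' gathers all the brightness information into a single channel.
    y  =  0.299    * r + 0.587    * g + 0.114    * b
    # Cb and Cr carry only color difference, offset so they stay in [0, 255].
    cb = -0.168736 * r - 0.331264 * g + 0.5      * b + 128
    cr =  0.5      * r - 0.418688 * g - 0.081312 * b + 128
    return round(y), round(cb), round(cr)
```

Note how a gray pixel (equal R, G, B) puts everything into Y' and leaves both chroma channels at their neutral value of 128 — exactly the de-correlation the text describes.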
The remaining steps involved in converting an image into JPEG format are :-
(2). Down-sampling
Since the eye is less sensitive to fine color detail than to fine brightness detail, the chroma channels are usually down-sampled, commonly by a factor of 2 in each direction (the 4:2:0 scheme), while the luma channel is kept at full resolution.
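A minimal sketch of 4:2:0 chroma down-sampling — averaging each 2×2 group of chroma samples into one (illustrative code, not from the original post):

```python
def downsample_420(chroma):
    """Average each 2x2 group of a chroma plane (4:2:0 subsampling sketch).

    Assumes the plane's height and width are even.
    """
    h, w = len(chroma), len(chroma[0])
    return [[(chroma[y][x] + chroma[y][x + 1]
              + chroma[y + 1][x] + chroma[y + 1][x + 1]) / 4
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]
```

This alone throws away 75% of the samples in each chroma plane, yet the perceptual impact is small — which is why it is such a cheap win.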
(3). Block-Splitting
After sub-sampling, each channel is split into 8×8 blocks of pixels.
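The block-splitting step can be sketched like this (assuming, for simplicity, that the channel's dimensions are multiples of 8; real encoders pad the edges):

```python
def split_into_blocks(channel, n=8):
    """Split a 2-D list (H x W, both multiples of n) into n x n blocks, row-major."""
    h, w = len(channel), len(channel[0])
    blocks = []
    for by in range(0, h, n):          # walk block rows top to bottom
        for bx in range(0, w, n):      # walk block columns left to right
            blocks.append([row[bx:bx + n] for row in channel[by:by + n]])
    return blocks
```

Each of these 8×8 blocks is then transformed independently in the next step.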
(4). Discrete cosine transform
Next, each 8×8 block of each component (Y, Cb, Cr) is converted to a frequency-domain representation, using a normalised, two-dimensional type-II discrete cosine transform (DCT). Before computing the DCT of the 8×8 block, its values are shifted from a positive range to one centred around zero. For an 8-bit image, each entry in the original block falls in the range [0, 255]. The mid-point of the range (in this case, the value 128) is subtracted from each entry to produce a data range that is centred around zero, so that the modified range is [−128, 127]. This step reduces the dynamic range requirements in the DCT processing stage that follows. (Aside from the difference in dynamic range within the DCT stage, this step is mathematically equivalent to subtracting 1024 from the DC coefficient after performing the transform — which may be a better way to perform the operation on some architectures since it involves performing only one subtraction rather than 64 of them.)
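Here is a direct (unoptimised) Python sketch of the normalised 2-D type-II DCT described above — real encoders use fast factorisations, but this naive version shows exactly what is being computed:

```python
import math

def dct2d(block):
    """Normalised 2-D type-II DCT of an 8x8 block (values already shifted by -128)."""
    n = 8
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            # Normalisation factors: sqrt(1/8) for the zero frequency,
            # sqrt(2/8) otherwise, making the transform orthonormal.
            cu = math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)
            cv = math.sqrt(1 / n) if v == 0 else math.sqrt(2 / n)
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = cu * cv * s
    return out
```

As a sanity check: an all-black block (every pixel 0, so −128 after the level shift) transforms to a DC coefficient of exactly −1024 with every AC coefficient zero — precisely the −1024 offset mentioned in the parenthetical above.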
(5). Quantisation
The human eye is good at seeing small differences in brightness over a relatively large area, but not so good at distinguishing the exact strength of a high frequency brightness variation. This allows one to greatly reduce the amount of information in the high frequency components. This is done by simply dividing each component in the frequency domain by a constant for that component, and then rounding to the nearest integer. This rounding operation is the only lossy operation in the whole process if the DCT computation is performed with sufficiently high precision. As a result of this, it is typically the case that many of the higher frequency components are rounded to zero, and many of the rest become small positive or negative numbers, which take many fewer bits to represent.
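The divide-and-round operation is tiny in code. The sketch below uses the example luminance quantisation table from Annex K of the JPEG standard (encoders may use any table they like and scale it with the quality setting):

```python
# Example luminance quantisation table (JPEG standard, Annex K), row-major.
LUMA_QT = [
    16, 11, 10, 16,  24,  40,  51,  61,
    12, 12, 14, 19,  26,  58,  60,  55,
    14, 13, 16, 24,  40,  57,  69,  56,
    14, 17, 22, 29,  51,  87,  80,  62,
    18, 22, 37, 56,  68, 109, 103,  77,
    24, 35, 55, 64,  81, 104, 113,  92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103,  99,
]

def quantize(dct_coeffs, table):
    """Divide each frequency coefficient by its table entry and round.

    This rounding is the lossy step: small high-frequency coefficients
    (which have large divisors) collapse to zero.
    """
    return [round(c / q) for c, q in zip(dct_coeffs, table)]
```

Note how the divisors grow toward the bottom-right (high frequencies), so exactly the components the eye is worst at seeing are the ones rounded away.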
(6). Entropy encoding :-
This is the step where the quantised coefficients are arranged in a zig-zag order (grouping the many trailing zeros together), runs of zeros are run-length encoded, and the result is compressed losslessly using Huffman coding.
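The zig-zag scan and run-length step that feed the Huffman coder can be sketched like this (a simplified illustration — a real JPEG entropy coder also splits values into size categories before Huffman coding):

```python
def zigzag_rle(block):
    """Zig-zag scan an 8x8 block of quantised coefficients and run-length
    encode the AC coefficients as (zero_run, value) pairs."""
    n = 8
    # Zig-zag order: walk anti-diagonals, alternating direction each time.
    order = sorted(((x, y) for x in range(n) for y in range(n)),
                   key=lambda p: (p[0] + p[1],
                                  p[0] if (p[0] + p[1]) % 2 else -p[0]))
    coeffs = [block[x][y] for x, y in order]
    pairs, run = [], 0
    for c in coeffs[1:]:               # AC coefficients only; DC is coded separately
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    pairs.append((0, 0))               # end-of-block marker
    return coeffs[0], pairs
```

After quantisation most blocks are almost entirely zeros, so this typically reduces 64 coefficients to a handful of pairs before Huffman coding even begins.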
Sighhhh.......... you would never have thought that this many steps take place just while saving an image.
To reconstruct the image from this mathematical data, the transformations are applied in reverse: the encoded data passes through the Huffman decoder, is de-quantised, the inverse discrete cosine transform is applied, and so on and so forth.
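A tiny sketch of why the reconstruction is not exact: the decoder can only multiply each quantised level back by its table entry, so whatever the rounding threw away is gone for good (hypothetical one-coefficient example):

```python
def quantize(coeff, step):
    """Forward quantisation: the only lossy arithmetic in the pipeline."""
    return round(coeff / step)

def dequantize(level, step):
    """Inverse quantisation performed by the decoder."""
    return level * step

# A coefficient of 100 with quantisation step 16 becomes level 6,
# which the decoder restores as 96 — the rounding loss is permanent.
restored = dequantize(quantize(100, 16), 16)
```

This single round trip is the entire reason JPEG is classified as a lossy format: every other stage is, in principle, perfectly invertible.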