Which are the main blocks that make JPEG2000 work?
What does ‘image compression’ mean?
Why do we need a lower bit rate?
What does ‘quantization’ mean?
HVS: what is it? What is its purpose in image compression?
What is the meaning of lossy and lossless compression?
Why is a lossy compression acceptable?
When is a lossless compression preferable?
How can the compression of an image be measured?
Does the peak signal to noise ratio represent a reliable index of image quality?
What is the ‘concept of irrelevance’?
What is the embedded bit-stream?
What is the difference between an embedded and a non-embedded bit-stream?
What is the meaning of ‘higher quality image’?
MSE: what is it? Is it reliable? Is there an alternative?
How much does the PSNR improve if we codify an image using JPEG2000 instead of JPEG?
Why do we make a distinction between bit-stream and code stream?
How is the structure of an image that has to be coded organized?
What is the disadvantage of JPEG baseline compared to JPEG2000?
How does the old JPEG standard work?
How does the new standard behave with reference to this?
What is the meaning of ‘Spatial Random Access’?
Why can we say that JPEG2000 has a progressive way of working?
What about JPEG2000 performances compared to the other compression standards?
Which are the main blocks that make JPEG2000 work?
The JPEG2000 way of working
can be described using functional blocks:

[Figure: block diagram of the JPEG2000 functional blocks]
In the one-dimensional DWT an analysis block is used. It is made up of two
digital filters: a high-pass one and a low-pass one. In JPEG2000 both filters
always have an odd number of coefficients; in practice the (5,3) and (9,7)
pairs are used, where 5 is the number of low-pass coefficients and 3 is the
number of high-pass ones. The difference between the two lengths must always be
a power of two. The output of each filter is downsampled by two, which halves
the number of samples per filter (summed, they still equal the number of input
samples): these samples are called wavelet coefficients.
In order to reconstruct the signal, an upsampling step restores the discarded
samples by inserting zeros between consecutive samples; then another pair of
filters (the synthesis filters) is applied.
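The analysis and reconstruction steps described above can be sketched in a few
lines of Python. This is only a minimal sketch under stated assumptions: it
uses the lifting form of the reversible (5,3) filter pair, an even-length
input and symmetric extension at the borders. The function names are
illustrative and do not come from any library.

```python
def dwt53_forward(x):
    """One level of 1-D (5,3) analysis on an even-length list of integers."""
    n = len(x)
    assert n >= 2 and n % 2 == 0, "this sketch assumes an even-length input"
    even, odd = x[0::2], x[1::2]
    half = n // 2
    # High-pass branch: each odd sample minus the mean of its even neighbours.
    high = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]  # symmetric extension
        high.append(odd[i] - (even[i] + right) // 2)
    # Low-pass branch: each even sample corrected with the nearby details.
    low = []
    for i in range(half):
        left = high[i - 1] if i > 0 else high[0]  # symmetric extension
        low.append(even[i] + (left + high[i] + 2) // 4)
    return low, high


def dwt53_inverse(low, high):
    """Undo dwt53_forward exactly (integer arithmetic, no rounding loss)."""
    half = len(low)
    even = []
    for i in range(half):
        left = high[i - 1] if i > 0 else high[0]
        even.append(low[i] - (left + high[i] + 2) // 4)
    x = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        x.append(even[i])
        x.append(high[i] + (even[i] + right) // 2)
    return x
```

Each output list holds half as many samples as the input, so together they
contain exactly as many wavelet coefficients as there were input samples.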
In two dimensions the DWT is applied to each row and then to each column
of the image. As a result, the image is divided into four subbands identified
by their position (1LL, 1HL, 1LH and 1HH).

A recognizable version of the image remains only in the 1LL subband, which can
be transformed once more and divided into another 4 subbands: this is called
2nd level DWT. This process can be repeated until a satisfactory compression
level is reached.
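To make the repeated splitting of the LL subband concrete, the following
sketch lists the subbands produced by a multi-level decomposition together
with their sizes. It is an illustration under assumptions: the low-pass branch
is assumed to keep the extra sample when a dimension is odd, the first letter
of each label is assumed to refer to the horizontal direction, and
`subband_sizes` is a hypothetical helper name.

```python
def subband_sizes(width, height, levels):
    """Return (label, width, height) for every subband of a 2-D DWT."""
    bands = []
    w, h = width, height
    for level in range(1, levels + 1):
        # Downsampling by two in each direction splits every dimension in half.
        low_w, high_w = (w + 1) // 2, w // 2
        low_h, high_h = (h + 1) // 2, h // 2
        bands.append((f"{level}HL", high_w, low_h))
        bands.append((f"{level}LH", low_w, high_h))
        bands.append((f"{level}HH", high_w, high_h))
        # Only the LL subband is transformed again at the next level.
        w, h = low_w, low_h
    bands.append((f"{levels}LL", w, h))
    return bands
```

For a 512 x 512 image and two levels this gives seven subbands, and the total
number of coefficients stays equal to the number of pixels.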
What does ‘image compression’ mean?
Compression reduces the number of bits used to represent the coded image
below the number that would be needed to store the image in its original
format. The resulting number of bits is variable and depends on the
compression algorithm used.
JPEG2000 supports two different kinds of compression: lossy and
lossless.
Why do we need a lower bit rate?
A lower bit rate implies a lower occupation of the transmission channel
and of the storage media. Therefore a standard should provide sufficiently low
bit rates for the user’s requirements. In this respect JPEG2000 achieves much
better results than traditional JPEG.
What does ‘quantization’ mean?
Quantization maps a continuous set of values onto a discrete one;
therefore it is the main step in image digitization. Typically one byte is
used for each chromatic component.
In coding, it arises from the need to have a finite range of integers to be
coded. Both the encoder and the decoder must use a number of bits large enough
to represent the quantization factor for each subband.
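The mapping from continuous values to integers can be sketched as follows.
The example assumes a uniform scalar quantizer with a dead zone around zero,
which is the kind of quantizer JPEG2000 applies to wavelet coefficients; the
step size and the reconstruction offset of 0.5 are illustrative choices, and
the function names are hypothetical.

```python
import math


def quantize(coefficient: float, step: float) -> int:
    """Map a real coefficient to an integer index (sign and magnitude)."""
    sign = -1 if coefficient < 0 else 1
    return sign * math.floor(abs(coefficient) / step)


def dequantize(index: int, step: float, offset: float = 0.5) -> float:
    """Reconstruct a value inside the interval that produced `index`."""
    if index == 0:
        return 0.0  # everything inside the dead zone is rebuilt as zero
    sign = -1 if index < 0 else 1
    return sign * (abs(index) + offset) * step
```

The reconstructed value never differs from the original by more than one step,
but the exact original value cannot be recovered: this is where the loss of a
lossy coder comes from.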
HVS: what is it? What is its purpose in image compression?
HVS stands for Human Visual System. Its analysis and study represent a basic
step in the design of image compression standards (both for still pictures and
video sequences of any kind), in order to improve the perceived quality as
much as possible. The sensitivity of the human eye changes with respect to
the different image components (brightness level, spatial frequency, color,
contrast, temporal frequency, etc.). In the perception of gray levels, the
human eye does not behave linearly: its response is similar to a cube root.
Therefore, if we applied a linear quantization, the perceived quality would be
worse than with a quantization based on a cube root function. From a
mathematical point of view, it is as though we had two filters whose transfer
functions are the inverse of each other: putting them in cascade, we obtain a
uniform perception of gray levels. Similar considerations can be made for
every aspect of an image.
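The cascade of a cube root and its inverse can be illustrated with a small
sketch. It assumes luminance values normalized to the range [0, 1] and 256
quantization levels; both choices, as well as the function names, are
assumptions made only for this example.

```python
def encode_gray(luminance: float, levels: int = 256) -> int:
    """Quantize uniformly in the cube-root (perceptual) domain."""
    perceived = luminance ** (1.0 / 3.0)
    return round(perceived * (levels - 1))


def decode_gray(index: int, levels: int = 256) -> float:
    """Apply the inverse function (a cube) to return to linear luminance."""
    return (index / (levels - 1)) ** 3
```

With this scheme the reconstruction levels are packed closely together in the
dark range and spread out in the bright range, so that the steps look roughly
uniform to the eye.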
In an image it is highly likely that pixels
close to each other have a similar color. By exploiting this property, called
spatial redundancy, it is possible to reduce the image size considerably, using
a smaller number of bits.
What is the meaning of lossy and lossless compression?
A lossy compression involves
a loss of information compared to the original signal; therefore it is not
possible to reconstruct the original signal from one that has been compressed
by using a lossy compression algorithm. Vice versa, a lossless compression does
not involve a loss of information and allows a perfect reconstruction of the
original signal after the decompression.
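The difference between the two kinds of compression can be shown with a toy
example. It does not use JPEG2000: the lossless path relies on the
general-purpose `zlib` module of the Python standard library, and the lossy
path simply discards the least significant bits of each sample. Both are
stand-ins chosen for illustration, and the function names are hypothetical.

```python
import zlib


def lossless_roundtrip(samples: bytes) -> bytes:
    """Compress and decompress: the output is identical to the input."""
    return zlib.decompress(zlib.compress(samples))


def lossy_roundtrip(samples: bytes, dropped_bits: int = 4) -> bytes:
    """Discard the lowest bits of every sample: the detail is gone for good."""
    return bytes((s >> dropped_bits) << dropped_bits for s in samples)
```

After the lossy round trip each sample is close to the original (within
2^dropped_bits - 1), but the original values can no longer be recovered.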
Why is a lossy compression acceptable?
A lossy compression is acceptable mainly for three reasons:
- The human visual system is often able to tolerate a loss in image quality
without compromising the ability to perceive the contents of the scene.
- Very often the digital input of the lossy compression algorithm is itself an
imperfect representation of the real image. Indeed, the image is reconstructed
from a finite number of samples, which can take only discrete values.
- A lossless compression would never be able to achieve the high levels of
compression that can be obtained with a lossy compression and that are
necessary for applications in storage and/or distribution of images.
When is a lossless compression preferable?
- In medical applications, when it is difficult to choose an acceptable error
level for image representation.
- When dealing with highly structured images (typically non-natural ones) such
as text or graphics, which are usually handled more easily by lossless
compression techniques.
- In all those applications where images are frequently modified or
re-compressed: with a lossy coding, errors would accumulate in the image and
make the quality unacceptable.
How can the compression of an image be measured?
Image compression is measured by means of an index called Compression Ratio:

$$CR = \frac{N_1 \, N_2 \, B}{\|c\|}$$

where $N_1$ and $N_2$ represent the number of pixels along the horizontal and
vertical dimensions of the image, $B$ is the number of bits required to
represent each pixel, and $\|c\|$ is the length in bits of the final
bit-stream obtained after the compression.
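As a worked example of this index, the sketch below computes the compression
ratio and the corresponding bit rate in bits per pixel (bpp). The function and
parameter names are illustrative, and the image figures used in the note after
the code are made up for the example.

```python
def compression_ratio(width, height, bits_per_pixel, compressed_bits):
    """Original size in bits divided by the length of the compressed stream."""
    return (width * height * bits_per_pixel) / compressed_bits


def bit_rate_bpp(width, height, compressed_bits):
    """Average number of compressed bits spent on each pixel."""
    return compressed_bits / (width * height)
```

A 512 x 512 image with 8 bits per pixel that is compressed to 32768 bytes
(262144 bits) has a compression ratio of 8 and a bit rate of 1 bpp.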
Does the peak signal to noise ratio represent a reliable index of image quality?
The PSNR can give an approximate index of image quality, but by itself it
cannot be used to compare the quality of two different images. Indeed, an
image with a lower PSNR may be perceived as being of better quality than one
with a higher PSNR.
What is the ‘concept of irrelevance’?
The concept of irrelevance applies to information that is unnecessary for the
exact reconstruction of an image. There are two different kinds of irrelevance:
- Visual irrelevance, if the image contains details that cannot be perceived
by the human eye.
- Irrelevance due to the fact that, for specific applications (for instance
medical and military ones), only some specific regions of the image may be of
interest rather than the image as a whole. Therefore, the parts that do not
belong to the region of interest are classified as irrelevant.
A code-block is a rectangular array of coefficients belonging to the same
subband. Each code-block can be identified by four parameters: two of them
locate the starting point of the rectangle, whereas the other two identify its
size.
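The four parameters can be made concrete with a sketch that splits a subband
into code-blocks. It assumes that the partition starts at the top-left corner
of the subband and that the nominal block size is 64 x 64; the function name
and the tuple layout (x0, y0, width, height) are chosen only for this example.

```python
def code_blocks(subband_width, subband_height, block_width=64, block_height=64):
    """Return (x0, y0, width, height) for each code-block of a subband."""
    blocks = []
    for y0 in range(0, subband_height, block_height):
        for x0 in range(0, subband_width, block_width):
            # Blocks on the right and bottom edges may be smaller than nominal.
            w = min(block_width, subband_width - x0)
            h = min(block_height, subband_height - y0)
            blocks.append((x0, y0, w, h))
    return blocks
```

For a 100 x 70 subband this yields four code-blocks, the last of which is only
36 x 6 coefficients.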
DPCM stands for Differential Pulse Code Modulation. It is a predictive coding
technique: each sample is predicted from the previous samples, and the
prediction error is transmitted instead of the actual value.
DPCM is typically used before applying entropy coding, after the wavelet
transform, in order to optimize the performance of the compression algorithm.
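A minimal sketch of the idea, assuming the simplest possible predictor (the
previous sample, with an initial prediction of zero) and no quantization of
the differences; the function names are illustrative.

```python
def dpcm_encode(samples):
    """Replace each sample with its difference from the previous one."""
    residuals = []
    previous = 0  # the decoder starts from the same initial prediction
    for sample in samples:
        residuals.append(sample - previous)
        previous = sample
    return residuals


def dpcm_decode(residuals):
    """Rebuild the samples by accumulating the differences."""
    samples = []
    previous = 0
    for residual in residuals:
        previous += residual
        samples.append(previous)
    return samples
```

When neighbouring samples are similar, the differences are small numbers
clustered around zero, which an entropy coder can represent with fewer bits
than the original values.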
What is the embedded bit-stream?
The embedded bit-stream is created by organizing the coded image into
successive quality layers. The first one (Q0) codes the different code-blocks
within a prefixed length l0 while minimizing the distortion. The following
layers contain additional contributions from every code-block, which further
reduce the distortion.
Each quality layer contains the contributions of each code-block.
In order to have an embedded bit-stream that allows scalability in
distortion, it is necessary to include enough information to identify the
contributions of each code-block to each quality layer.
What is the difference between an embedded and a non-embedded bit-stream?
The most important advantage of using an embedded bit-stream rather than
a non-embedded one is that the desired compression level can be chosen after
the source has been compressed: this is called an a-posteriori approach.
It is therefore possible to achieve scalability in distortion.
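The quality-layer mechanism and the a-posteriori choice of the compression
level can be sketched as follows. The sketch is a simplification under
assumptions: each code-block is represented by an already coded byte string,
and `truncation_points[b][q]` (a hypothetical structure) gives the cumulative
number of bytes of block `b` included up to layer `q`. How those points are
chosen to minimize distortion is not shown.

```python
def build_layers(block_streams, truncation_points):
    """Split every code-block stream into its per-layer contributions."""
    num_layers = len(truncation_points[0])
    layers = []
    for q in range(num_layers):
        layer = []
        for stream, points in zip(block_streams, truncation_points):
            start = points[q - 1] if q > 0 else 0
            layer.append(stream[start:points[q]])
        layers.append(layer)
    return layers


def keep_layers(layers, count):
    """Rebuild, for each code-block, the prefix carried by the first layers."""
    num_blocks = len(layers[0])
    return [b"".join(layers[q][b] for q in range(count))
            for b in range(num_blocks)]
```

Discarding the last layers leaves a shorter but still decodable prefix of
every code-block, which is what allows the compression level to be chosen
after the image has been coded.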
The chromatic space is a vector space used to represent colors.
The most important chromatic spaces are:
- CMYK (Cyan, Magenta, Yellow, Black);
What is the meaning of ‘higher quality image’?
There are different approaches to image quality evaluation and they are
based on objective and subjective parameters.
The quality of a compressed image is evaluated by analyzing the
difference between the original and the compressed one.
One of the most widely used parameters for the evaluation of image
quality is the MSE.
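MSE: what is it? Is it reliable? Is there an alternative?
MSE means ‘Mean Square Error’. It represents the classical error
estimate given by the equation:

$$MSE = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( x[i,j] - \hat{x}[i,j] \right)^2$$

where M and N are the image dimensions, $x$ is the original image and
$\hat{x}$ is the reconstructed one.
It is a widely used criterion, but it is often not representative enough of
the perceived quality.
JND and Hosaka plots may be used as an alternative to MSE.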
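The MSE, and the PSNR that is derived from it, can be computed with the
following sketch. It assumes that images are given as lists of rows of integer
samples and that the peak value is 255 (8 bits per sample); the PSNR is taken
as 10 log10(peak^2 / MSE), and the function names are illustrative.

```python
import math


def mse(original, reconstructed):
    """Mean of the squared differences between two images of equal size."""
    rows = len(original)
    cols = len(original[0])
    total = 0
    for i in range(rows):
        for j in range(cols):
            diff = original[i][j] - reconstructed[i][j]
            total += diff * diff
    return total / (rows * cols)


def psnr(original, reconstructed, peak=255):
    """Peak signal to noise ratio in dB; infinite for identical images."""
    error = mse(original, reconstructed)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)
```

Both figures only measure the numerical difference between the two images and
say nothing about where in the image the errors fall.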
Many new applications of the JPEG2000 standard require compressed data to be
transmitted over channels with different error characteristics. For example,
wireless communications are subject to random channel errors, while Internet
communications are subject to data loss due to traffic congestion.
As we know, JPEG2000 codes each code-block independently; this is an advantage
for error resilience, as any error in the bit-stream of a code-block remains
confined to that code-block.
How much does the PSNR improve if we codify an image using JPEG2000 instead of JPEG?
The following graphic, related to the same image coded with JPEG baseline and
JPEG2000, shows that, despite using the same number of bits per pixel, the
difference between the two standards is remarkable. For example, at 1.125 bpp,
JPEG2000 performs better by 6 dB; this confirms the higher visual quality of
the new standard.
| Figure 1: JPEG baseline (0.125 bpp) | Figure 2: JPEG 2000 (0.125 bpp) |
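To give a feeling for what a PSNR difference in dB means, the sketch below
converts a gain into the corresponding ratio between the two mean square
errors. It assumes the usual definition PSNR = 10 log10(peak^2 / MSE) with the
same peak value for both images; the function name is illustrative.

```python
def mse_ratio_from_psnr_gain(gain_db: float) -> float:
    """How many times smaller the MSE is for a given PSNR gain in dB."""
    return 10 ** (gain_db / 10)
```

Under this definition a gain of 6 dB corresponds to a mean square error that
is about four times smaller.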

The Canvas is a reference grid (that is, a coordinate system) that JPEG2000
uses to describe how the different components (typically the color components)
are placed on the image.
This grid makes component management easier, especially considering that the
components can have different sizes. Each sample of each component is assigned
a specific location on the Canvas.
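The relationship between component samples and Canvas locations can be
sketched as follows. The sketch assumes that the image starts at the origin of
the Canvas and that each component is described by integer sub-sampling steps
along the two axes; the function and parameter names are hypothetical.

```python
import math


def canvas_location(sample_x, sample_y, step_x, step_y):
    """Canvas coordinates of a component sample with the given steps."""
    return sample_x * step_x, sample_y * step_y


def component_size(canvas_width, canvas_height, step_x, step_y):
    """Number of samples a component has on a canvas of the given size."""
    return math.ceil(canvas_width / step_x), math.ceil(canvas_height / step_y)
```

On a 640 x 480 Canvas, a component with steps (1, 1) has 640 x 480 samples,
while a component with steps (2, 2) has only 320 x 240 samples, each one
located at an even Canvas coordinate.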
Why do we make a distinction between bit-stream and code stream?
The two expressions are not equivalent: they refer to different objects.
- The bit-stream is the sequence of bits derived from the coding of a set of
symbols; it typically refers to the sequence of bits that comes from the
coding of each code-block.
- The code-stream, on the other hand, is a collection of several bit-streams
together with the information needed to decode and reconstruct the image. This
additional information may refer, for example, to the locations of particular
bit-streams or to the transformation, quantization and coding methods used.