| Research |
Search |
The Compression understood as a joint process of Modeling and Coding [RL81].
The initial advancements in compression of one-dimensional signals extended to the image domain by concatenating the image pixels in a single stream in an appropriate order. Run-length encoding (RLE) replaces the runs of same-colored pixels by two numbers: color and length of the run [Cap59].
Sophisticated pixel-ordering techniques for one-dimensional sequential compression [NW80].
Statistical approaches are based on modeling the image according to the probability distribution of the pixel values. Shannon-Fano [Sh48], Huffman coding [Huf52, Vit87] and integral codes [Col66, Ric79, Will91] assign shorter codes to more frequent pixel color values and vice versa, as for example in a Morse alphabet.
The Huffman encoding process is usually very fast and not complex. However, because of the integral property, the probability distribution of the code does not necessarily match the distribution of the source. Modified Huffman (MH) algorithm applies Huffman code, not directly to the pixels, but to runs of pixels [ITU T.4, Hun80].
The following techniques have been developed to take advantage of a two-dimensional correlation between the image lines:
Another approach is to exploit the correlation between successive lines of the image. Relative element address designate (READ) method encodes the location of the boundaries of the runs (point of transitions from black to white and vice versa) relative to the corresponding positions in the previous line. If there are no such positions within three pixels in the reference line, one-dimensional run-length coding is used [ITU T.4].
An arithmetic coding invented by Rissanen and Langdon [RL79]. It represents the entire input message as a small interval in the range [0,1]. The resulting codeword is the binary code representation of the interval. Arithmetic coding is an optimal coding method as regards the model. It has no limitation of integral codes and suits for compression of binary images. It is also well suited for dynamic modeling, because there is no need to store and update complex data structures such as Huffman trees.
Rissanen and Langdon proposed to use Shannon’s context-based statistical modeling conjunction with arithmetic coding [LR81]. The idea of context-based modeling is to obtain the statistical model of the image by conditioning the probability distribution of the pixels on the context. The context is determined by the combination of the pixels in the local neighborhood, which is defined by the template.
Context-based statistical modeling and arithmetic coding are implemented in the international standard for compression of binary images in communications JBIG (Joint Bi-level Image Experts Group) [JBIG1].
The solutions for multi-tone and grayscale image compression were presented by separating the images into bit-planes [WRA96, JKW98].
Another approach to deal with multi-tone images has been developed by At&T and is called "DjVu". It classifies the pixels of the image either as foreground (text, drawings) or background (pictures, photos, paper texture). The classification is compressed as a binary image, whereas a progressive, wavelet-based lossy compression technique is applied to the foreground and background images [Haf+99].
Moffat has experimented with various sizes and shapes of the context templates in order to improve the image compression [Mof91]. Although observing the limitations for the size of the context template (~14 pixels), he has demonstrated the potential of larger templates of up to 24 pixels and has proposed the two-level modeling technique using both 10- and 24-pixel templates.
The variable-size modeling based on a context tree has been introduced by Rissanen in [Ris83]. Martins and Forchhammer have recently studied variable-size context modeling and proposed both context tree and "free tree" techniques, were they varied the number and the positions of the context pixels depending on the pixel neighborhood [MF98].
Remarkable improvements in image compression have been achieved by specializing in some known image types (e.g. text images) and exploiting global dependencies. The emerging standard JBIG2 [JBIG2] segments a page into different classes of image data, in particular, textual, halftone and generic (other) [TK99], and utilize the repetitive nature of the textual and halftone images. JBIG2 will also provide lossy image compression [MF99], and quality and content progressive coding [How+98].
For textual data, JBIG2 uses pattern matching techniques, which are based on the following works:
JBIG2 will also address the compression of halftone image data using either of the two following methods:
| Updated: Aug, 2001 | © Eugene Ageenko |