Get the Right Shrink
The way to compress an image as much as possible is to use the compression tool that suits it. Some tools work best for images with only a few different colors, such as a symbolic map, whereas others are better at compressing images with a scattered collection of many different color shades, such as an aerial photo.
To understand image compression strategies, consider the original format of most images. Images begin life as a series of pixels, or individual points in a two-dimensional grid of points. The image in Figure 1 is 632 x 3,179 pixels. Each horizontal row of 632 pixels is a scan line. Each pixel in a scan line represents a color; when seen close together, the pixels appear to be continuous gradations of tones. Zooming in to the pixel level (Figure 2) reveals the jagged edges of individual pixels. The lowest scan line of Figure 2 (diagrammed in Figure 3) has 10 yellow (Y) pixels followed by 15 blue (B) pixels followed by 10 more yellow pixels. We can symbolize this scan line from the original bitmap format as YYYYYYYYYYBBBBBBBBBBBBBBBYYYYYYYYYY.
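As a rough illustration of the uncompressed starting point, the sketch below builds the example scan line and a toy bitmap in Python. The single-letter color codes, the three-row image, and the one-byte-per-pixel storage are assumptions made for the example, not details of any real file format.

```python
# Hypothetical single-letter color codes: Y = yellow, B = blue.
scan_line = "Y" * 10 + "B" * 15 + "Y" * 10

# A bitmap is a stack of scan lines. This toy image simply repeats
# the same row three times (an assumption for illustration).
bitmap = [scan_line] * 3

width = len(bitmap[0])      # pixels per scan line
height = len(bitmap)        # number of scan lines
raw_size = width * height   # bytes, assuming 1 byte per pixel

print(scan_line)
print(width, height, raw_size)
```

Every compression scheme discussed here starts from this kind of literal, pixel-by-pixel listing and tries to say the same thing in fewer bytes.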
Most compression routines attempt to process the scan lines separately by summarizing repeating patterns or by simplifying complex patterns. RLE compression, for example, examines scan lines for repeated values and summarizes each run with a count and a value. Thus, the above scan line with RLE compression is 10-Y-15-B-10-Y. This reduces the scan line from 35 bytes (if each pixel uses 1 byte of storage) to 6 bytes (3 for counts and 3 for color values), almost a 6:1 reduction. As one might expect, RLE works best on images with large areas of continuous color, such as Figure 1's symbolic map.
Another strategy, called the LZW algorithm, compresses images by searching for repeated patterns, storing them in a dictionary and referring to each one by a single code that matches its dictionary entry. No data is lost; it is only represented by a space-saving short code rather than spelled out literally at each occurrence. In the scan line from Figure 3, the LZW-compressed result might read YYYYY[1]1BBBBB[2]2211. The code for five yellow pixels in a row is 1, as marked following the pattern's first appearance (YYYYY[1]). Whenever this pattern reappears, it is compressed from YYYYY into 1. In this example, the compression turns 35 bytes into 17, a ratio of about 2:1.
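Both strategies can be sketched in a few lines of Python. This is a minimal illustration, not the encoder used by any particular file format: the function names are hypothetical, the RLE output is a list of (count, value) pairs rather than packed bytes, and the LZW routine is the textbook variant, which builds its dictionary one pixel at a time and emits only numeric codes. Its output therefore looks different from the simplified YYYYY[1] notation above, although the idea of replacing repeated patterns with short codes is the same.

```python
from itertools import groupby

def rle_encode(line):
    """Collapse each run of identical pixels into a (count, value) pair."""
    return [(len(list(run)), value) for value, run in groupby(line)]

def rle_decode(pairs):
    """Expand (count, value) pairs back into the original scan line."""
    return "".join(value * count for count, value in pairs)

def lzw_encode(line):
    """Textbook LZW: emit the code of the longest pattern already known."""
    # Assumption: the dictionary is seeded only with the colors present.
    table = {ch: i for i, ch in enumerate(sorted(set(line)))}
    codes, current = [], ""
    for ch in line:
        if current + ch in table:
            current += ch                      # keep extending the match
        else:
            codes.append(table[current])       # emit the known pattern
            table[current + ch] = len(table)   # learn the new, longer one
            current = ch
    if current:
        codes.append(table[current])
    return codes

def lzw_decode(codes, alphabet):
    """Rebuild the scan line, regrowing the same dictionary as the encoder."""
    table = {i: ch for i, ch in enumerate(sorted(alphabet))}
    if not codes:
        return ""
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        # A code not yet in the table can only be previous + its first pixel.
        entry = table[code] if code in table else previous + previous[0]
        out.append(entry)
        table[len(table)] = previous + entry[0]
        previous = entry
    return "".join(out)

scan_line = "Y" * 10 + "B" * 15 + "Y" * 10
rle = rle_encode(scan_line)
lzw = lzw_encode(scan_line)
print(rle)                        # three (count, value) pairs
print(len(scan_line), len(lzw))   # pixels in, LZW codes out
```

Both decoders recover the scan line exactly, which is what "no data is lost" means in practice. The number of bytes LZW actually saves depends on how many bits each code occupies, which this sketch does not model.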
To make images even more compressible, LZW can be combined with a simplification technique called differencing, which replaces the value of each pixel with the difference between it and the adjacent pixel, repeating the process across the entire image. This evens out close values between adjacent pixels, producing more repeated patterns and higher compression. The TIFF and GIF formats both use the LZW algorithm, and TIFF can also use the RLE algorithm.
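The effect of differencing is easiest to see on a smooth gradient. The sketch below is a simplified illustration with hypothetical function names: it keeps the first pixel of a scan line and stores plain signed differences for the rest, whereas a real file format would typically wrap the differences to fit in a byte.

```python
def difference(line):
    """Keep the first pixel; replace each later pixel with (pixel - left neighbor)."""
    if not line:
        return []
    return [line[0]] + [right - left for left, right in zip(line, line[1:])]

def undifference(deltas):
    """Reverse the step with a running sum, restoring the original pixels."""
    pixels, total = [], 0
    for delta in deltas:
        total += delta
        pixels.append(total)
    return pixels

gradient = [100, 102, 104, 106, 108, 110]   # six distinct pixel values
deltas = difference(gradient)
print(deltas)                 # [100, 2, 2, 2, 2, 2]
print(undifference(deltas))   # [100, 102, 104, 106, 108, 110]
```

The original line contains no repeated values at all, while the differenced line is mostly the same number repeated, which is exactly the kind of pattern a dictionary or run-length coder can exploit.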
It's not a format. Often called an image format, JPEG is actually a collection of compression algorithms. It is best suited for compressing images that do not have contiguous, repeating pixels of the same color. Figure 4 shows the same area as Figure 1, but as a section of a digital orthophoto -- a good candidate for JPEG compression -- instead of a symbolic map. Closer inspection at the pixel level reveals few long runs of repeating pixels and a wide variety of values, which makes the image a poor candidate for RLE compression.
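A quick way to see why such an image defeats RLE is to count its runs. In the sketch below, the "photo" scan line is a synthetic stand-in generated by a formula so that no two neighboring pixels match. It is not taken from Figure 4, and the two-bytes-per-run cost (one count byte plus one value byte) is an assumption carried over from the earlier 10-Y-15-B-10-Y example.

```python
from itertools import groupby

def rle_size(pixels):
    """Bytes needed by count/value RLE, assuming 2 bytes per run."""
    runs = sum(1 for _ in groupby(pixels))
    return 2 * runs

# Map-like scan line: three long runs of two colors.
map_line = [1] * 10 + [2] * 15 + [1] * 10

# Synthetic photo-like scan line: every pixel differs from its neighbor.
photo_line = [(i * 37 + 11) % 256 for i in range(35)]

print(len(map_line), rle_size(map_line))       # 35 bytes raw, 6 bytes as RLE
print(len(photo_line), rle_size(photo_line))   # 35 bytes raw, 70 bytes as RLE
```

When every run has length one, RLE spends two bytes to describe each single pixel and the "compressed" line ends up twice as large as the original.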