UNIT – 6

Image and Video Compression Techniques Huffman code, Run-Length Encoding, Relative Encoding, Lempel-Ziv Encoding, Real-Time Interactive Audio/Video, RTP, HTTP, and WWW.

HUFFMAN CODING

Example

• Say we want to encode a text with the characters

a, b,…, g occurring with the following frequencies:

a b c d e f g

Frequency 37 18 29 13 30 17 6

Fixed-Length Code

a b c d e f g

Frequency 37 18 29 13 30 17 6

Fixed-length 000 001 010 011 100 101 110

code

• Total size is:

(37 + 18 + 29 + 13 + 30 + 17 + 6) x 3= 450 bits

Variable-Length Code

a b c d e f g

Frequency 37 18 29 13 30 17 6

Variable- 10 011 111 1101 00 010 1100

length code

• Total size is:

37x2 + 18x3 + 29x3 + 13x4 + 30x2 + 17x3 + 6x4 = 402 bits

• A savings of approximately 11%

Constructing a Huffman Code

g,6 d,13 f,17 b,18 c,29 e,30 a,37

Constructing a Huffman Code

g,6 d,13 f,17 b,18 c,29 e,30 a,37

Constructing a Huffman Code

19 f,17 b,18 c,29 e,30 a,37

g,6 d,13

Constructing a Huffman Code

f,17 b,18 19 c,29 e,30 a,37

g,6 d,13

Constructing a Huffman Code

f,17 b,18 19 c,29 e,30 a,37

g,6 d,13

Constructing a Huffman Code

35 19 c,29 e,30 a,37

f,17 b,18 g,6 d,13

Constructing a Huffman Code

19 c, 29 e,30 35 a,37

g,6 d,13 f,17 b,18

Constructing a Huffman Code

19 c,29 e,30 35 a,37

g,6 d,13 f,17 b,18

Constructing a Huffman Code

48 e,30 35 a,37

19 c, 29 f,17 b,18

g,6 d,13

Constructing a Huffman Code

e,30 35 a,37 48

f,17 b,18 19 c,29

g,6 d,13

Constructing a Huffman Code

e,30 35 a,37 48

f,17 b,18 19 c,29

g,6 d,13

Constructing a Huffman Code

65 a, 37 48

e,30 35 19 c,29

f,17 b,18 g,6 d,13

Constructing a Huffman Code

a,37 48 65

19 c,29 e,30 35

g,6 d,13 f,17 b,18

Constructing a Huffman Code

a,37 48 65

19 c,29 e,30 35

g,6 d,13 f,17 b,18

Constructing a Huffman Code

85 65

a,37 48

e,30 35

19 c,29 f,17 b,18

g,6 d,13

Constructing a Huffman Code

65 85

e,30 35

a,37 48

f,17 b,18 19 c,29

g,6 d,13

Constructing a Huffman Code

150

65 85

e,30 35

a,37 48

f,17 b,18 19 c,29

g,6 d,13

Resulting Code

0 1

a 10

0 1 0 1

b 011

c 111

e a d 1101

0 1 0 1

e 00

f b c f 010

0 1

g 1100

g d

Run-Length Encoding (RLE)

Run-length encoding is a data compression algorithm that is supported by most bitmap file formats, such

as TIFF, BMP, and PCX. RLE is suited for compressing any type of data regardless of its information content,

but the content of the data will affect the compression ratio achieved by RLE. Although most RLE algorithms

cannot achieve the high compression ratios of the more advanced compression methods, RLE is both easy to

implement and quick to execute, making it a good alternative to either using a complex compression algorithm

or leaving your image data uncompressed.

RLE works by reducing the physical size of a repeating string of characters. This repeating string, called a run,

is typically encoded into two bytes. The first byte represents the number of characters in the run and is called

the run count. In practice, an encoded run may contain 1 to 128 or 256 characters; the run count usually

contains the number of characters minus one (a value in the range of 0 to 127 or 255). The second byte is the

value of the character in the run, which is in the range of 0 to 255 and is called the run value.

Uncompressed, a character run of 15 A characters would normally require 15 bytes to store:

AAAAAAAAAAAAAAA

The same string after RLE encoding would require only two bytes:

15A

The 15A code generated to represent the character string is called an RLE packet. Here, the first byte, 15, is the

run count and contains the number of repetitions. The second byte, A, is the run value and contains the actual

repeated value in the run.

A new packet is generated each time the run character changes, or each time the number of characters in the run

exceeds the maximum count. Assume that our 15-character string now contains four different character runs:

AAAAAAbbbXXXXXt

Using run-length encoding this could be compressed into four 2-byte packets:

6A3b5X1t

Thus, after run-length encoding, the 15-byte string would require only eight bytes of data to represent the string,

as opposed to the original 15 bytes. In this case, run-length encoding yielded a compression ratio of almost 2 to

Long runs are rare in certain types of data. For example, ASCII plaintext seldom contains long runs. In the

previous example, the last run (containing the character t) was only a single character in length; a 1-character

run is still a run. Both a run count and a run value must be written for every 2-character run. To encode a run-in

RLE requires a minimum of two characters' worth of information; therefore, a run of single characters actually

takes more space. For the same reasons, data consisting entirely of 2-character runs remain the same size after

RLE encoding.

In our example, encoding the single character at the end as two bytes did not noticeably hurt our compression

ratio because there were so many long characters runs in the rest of the data. But observe how RLE encoding

doubles the size of the following 14-character string:

Xtmprsqzntwlfb

After RLE encoding, this string becomes:

1X1t1m1p1r1s1q1z1n1t1w1l1f1b

RLE schemes are simple and fast, but their compression efficiency depends on the type of image data being

encoded. A black-and-white image that is mostly white, such as the page of a book, will encode very well, due

to a large amount of contiguous data that is all the same color. An image with many colors that is very busy in

appearance, however, such as a photograph, will not encode very well. This is because the complexity of the

image is expressed as a large number of different colors. And because of this complexity, there will be relatively

few runs of the same color.

Variants on Run-Length Encoding

There are several variants of run-length encoding. Image data is normally run-length encoded in a

sequential process that treats the image data as a 1D stream, rather than as a 2D map of data. In sequential

processing, a bitmap is encoded starting at the upper left corner and proceeding from left to right across each

scan line (the X-axis) to the bottom right corner of the bitmap (shown in Figure 9-2, a). But alternative RLE

schemes can also be written to encode data down the length of a bitmap (the Y-axis) along with the columns (shown

in Figure 9-2, b), to encode a bitmap into 2D tiles (shown in Figure 9-2, c), or even to encode pixels on a

diagonal in a zig-zag fashion (shown in Figure 9-2, d). Odd RLE variants such as this last one might be used in

highly specialized applications but are usually quite rare.

Figure 9-2: Run-length encoding variants

Another seldom-encountered RLE variant is a lossy run-length encoding algorithm. RLE algorithms are

normally lossless in their operation. However, discarding data during the encoding process, usually by zeroing

out one or two least significant bits in each pixel, can increase compression ratios without adversely affecting

the appearance of very complex images. This RLE variant works well only with real-world images that contain

many subtle variations in pixel values.

Make sure that your RLE encoder always stops at the end of each scan line of bitmap data that is being encoded.

There are several benefits to doing so. Encoding only a simple scan line at a time means that only a minimal

buffer size is required. Encoding only a simple line at a time also prevents a problem known as cross-coding.

Cross-coding is the merging of scan lines that occurs when the encoding process loses the distinction between

the original scan lines. If the data of the individual scan lines are merged by the RLE algorithm, the point where

one scan line stopped and another began is lost or, at least, is very hard to detect quickly.

Cross-coding is sometimes done, although we advise against it. It may buy a few extra bytes of data

compression, but it complicates the decoding process, adding time costs. For bitmap file formats, this technique

defeats the purpose of organizing a bitmap image by scan lines in the first place. Although many file format

specifications explicitly state that scan lines should be individually encoded, many applications encode image

data as a continuous stream, ignoring scan-line boundaries.

Have you ever encountered an RLE-encoded image file that could be displayed using one application but not

using another? Cross-coding is often the reason. To be safe, decoding and display applications must take

cross-coding into account and not assume that an encoded run will always stop at the end of a scan line.

When an encoder is encoding an image, an end-of-scan-line marker is placed in the encoded data to inform the

decoding software that the end of the scan line has been reached. This marker is usually a unique packet,

explicitly defined in the RLE specification, which cannot be confused with any other data packets. End-of-scanline markers are usually only one byte in length, so they don't adversely contribute to the size of the encoded

data.

Encoding scan lines individually has advantages when an application needs to use only part of an image. Let's

say that an image contains 512 scan lines, and we need to display only lines 100 to 110. If we did not know

where the scan lines started and ended in the encoded image data, our application would have to decode lines 1

through 100 of the image before finding the ten lines it needed. Of course, if the transitions between scan lines

were marked with some sort of easily recognizable delimiting marker, the application could simply read through

the encoded data, counting markers until it came to the lines it needed. But this approach would be a rather

inefficient one.

Another option for locating the starting point of any particular scan line in a block of encoded data is to

construct a scan-line table. A scan-line table usually contains one element for every scan line in the image, and

each element holds the offset value of its corresponding scan line. To find the first RLE packet of scan line 10,

all a decoder needs to do is seek the offset position value stored in the tenth element of the scan-line lookup

table. A scan-line table could also hold the number of bytes used to encode each scan line. Using this method, to

find the first RLE packet of scan line 10, your decoder would add together the values of the first nine elements

of the scan-line table. The first packet for scan line 10 would start at this byte offset from the beginning of the

RLE-encoded image data.

Relative Encoding

Relative encoding is another data compression algorithm. While run-length encoding, bitmap encoding, and

diagram and pattern substitution were trying to reduce repeating data, with relative encoding the goal is a bit

different. Indeed run-length encoding was searching for long runs of repeating elements, while pattern

substitution and bitmap encoding were trying to “map” where the repetitions happen to occur.

The only problem with these algorithms is that the input stream of data is not always constructed out of

repeating elements. It is clear that if the input stream contains many repeating elements there must be some way

of reducing them. However, that doesn’t mean that we cannot compress data if there are no repetitions. It all

depends on the data. Let’s say we have the following stream to compress.

1, 2, 3, 4, 5, 6, 7

It's hard to imagine how this stream of data can be compressed. The same problem may occur when trying to

compress the alphabet. Indeed the letters of the alphabet are the very base of words so it is the minimal part for

word construction and therefore hard to compress.

Fortunately, this isn’t true always. An algorithm that tries to deal with non-repeating data is relative encoding.

Let’s see the following input stream – years from a given decade (the 90′s).

1991, 1991, 1999, 1998, 1991, 1993, 1992, 1992

Here we have 39 characters and we can reduce them. A natural approach is to remove the leading “19” as we

humans often do.

91, 91, 99, 98, 91, 93, 92, 92

Now we have a shorter string, but we can go even further by keeping only the first year. All other years will as

relative to this year.

91, 0, 8, 1, 7, 2, 1, 0

Now the volume of transferred data is reduced a lot (from 39 to 16 – more than 50%). However, there are some

questions we need to answer first because the stream won't always be formatted in such a pretty way. How

about the next character stream?

91, 94, 95, 95, 98, 100, 101, 102, 105, 110

We see that the value 100 is somehow in the middle of the interval and it is handy to use it as a base value for

the relative encoding. Thus the stream above will become:

-9, -6, -5, -5, -2, 100, 1, 2, 5, 10

The problem is that we can’t always decide which value will be the base value so easily. What if the data was

dispersed differently:

96, 97, 98, 99, 100, 101, 102, 103, 999, 1000, 1001, 1002

Now the value of “100” isn’t useful, because compressing the stream will get something like this:

-4, -3, -2, -1, 100, 1, 2, 3, 899, 900, 901, 902

To group the relative values around “some” base values will be far handier.

(-4, -3, -2, -1, 100, 1, 2, 3) (-1, 1000, 1, 2)

However, deciding which value will be the base value isn’t that easy. Also, the encoding format is not so trivial.

On the other hand, this type of encoding can be useful in some specific cases as we can see below.

Application

This algorithm may be very useful in many cases, such as this one: there are plenty of map applications around

the web. Some products such as Google Maps, Yahoo! Maps, Bing Maps are quite famous, while there are also

very useful open source projects like OpenStreetMap. The websites using these apps number in the thousands.

A typical use case is to transfer lots of Geo coordinates from a web server to a browser using JSON. Indeed any

GEO point on Earth is relative to the point (0,0), which is located near the west coast of Africa, however on

large zoom levels, when there are tons of markers we can transfer the information with relative encoding.

joy_of_learing

Search This Blog

Data communication (HUFFMAN CODING)

UNIT – 6

Comments

Post a Comment