Tuesday, August 4, 2009

Data Hiding in Images

Virtually all sophisticated steganographic methods hide a message by embedding it as low-level noise in an image or audio file, which then becomes the cover file. This approach has two disadvantages: the information-hiding capacity of a cover file is small, so a large cover file is needed to hide a substantial amount of data; and once data is hidden in an image or audio file, any lossy compression destroys the embedded data. It seems that such an image should be compressed with lossless compression only, but this article shows how secret data can be hidden even in a lossily compressed image. The data-hiding methods described here are mathematically simple.

LSB Encoding

Data can be hidden in a grayscale or color image because slight changes to the colors of pixels are often imperceptible to the eye (after all, changing the original pixels is the basis of all lossy image compression methods). The principle of LSB (least significant bit) encoding is to hide data bits by storing them in the least significant bits of the pixels of an image. The modified image is referred to as a stegoimage. One of the first implementations of LSB encoding was due to Romana Machado.

Given a color image A with three bytes per pixel, each of the three color components, typically red, green, and blue, is specified by one byte, so it can indicate one of 256 shades. The data to be hidden is a stream of bits, each of which is hidden by storing it as the least significant bit (LSB) of the next byte of image A, replacing the original bit. The least significant bit of the next byte of A is either a 0 or a 1, and so is the next bit to be hidden. This means that, on average, only half the least significant bits have to be modified by the bits being hidden. Each color pixel is represented by three bytes, so changing half the bytes means changing 1 or 2 least significant bits per pixel, on average.

Small changes in the color of a few isolated pixels may be noticeable if the pixels are located in large uniform areas or on sharply defined boundaries. It is therefore important to choose the cover image A carefully. It should contain a large number of small details with many different colors. Such an image is termed busy.

An improvement to the basic LSB method is to hide a bit in a pixel only if the pixel satisfies certain conditions. If the variance of the luminances of the immediate neighbors of a pixel is very high (an indication that the pixel may be on a boundary) or very low (an indication that it is located in a uniform region), the pixel should be skipped. If a pixel passes these tests and is modified, it is important to verify that it passes the same tests after modification, since otherwise the decoder would not be able to reliably identify the pixels that have been skipped. If a pixel does not pass the tests after being modified, it should be returned to its original value.
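The basic LSB scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cover is treated as a flat sequence of color bytes, the function names (bytes_to_bits, lsb_embed, lsb_extract) are made up for this example, the decoder is assumed to know the message length, and the neighbor-variance test is left out.

```python
def bytes_to_bits(data: bytes):
    """Yield the bits of data, most significant bit of each byte first."""
    for byte in data:
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1


def lsb_embed(cover: bytes, message: bytes) -> bytearray:
    """Return a copy of cover with the message bits stored in its LSBs."""
    bits = list(bytes_to_bits(message))
    if len(bits) > len(cover):
        raise ValueError("message is too long for this cover")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        # Clear the least significant bit, then replace it with the data bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return stego


def lsb_extract(stego: bytes, n_bytes: int) -> bytes:
    """Read n_bytes of hidden data back from the LSBs of stego."""
    out = bytearray()
    for k in range(n_bytes):
        value = 0
        for i in range(8 * k, 8 * k + 8):
            value = (value << 1) | (stego[i] & 1)
        out.append(value)
    return bytes(out)


cover = bytes(range(100, 164))   # 64 stand-in color bytes, not a real image
stego = lsb_embed(cover, b"Hi")
recovered = lsb_extract(stego, 2)
# Count how many cover bytes actually had to change.
changed = sum(a != b for a, b in zip(cover, stego))
```

In this toy run, 16 bits are hidden and 8 cover bytes change, which matches the expectation that about half the least significant bits already have the required value.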

If image A is to be compressed later, the compression should, of course, be lossless. If the owner forgets this, or if someone else compresses image A with a lossy compression algorithm, some of the hidden data will be lost during the compression. Thus, this basic method is not robust.

The hiding capacity of this method is also low. If one data bit is hidden in a byte, then the amount of data hidden is 1/8 = 12.5% of the cover image size. As an example, only 128 Kbytes can be hidden in a 1 Mbyte cover image.

It is easy to come up with more sophisticated variants of this method that use a stego-key and are therefore more difficult to break. Here are a few ideas.
1. If the key is, say, 1011, then data bits are hidden in the least significant bits of bytes 1, 3, and 4 but not of byte 2, then in bytes 5, 7, and 8, but not in byte 6, and so on.

2. Bits are stored in selected bytes as in variant 1 but are stored alternately in the least-significant position and in the second least-significant position of bytes of the cover image.

3. The stego-key is used as the seed of a pseudorandom number generator, and a random sequence of small positive integers ri is generated. Data bits are hidden in the least significant bits of bytes 1, 1 + r1, 1 + r1 + r2, and so on, and when an index 1 + r1 + r2 + ... + rj exceeds the size of the image, it is computed modulo this size. The result is that data bits are first hidden in bytes located farther and farther away from the start of the cover image, and the following bits are then hidden in bytes close to the start. It is easy to have a collision (a byte that has already been modified is selected again), so this variant has to maintain a list of bytes that have been modified. When the algorithm arrives at such a byte, the next pseudorandom number is drawn, and another byte is selected for hiding the next bit. Notice that the decoder can mirror this operation.

4. Proceed as in the previous method but without maintaining a list of pixels that have been modified. Some of the embedded bits will be corrupted by bits stored "on top" of them, but these errors can be corrected by adding an error-correcting code to the data before it is hidden.
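The pseudorandom variant with collision handling can be sketched as follows. This is a hedged illustration, not a reference implementation: Python's random.Random stands in for whatever generator a real system would use, indexes are 0-based rather than 1-based, and the names hiding_positions, embed, extract, and max_step are assumptions made for this example.

```python
import random


def hiding_positions(stego_key: int, n_bits: int, cover_size: int, max_step: int = 8):
    """Return n_bits distinct byte indexes derived from the stego-key."""
    if n_bits > cover_size:
        raise ValueError("cover is too small")
    rng = random.Random(stego_key)   # the same key reproduces the same sequence
    used = set()                     # the "list" of bytes already modified
    positions = []
    index = 0
    while len(positions) < n_bits:
        if index not in used:        # on a collision, just draw the next number
            used.add(index)
            positions.append(index)
        # Advance by a small random step, wrapping around the cover size.
        index = (index + rng.randint(1, max_step)) % cover_size
    return positions


def embed(cover: bytes, bits, stego_key: int) -> bytearray:
    """Hide a list of bits in the LSBs of the key-selected bytes."""
    stego = bytearray(cover)
    for pos, bit in zip(hiding_positions(stego_key, len(bits), len(cover)), bits):
        stego[pos] = (stego[pos] & 0xFE) | bit
    return stego


def extract(stego: bytes, n_bits: int, stego_key: int):
    """Mirror the encoder: regenerate the positions and read the LSBs."""
    return [stego[pos] & 1 for pos in hiding_positions(stego_key, n_bits, len(stego))]


cover = bytes(range(40))                      # stand-in cover bytes
bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
stego = embed(cover, bits, stego_key=2009)
recovered = extract(stego, len(bits), stego_key=2009)
```

Because the decoder seeds the generator with the same key and applies the same collision rule, it visits the same bytes in the same order as the encoder.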

Robust Data Hiding in JPEG Images

Embedding data bits in the least significant positions of image pixels is simple and effective, but it has a serious drawback. The embedded data becomes seriously damaged and may even become completely unreadable when the image is modified as a result of image processing or lossy compression. Digital image processing techniques for sharpening images, increasing contrast, and filtering change the values of many pixels. Lossy image compression methods, such as the popular JPEG, have the same effect. JPEG compression may shrink an image to a small percentage of its original size. After decompression, an observer may not detect any changes from the original image, but a direct check will reveal that many pixel values have changed.

The attribute of color images that makes such lossy compression possible is called noise. A color image may have many pixels with similar colors, and the eye cannot tell when some of them (or even many of them) have been slightly modified. In contrast, a monochromatic image (i.e., an image with one foreground color on a uniform background) has very little noise, so it is much harder to change pixel colors without visually modifying such an image.

JPEG compresses an image in the following main steps.

Step 1. The color representation of the pixels is changed from RGB to a luminance-chrominance color space, such as YCbCr. The remaining steps are executed on each color component separately.
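As an illustration of Step 1, the sketch below converts one RGB pixel to YCbCr. The constants are the full-range ones commonly used with JPEG files (JFIF); the text above does not say which variant of the conversion is meant, so treat that choice, and the function name rgb_to_ycbcr, as assumptions.

```python
def rgb_to_ycbcr(r: int, g: int, b: int):
    """Convert one 8-bit RGB pixel to 8-bit YCbCr (full-range, JFIF-style)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b                 # luminance
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b      # blue chrominance
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b      # red chrominance
    # Round each component and clamp it to the 0..255 range.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))


white = rgb_to_ycbcr(255, 255, 255)
black = rgb_to_ycbcr(0, 0, 0)
gray = rgb_to_ycbcr(128, 128, 128)
```

Any gray pixel (equal R, G, and B) comes out with both chrominance components at 128, the neutral midpoint, which shows how the transform separates brightness from color.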

Step 2. The image is partitioned into blocks of 8x8 pixels each and the remaining steps are executed on each block separately.
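A minimal sketch of Step 2, assuming one color component is held as a list of rows and that its width and height are multiples of 8. Real encoders pad images whose dimensions are not multiples of the block size; this sketch simply rejects them. The name split_into_blocks is made up for the example.

```python
def split_into_blocks(plane, size=8):
    """Split a 2-D list of samples into size x size blocks, left to right, top to bottom."""
    height, width = len(plane), len(plane[0])
    if height % size or width % size:
        raise ValueError("dimensions must be multiples of the block size")
    blocks = []
    for top in range(0, height, size):
        for left in range(0, width, size):
            # Take `size` rows and cut the same `size` columns out of each.
            blocks.append([row[left:left + size] for row in plane[top:top + size]])
    return blocks


# A 16x16 stand-in component whose sample at (y, x) is 16*y + x.
plane = [[16 * y + x for x in range(16)] for y in range(16)]
blocks = split_into_blocks(plane)
```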

Step 3. The 64 pixel values of a block are transformed by the discrete cosine transform (DCT) to become 64 transform coefficients. The DCT has the following advantages:
(1) many of the coefficients are zero or small numbers,
(2) the nonzero coefficients are concentrated at the top-left corner of the block, and
(3) small changes to the nonzero coefficients do not affect the inverse DCT much.
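The following sketch computes the two-dimensional DCT of one 8x8 block directly from its definition. It is written for clarity, not speed, and it omits the level shift (subtracting 128 from each sample) that baseline JPEG applies before the transform. The name dct_2d is an assumption made for this example.

```python
import math

N = 8


def dct_2d(block):
    """Forward 8x8 DCT (type II) of a block given as 8 rows of 8 numbers."""
    def c(k):
        # Normalization factor: 1/sqrt(2) for the zero frequency, 1 otherwise.
        return math.sqrt(0.5) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = 0.25 * c(u) * c(v) * total
    return coeffs


flat = [[100] * N for _ in range(N)]                        # a uniform block
ramp = [[100 + x + y for y in range(N)] for x in range(N)]  # a smooth gradient
flat_coeffs = dct_2d(flat)
ramp_coeffs = dct_2d(ramp)
```

For the uniform block, only the top-left coefficient is nonzero (up to floating-point error). For the smooth gradient, the nonzero values stay in the first row and first column, which illustrates advantages (1) and (2).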

Step 4. The nonzero coefficients are quantized. This is the lossy step, where image information is irretrievably lost. Quantization is done by dividing each coefficient by a quantization coefficient (QC, an integer taken from a quantization table) and rounding the result. The JPEG standard recommends certain quantization tables for the luminance and chrominance components, based on human factors studies.
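Quantization and its inverse can be sketched as below. The table is the example luminance table given in the JPEG standard; the coefficient values are invented for illustration, and the names quantize and dequantize are assumptions. Python's round() is used for the rounding step.

```python
# Example luminance quantization table from the JPEG standard.
LUMINANCE_QT = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(coeffs, table=LUMINANCE_QT):
    """Divide each coefficient by its quantization coefficient and round."""
    return [[round(coeffs[i][j] / table[i][j]) for j in range(8)] for i in range(8)]


def dequantize(quantized, table=LUMINANCE_QT):
    """Decoder side: multiply back. The rounding error is not recovered."""
    return [[quantized[i][j] * table[i][j] for j in range(8)] for i in range(8)]


# Made-up DCT coefficients: a large DC term and a few small AC terms.
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0], coeffs[0][1], coeffs[1][0], coeffs[7][7] = 800.0, -30.4, 12.9, 3.0

q = quantize(coeffs)
restored = dequantize(q)
```

Here -30.4 is stored as -3 and comes back as -33, while the small high-frequency value 3.0 is quantized to 0 and lost entirely. This is the irretrievable loss the step describes.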

Step 5. The 64 DCT coefficients are collected by scanning the block in zigzag order. Because of the nature of the DCT, the nonzero coefficients are mostly concentrated at the start of the resulting sequence and are followed by runs of zeros interspersed with a few nonzero coefficients.
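The zigzag scan can be generated by walking the anti-diagonals of the block and alternating direction, as in this sketch. The function names zigzag_order and zigzag_scan are made up for the example.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):            # s = row + col identifies one anti-diagonal
        diagonal = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        # Even diagonals run from bottom-left to top-right, odd ones the other way.
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order


def zigzag_scan(block):
    """Collect the values of a square block in zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


# A block whose value at (row, col) is 8*row + col, so the scan shows the path.
index_block = [[8 * r + c for c in range(8)] for r in range(8)]
sequence = zigzag_scan(index_block)
```

The scan starts at the top-left corner and ends at the bottom-right one, so the low-frequency coefficients, which tend to be the nonzero ones, come first.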

Step 6. Each nonzero coefficient and each run of zero coefficients is replaced by a Huffman code. The codes are written to a file, and they (plus the values of certain parameters used in the preceding steps) constitute the compressed image.
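The sketch below shows only the first half of Step 6: turning a zigzag sequence into (zero run, value) symbols, which are what the Huffman codes would then be assigned to. It is a simplification, not the JPEG entropy coder: it assigns no Huffman codes, does not treat the first (DC) coefficient separately, and puts no limit on run lengths. The names EOB, run_length_pairs, and expand_pairs are assumptions made for this example.

```python
EOB = (0, 0)   # end-of-block marker: everything after it is zero


def run_length_pairs(sequence):
    """Turn a zigzag sequence into (zero_run, value) pairs ending with EOB."""
    pairs = []
    run = 0
    for value in sequence:
        if value == 0:
            run += 1                     # extend the current run of zeros
        else:
            pairs.append((run, value))   # zeros skipped, then a nonzero value
            run = 0
    pairs.append(EOB)                    # trailing zeros are implied by EOB
    return pairs


def expand_pairs(pairs, length=64):
    """Decoder side: rebuild the full sequence from the pairs."""
    out = []
    for run, value in pairs:
        if (run, value) == EOB:
            break
        out.extend([0] * run)
        out.append(value)
    out.extend([0] * (length - len(out)))   # fill in the trailing zeros
    return out


sequence = [50, -3, 1, 0, 0, 2] + [0] * 58   # a typical quantized block
pairs = run_length_pairs(sequence)
```

The 64 coefficients shrink to five symbols here, which is why the long runs of zeros produced by quantization and the zigzag scan matter so much for compression.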




  2. hi, im doing image data hiding project in matlab. i need some sample code. it ll be useful for implementing my project. can u able to send the code... karthi151088@gmail.com. thanks.