Tuesday, May 12, 2009

Generational loss

On a large project, media often gets compressed multiple times. Typically, and especially for non-commercial use, the quality loss across 3 generations is irrelevant, but it got me thinking: what happens after more generations? Can common lossy compression algorithms detect that they have already compressed the data and recompress it without additional loss? Short answer: no.


Procedure: I used this script (LAME, 192kbps) to recompress a 30 second clip of "Hey Jealousy" performed by Hit the Lights, originally by Gin Blossoms.
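
The linked script is not reproduced here. As a rough sketch of what such a loop could look like, the Python below builds the LAME commands for each generation: decode the previous MP3 to WAV, then re-encode it at 192 kbps. The file names (clip.wav, gen001.mp3, and so on) and the helper names are hypothetical, and it assumes the lame binary is on the PATH.

```python
import subprocess


def lame_commands(generation, bitrate=192, source_wav="clip.wav"):
    """Return the LAME commands that produce one generation.

    Generation 1 is encoded straight from the (hypothetical) source WAV;
    every later generation first decodes the previous MP3 back to WAV.
    """
    commands = []
    if generation == 1:
        wav = source_wav
    else:
        previous = f"gen{generation - 1:03d}"
        wav = f"{previous}.wav"
        commands.append(["lame", "--decode", f"{previous}.mp3", wav])
    commands.append(["lame", "-b", str(bitrate), wav, f"gen{generation:03d}.mp3"])
    return commands


def recompress(generations, bitrate=192):
    """Run the decode/encode cycle; requires lame to be installed."""
    for n in range(1, generations + 1):
        for command in lame_commands(n, bitrate):
            subprocess.run(command, check=True)


# Example (not run here): recompress(400) would leave gen001.mp3 ... gen400.mp3
```

Keeping every generation on disk, rather than overwriting one file, makes it easy to pull out the 10th, 25th, or 400th clip afterwards for comparison.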

1st generation
10th generation
25th generation
50th generation
100th generation
200th generation
400th generation

Within 10 generations, high frequencies are distorted, making it sound like an over-compressed clip from a camera phone.

By the 25th generation, the distortions are more severe, and the volume is decreasing.

At the 100th generation, random blippy distortions overpower the significantly quieter music.

By the 400th generation, the clip is practically silent.


For JPEG, I threw a wrench into the mix. I've tried recompressing an image with identical settings, but the results weren't identical, so I decided to do something that abuses the compression scheme. JPEG compresses in fixed blocks: 8x8 pixels, grouped into 16x16 pixel units when the color channels are subsampled. At each generation, I shifted the image 8 pixels so the 16x16 units would never consist of the same data as in the parent generation. As a side effect, the black region revealed by each shift was introduced into the image.
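
To see why the shift matters, here is a toy model in Python. It is not JPEG: it is a made-up one-dimensional "codec" that replaces each 16-sample block with its rounded average, chosen only because it makes the effect easy to follow. All names in it are hypothetical. With the block grid left alone, recompressing changes nothing after the first pass; with an 8-sample shift per generation, every pass mixes neighboring blocks and the black fill creeps in.

```python
BLOCK = 16  # toy block width, standing in for a real codec's block size


def toy_compress(samples, block=BLOCK):
    """Crude lossy step: replace every block with its rounded mean."""
    out = []
    for start in range(0, len(samples), block):
        chunk = samples[start:start + block]
        mean = round(sum(chunk) / len(chunk))
        out.extend([mean] * len(chunk))
    return out


def shift(samples, offset, fill=0):
    """Shift right by offset, filling the revealed region with black (0)."""
    if offset == 0:
        return list(samples)
    return [fill] * offset + list(samples[:len(samples) - offset])


def generations(samples, count, offset=0):
    """Return the original plus `count` recompressed generations."""
    history = [list(samples)]
    for _ in range(count):
        history.append(toy_compress(shift(history[-1], offset)))
    return history


row = [(i * 37) % 256 for i in range(64)]  # arbitrary test pattern
aligned = generations(row, 5)              # same block grid every time
shifted = generations(row, 5, offset=8)    # half-block shift every time

print(aligned[1] == aligned[5])  # True: no further loss once blocks are flat
print(shifted[1] == shifted[5])  # False: each generation loses more
```

A real JPEG encoder is not perfectly stable even without the shift, since rounding and color conversion add their own small errors, but the misaligned grid is what forces fresh loss in every generation here.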

Initially, it looks like noise is introduced and colors bleed and become harsher. Eventually, colors bleed so much that only a few remain, and finally, saturation is lost and the picture looks like a two-bit dithered image.

The image continues to lose detail at a slow rate, favoring white.

JPEG is far more robust than MP3. Distortions away from the edges are minor for at least 50 generations, and images from 200 to 800 generations have an artistically interesting grain. JPEG, however, also had the luxury of increasing the file size, and it did; by the 10,000th generation, the image was roughly twice the size of the original.

Original image provided by kwerfeldein under the Creative Commons Attribution license.

Video showing the generational loss

Images are at generations 0, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192

Bonus: image resizing

For this test, I used Photoshop's highest quality bicubic resize for reductions. The left side was resized 13 times (5% each time), and the right side was reduced just once. Note the clarity of the branches in each and the detail in the foreground field. Not surprisingly, the consequence of too many resizes is a blurrier image.
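
The same effect shows up in a small Python experiment. This is only a sketch: it uses plain linear interpolation on a one-dimensional row of pixels rather than Photoshop's bicubic filter, and it assumes each 5% reduction is applied to the current size, which puts 13 steps at roughly 51% of the original (0.95^13 ≈ 0.513). The function names and the test pattern are made up for illustration.

```python
from statistics import pstdev


def resample(signal, new_len):
    """Resample a 1-D signal to new_len samples by linear interpolation."""
    old_len = len(signal)
    if new_len == 1:
        return [signal[0]]
    out = []
    for i in range(new_len):
        pos = i * (old_len - 1) / (new_len - 1)
        lo = int(pos)
        hi = min(lo + 1, old_len - 1)
        frac = pos - lo
        out.append(signal[lo] * (1 - frac) + signal[hi] * frac)
    return out


def shrink_repeatedly(signal, steps=13, factor=0.95):
    """Reduce the signal by `factor` of its current length, `steps` times."""
    for _ in range(steps):
        signal = resample(signal, max(2, round(len(signal) * factor)))
    return signal


detail = [0, 0, 255, 255] * 100        # fine stripes, like thin branches
many = shrink_repeatedly(detail)       # 13 small reductions
once = resample(detail, len(many))     # one reduction to the same size

# Spread of pixel values as a crude contrast measure: lower means blurrier.
print(len(many), round(pstdev(many), 1), round(pstdev(once), 1))
```

Every pass through the interpolator averages neighboring pixels a little, so the stripes that survive a single reduction get smoothed toward gray when the reduction is split into many small steps.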