What Is Generation Loss? How Copying Degrades Quality

Generation loss is the gradual decline in quality that happens when a copy is made from a copy, then another copy is made from that copy, and so on. Each “generation” introduces small errors or artifacts that compound over time, eventually producing noticeable degradation. The concept applies to analog media like photocopies and VHS tapes, but it’s equally relevant in digital workflows whenever lossy compression is involved.

How Generation Loss Works in Digital Files

The key distinction is between copying a file and re-encoding it. If you duplicate a JPEG by dragging it into a new folder, the file is bit-for-bit identical and no quality is lost. But if you open that JPEG in an image editor, make even a minor change (or no change at all), and hit “Save,” the software decodes the image and then re-encodes it from scratch. That re-encoding process is where generation loss creeps in.
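The copy-versus-re-encode distinction can be sketched in a few lines. This is a toy illustration, not real JPEG code: `lossy_save` is a hypothetical stand-in that quantizes values the way a lossy encoder would, and the hashes simply prove which operation changed the bytes.

```python
import hashlib

def lossy_save(pixels, step=8):
    # Toy stand-in for a lossy re-encode: quantize each value, clamped to 0-255.
    return bytes(min(255, round(p / step) * step) for p in pixels)

original = bytes(range(256))  # stand-in for decoded image data

# Duplicating the file copies the bytes exactly: same hash, zero loss.
copy = bytes(original)
assert hashlib.sha256(copy).digest() == hashlib.sha256(original).digest()

# "Open and save" decodes and re-encodes: the bytes change, and the
# discarded detail cannot be recovered from the new file.
resaved = lossy_save(original)
assert hashlib.sha256(resaved).digest() != hashlib.sha256(original).digest()
```

The same logic explains why "Save As" in an editor is riskier than a file-manager copy: only the former runs the data back through the encoder.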

JPEG compression works by dividing an image into small 8×8 pixel blocks and converting each block into frequency data using a discrete cosine transform (DCT). The encoder then rounds those frequency values according to a quantization table, discarding fine detail that the human eye is less likely to notice. This rounding is what makes the file smaller, but it’s also permanent. When the file is opened and saved again, the already-rounded values get rounded a second time, introducing new errors on top of the old ones. After enough cycles, you’ll see the familiar signs: blocky artifacts, color banding, and blurred edges.
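The double-rounding effect can be demonstrated with a toy uniform quantizer. This is a simplified model, not the real JPEG pipeline: the steps 10 and 12 are arbitrary stand-ins for two different quality tables.

```python
import numpy as np

def quantize(coeffs, step):
    # Uniform quantizer standing in for JPEG's per-frequency rounding.
    return np.round(coeffs / step) * step

coeffs = np.linspace(0.0, 255.0, 1000)  # toy frequency coefficients

once = quantize(coeffs, 10)
err_once = np.abs(once - coeffs).mean()

# Re-saving with identical settings is idempotent: the values are already
# on the quantization grid, so no new error appears.
assert np.array_equal(quantize(once, 10), once)

# Re-saving with a *different* quality table rounds already-rounded values
# again, stacking fresh error on top of the old.
twice = quantize(once, 12)
err_twice = np.abs(twice - coeffs).mean()
print(err_once, err_twice)
```

The idempotence check also shows why re-saving with the exact same settings is comparatively benign: most of the damage comes from rounding onto a grid that doesn't line up with the previous one.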

The same principle applies to video and audio. Formats like H.264 and MP3 use lossy compression that throws away data each time content is decoded and re-encoded. The degradation is especially pronounced when compression settings differ between generations, because each pass optimizes for slightly different targets.
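The "settings differ between generations" effect can be simulated with the same kind of toy quantizer, treating each pass as one decode/re-encode cycle. The step sizes are arbitrary stand-ins for varying encoder quality settings, not real H.264 or MP3 parameters.

```python
import numpy as np

def quantize(signal, step):
    # Stand-in for one lossy encode/decode pass (H.264, MP3, ...).
    return np.round(signal / step) * step

signal = np.linspace(0.0, 255.0, 2000)

# Fixed settings: after the first pass, every later pass lands on the
# same grid, so the error stops growing.
fixed = quantize(signal, 10)
for _ in range(10):
    fixed = quantize(fixed, 10)

# Settings that change between generations keep re-rounding the data
# onto mismatched grids, so errors accumulate.
drifting = signal.copy()
for step in [10, 12, 9, 11] * 3:  # 12 generations with varying "quality"
    drifting = quantize(drifting, step)

print(np.abs(fixed - signal).mean())     # stuck at the first pass's error
print(np.abs(drifting - signal).mean())  # noticeably higher
```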

Why Files Can Actually Get Bigger

One counterintuitive effect of generation loss is that re-encoded files sometimes grow in size rather than shrink. The artifacts introduced by repeated compression add visual complexity (noise, ringing, blockiness) that the encoder treats as real detail. This extra “information” increases the file’s entropy, meaning the encoder needs more data to describe what was originally a cleaner, simpler image. So you end up with a larger file that looks worse.
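The entropy argument is easy to verify with any general-purpose compressor. Here zlib stands in for the image encoder: a smooth ramp of bytes plays the role of a clean image, and small random perturbations play the role of accumulated artifacts.

```python
import random
import zlib

# A clean, smooth "image": a repeating ramp compresses extremely well.
clean = bytes(i % 256 for i in range(20000))

# The same data with simulated compression artifacts: small random
# perturbations standing in for noise, ringing, and blockiness.
rng = random.Random(0)
noisy = bytes(min(255, max(0, b + rng.randint(-4, 4))) for b in clean)

print(len(zlib.compress(clean)))  # small: low entropy, easy to describe
print(len(zlib.compress(noisy)))  # larger: the artifacts look like detail
```

The compressor has no way to know the perturbations are garbage rather than texture, so it spends bytes faithfully reproducing them, which is exactly the mechanism behind re-encoded files that grow while looking worse.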

How Professionals Avoid It

In video production, the standard approach is to transcode source footage into an intermediate (or “mezzanine”) format before editing. Formats like ProRes and DNxHR are designed to be visually lossless, meaning they can survive multiple rounds of encoding and decoding without perceptible quality loss. A ProRes 4444 file, for example, retains up to 12-bit color depth and is commonly used as an editing master even though it’s technically not mathematically lossless, because the compression is gentle enough that artifacts don’t accumulate noticeably across generations.

For true archival preservation, organizations like the Library of Congress recommend genuinely lossless formats. Their 2024-2025 guidelines list FFV1 (a lossless video codec, version 3, in Matroska containers) as a preferred format for long-term video storage. For audio, lossless compression is explicitly recommended over lossy schemes. The logic is simple: if no data is discarded during compression, re-encoding introduces zero errors, and generation loss becomes impossible.
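The "zero errors, therefore no generation loss" claim can be demonstrated directly. Here zlib stands in for a lossless codec like FFV1 or FLAC: however many times the data is compressed and decompressed, the bytes come back identical.

```python
import hashlib
import zlib

data = bytes(range(256)) * 100  # stand-in for audio/video essence

current = data
for _ in range(50):                      # fifty "generations"
    encoded = zlib.compress(current)     # lossless encode
    current = zlib.decompress(encoded)   # decode for the next generation

# No data was discarded at any step, so fifty generations later the
# bytes (and therefore the hash) are unchanged.
assert hashlib.sha256(current).digest() == hashlib.sha256(data).digest()
```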

The practical rule for everyday use: always keep an uncompressed or lossless master copy of anything you plan to edit multiple times. Do your edits on that master, and only export to a lossy format (JPEG, MP4, MP3) as the final step.

Generation Loss in Analog Media

Before digital formats existed, generation loss was an unavoidable part of media production. Every time a VHS tape was copied, the analog signal picked up noise from the recording hardware. Photocopies of photocopies grew progressively blurrier and higher in contrast. Audio cassette dubs lost high-frequency detail with each pass. In analog systems, there’s no such thing as a perfect copy, so some degree of generation loss was simply accepted as the cost of reproduction.

Digital technology eliminated this problem for exact copies (a file copy is identical to the original), but reintroduced it through lossy compression. The mechanism changed, but the core issue remained the same: each processing step that discards information makes the next step worse.

Model Collapse: Generation Loss for AI

A striking modern parallel has emerged in artificial intelligence. When AI language models are trained on text generated by other AI models, and those models are in turn trained on text from still earlier AI models, quality degrades in a pattern researchers call “model collapse.” A 2024 study published in Nature demonstrated that this process causes irreversible defects: the models progressively lose the ability to represent rare or unusual content, and their outputs converge toward a narrow, repetitive average.

The mechanism mirrors traditional generation loss almost exactly. Each AI generation slightly distorts the data distribution it learned from. Rare events in the training data (the “tails” of the distribution) disappear first, because the model underrepresents them in its output. The next generation, trained on that already-narrowed output, loses even more. Over enough cycles, the model’s understanding of reality collapses to a single point with almost no variation.
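The tail-loss mechanism can be sketched with a crude simulation. This is an illustrative toy, not the Nature study's setup: "training on the previous generation's output" is modeled as resampling with replacement, so rare values that fail to be reproduced in one generation are gone for every generation after it.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Generation 0": samples from the true distribution (the real data).
data = rng.normal(loc=0.0, scale=1.0, size=500)
initial = data.copy()

# Each generation learns only from the previous generation's output,
# modeled here as resampling with replacement.
for _ in range(20):
    data = rng.choice(data, size=data.size, replace=True)

# The range can only shrink: a lost extreme value never comes back,
# and duplicates crowd out the distribution's diversity.
print(initial.min(), initial.max())
print(data.min(), data.max())
print(len(set(initial.tolist())), len(set(data.tolist())))
```

Real model collapse involves far more machinery (sampling temperature, approximation error, finite model capacity), but the one-way ratchet is the same: information absent from one generation's output is unavailable to all later generations.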

This has practical implications for the internet at large. As AI-generated text becomes more common online, future AI models trained on web-scraped data will inevitably ingest some of it. The researchers noted that data reflecting genuine human interactions will become increasingly valuable precisely because it hasn’t passed through this degenerative loop. In essence, the same principle that turns a sharp photograph into a blocky mess after ten re-saves can turn a sophisticated language model into a shallow one after enough recursive training cycles.