Re: Video file archiving



Artnut <art@xxxxxxxxx> wrote:
From this experience, I infer that unlike text files, the video files aren't
compressed well.

Because video files are already compressed. Try compressing a text
file, and then compressing the result; the second pass will give you
numbers just as crummy as when you try to compress video.
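
You can check this yourself. Here's a rough sketch in Python using the
standard zlib module. The word list and the sample "text file" are made
up for illustration; any ordinary text should behave about the same:

```python
import random
import zlib

# Hypothetical stand-in for a text file: 5000 words drawn from a small,
# made-up vocabulary (seeded so the run is repeatable).
words = ["flour", "sugar", "butter", "eggs", "bake", "until", "golden",
         "mix", "the", "and", "cookie", "recipe", "oven", "cool"]
rng = random.Random(0)
text = " ".join(rng.choice(words) for _ in range(5000)).encode("ascii")

once = zlib.compress(text)    # first pass: removes the redundancy
twice = zlib.compress(once)   # second pass: little or nothing left to remove

print("original:        ", len(text))
print("compressed once: ", len(once))
print("compressed twice:", len(twice))
```

The first pass should shrink the text a lot. The second pass should be
about the same size as the first, and often a few bytes bigger.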

Here's an example of 21 characters, where the algorithm (here a naive
implementation of Run Length Encoding) is to take runs of letters and
replace them by a count and the letter:

bbbbbbbbbbbbbbbbbbbbb --[compress1]--> 21b

Saying 21b is a lot smaller than writing them all out. Now try
compressing the already compressed result:

21b --[compress1]--> 12111b [worse!]

You won't get any better with this algorithm--it's already compressed.
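
For the curious, here's a minimal sketch of that naive algorithm in
Python. The function name compress1 just mirrors the arrows above; it's
an illustration, not a real-world RLE format:

```python
from itertools import groupby

def compress1(s: str) -> str:
    """Naive run-length encoding: each run becomes <count><character>."""
    return "".join(f"{len(list(run))}{ch}" for ch, run in groupby(s))

print(compress1("b" * 21))   # 21b
print(compress1("21b"))      # 12111b
```

Each of the three characters in "21b" is a run of length one, so every
one of them picks up a "1" in front and the output doubles in size.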

We could introduce a new compression algorithm to try to compress
further: "Divide the number by ten, and discard the remainder." Now
this happens:

21b --[compress2]--> 2b [wow--just two characters!]

Now when we uncompress it, we have to multiply by ten, and we get:

2b --[uncompress2]--> 20b

And uncompressing that with the first algorithm, we get:

20b --[uncompress1]--> bbbbbbbbbbbbbbbbbbbb

Which is really close to what we started with, just one "b" shy. We
managed to compress 21 characters into just 2 characters! 90%
compression!
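
Here's the lossy round trip as a small Python sketch. The function
names follow the arrows above, and uncompress1 assumes the original
text contained no digits (otherwise the naive format is ambiguous):

```python
import re

def compress2(s: str) -> str:
    """Lossy step: divide every run count by ten, discarding the remainder."""
    return re.sub(r"\d+", lambda m: str(int(m.group()) // 10), s)

def uncompress2(s: str) -> str:
    """Multiply every run count by ten; the discarded remainder stays lost."""
    return re.sub(r"\d+", lambda m: str(int(m.group()) * 10), s)

def uncompress1(s: str) -> str:
    """Expand <count><character> pairs back into runs (assumes no digits in the text)."""
    return "".join(ch * int(count) for count, ch in re.findall(r"(\d+)(\D)", s))

small = compress2("21b")
restored = uncompress1(uncompress2(small))

print(small)           # 2b
print(len(restored))   # 20
```

Note that with this scheme any run shorter than ten has its count
rounded down to zero and vanishes entirely, which is the kind of loss
you have to decide you can live with.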

The first algorithm operates by removing redundancy. (It is lossless,
like GIF, PNG, or gzip.) After the first pass, the redundancy it knows
how to find was gone, so it couldn't make any more gains. But the
original could still be reproduced exactly.

The second algorithm operates by discarding the least significant data. (It
is "lossy", like JPEG or MPEG.) You can gain more by discarding more
data. The data we lost prevented the original data from being exactly
reproduced, though it sure looks close! This is fine for photos and
movies, but maybe not so fine for a text file with your favorite cookie
recipe in it.

-Beej
