Standard error correction methods take blocks of data and enlarge them to protect against some specific set of errors. We lose the whole block if the errors fall outside this set.

For example, Hamming(7,4) uses 3 additional parity bits per 4 bits of information - it works fine as long as there is at most 1 error per 7-bit block.
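As a quick sketch (my own illustration, not from the original post), here is Hamming(7,4) encoding and single-error correction in Python, using the standard layout with parity bits at positions 1, 2 and 4:

```python
def hamming74_encode(d):
    # codeword layout (1-based): [p1, p2, d1, p3, d2, d3, d4]
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # parity over positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # parity over positions 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(c):
    # syndrome bits spell out the 1-based position of a single flipped bit
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]   # checks positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]   # checks positions 2, 3, 6, 7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]   # checks positions 4, 5, 6, 7
    pos = s1 + 2 * s2 + 4 * s3
    c = list(c)
    if pos:
        c[pos - 1] ^= 1              # correct the single-bit error
    return [c[2], c[4], c[5], c[6]]
```

Any single flipped bit in the 7-bit codeword is corrected; two flips in the same block, however, are silently miscorrected.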

We use quite a lot of redundancy, but is it safe now? The problem is the pessimistic cases - even if the expected error rate is 1 per 100 bits, it still happens quite often that some 7-bit block contains 2 errors.
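To make that concrete, assuming independent errors with per-bit probability 1/100, the chance that a 7-bit block contains 2 or more errors is about 0.2% - roughly one block in 500 is lost despite the redundancy:

```python
p = 0.01   # per-bit error probability
n = 7      # block length

# P(at least 2 errors) = 1 - P(0 errors) - P(exactly 1 error)
p_fail = 1 - (1 - p) ** n - n * p * (1 - p) ** (n - 1)
print(p_fail)   # about 0.00203, i.e. roughly 1 block in 500
```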

To be able to correct errors we usually add a constant density of redundancy, but the density of errors usually isn't constant - it fluctuates - sometimes it is above average, sometimes below.

When it is above - we need a lot of redundancy to cope with these pessimistic cases.

When it is below - we have used more redundancy there than was really needed - we waste some capacity.

The idea is to somehow transfer these unused surpluses of redundancy to help with the difficult cases.

Thanks to this, instead of adding redundancy according to the pessimistic error density, we could add it for a density just above the average, which is usually a few orders of magnitude smaller.

To do this we cannot place the data in independent blocks (like the 7-bit blocks of Hamming) - their redundancy would also be independent.

We should use a coding which allows us to hide redundancy so that each piece of it affects a large part of the message.

I'll now show an example of how this can be done.

Unfortunately it has variable latency - simple errors can be corrected quickly, but more complicated ones may take a long time... maybe there are better ways?

The idea is to use a very precise coding - such that any error makes the rest of the decoded sequence look like a completely random bit sequence.

For example a block cipher which uses the previous block to calculate the following one... but there is a much better coding for this that I will describe later.

Now add some easily recognizable redundancy to the information - for example, insert a '1' after each data bit (each such bit is hidden in all succeeding bits of the encoded message).

If while decoding it turns out that there is a '0' in one of these places, it means there was an error earlier - and most probably nearby.

Knowing the statistical characteristics of the expected errors, we can make a list of the most probable errors in such cases and sort it by probability - at the top of this list should be 'previous bit flipped',... after a while 'two bits flipped: ...' can appear. This list can be very large.

Now if we know that an error appeared somewhere nearby, we go through this list position by position, apply the correction as if it were really that case (flip some bits) and try to decode a further fixed number of bits (a few dozen).

If everything is OK - we get only '1's in the selected positions - we can assume it was this error.

If not - try the next one from the list.

This list can be generated on the fly - given a large amount of time, we could repair even a badly damaged transmission.

While creating the list, we have to remember that errors can also appear in the succeeding bits.
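A toy end-to-end sketch of the whole scheme (my own illustration, not the coding the author has in mind): a chained LCG scrambler stands in for the 'precise coding', so any channel error desynchronizes the decoder and pseudo-randomizes everything after it; markers are '1' bits interleaved with the data; the candidate list is just single-bit flips, tried nearest to the first marker violation first:

```python
MASK = (1 << 31) - 1

def step(state, c):
    # chain the state on every transmitted bit, so one flipped bit
    # desynchronizes the decoder for the rest of the message
    return (state * 1103515245 + c + 12345) & MASK

def key_bit(state):
    return (state >> 16) & 1

def encode(data_bits):
    # insert a '1' marker after every data bit, then scramble
    plain = []
    for b in data_bits:
        plain += [b, 1]
    state, out = 1, []
    for b in plain:
        c = b ^ key_bit(state)
        out.append(c)
        state = step(state, c)
    return out

def decode(code):
    # returns (data_bits, index of first marker violation or None)
    state, data = 1, []
    for i, c in enumerate(code):
        b = c ^ key_bit(state)
        state = step(state, c)
        if i % 2 == 1:          # marker position: must decode to '1'
            if b != 1:
                return data, i
        else:
            data.append(b)
    return data, None

def correct(code, search=40):
    # candidate errors, most probable first: here only single flips,
    # nearest to the violation first; accept a candidate only if all
    # remaining markers then check out (a real implementation would
    # look ahead a fixed number of bits instead of to the end)
    data, bad = decode(code)
    if bad is None:
        return data
    for pos in range(bad, max(bad - search, -1), -1):
        trial = list(code)
        trial[pos] ^= 1
        data, bad2 = decode(trial)
        if bad2 is None:
            return data
    raise ValueError("no single-flip candidate repaired the message")
```

Note the cost structure this post describes: half the transmitted bits are markers (large redundancy in this toy version), but a single flipped bit anywhere is located and repaired using the markers of the whole remaining message, not just of its own block.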

Using block ciphers is a bit nasty - they are slow, and we get large blocks to search for errors...

There is a new coding just ideal for the above purpose - Asymmetric Numeral Systems (ANS) - a new entropy coder which has very nice properties for cryptography... and data correction - it's much faster than block ciphers and uses small blocks of various lengths.

Here, for example, is a demonstration of it:

http://demonstrations.wolfram.com/Da...umeralSystems/
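For reference, here is a minimal sketch of the binary variant of ANS (uABS), written by me from the standard uABS formulas - exact rational arithmetic, with p the probability of bit '1'; note that ANS works like a stack (last in, first out), so decoding pops bits in reverse order:

```python
from fractions import Fraction
from math import ceil, floor

def uabs_encode(bits, p):
    # p = Fraction in (0, 1): probability of bit '1'
    x = 1                                    # initial state
    for s in bits:
        if s == 1:
            x = floor(Fraction(x) / p)
        else:
            x = ceil(Fraction(x + 1) / (1 - p)) - 1
    return x                                 # whole message in one number

def uabs_decode(x, n, p):
    # recover n bits; ANS is LIFO, so they come out last-encoded first
    out = []
    for _ in range(n):
        s = ceil(Fraction(x + 1) * p) - ceil(Fraction(x) * p)
        if s == 1:
            x = ceil(Fraction(x) * p)
        else:
            x = x - ceil(Fraction(x) * p)
        out.append(s)
    out.reverse()
    return out
```

A practical coder keeps the state in a fixed interval by streaming bits out, but this unbounded-state version shows the idea: the current state depends on every symbol encoded so far, which is exactly the error-propagation property the correction scheme above needs.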

What do you think about it?

agrees : Reed-Solomon codes are fairly common today in deep-space communication networks, and I've even used them over noisy terrestrial communication channels.