Copyright (c) Arturo San Emeterio Campos 1999. All rights reserved. Permission is granted to make verbatim copies of this document for private use only.
Download the whole article zipped.
Table of contents
Move To Front is a transformation algorithm which does not compress data but can help to reduce redundancy some times, like after a bwt transformation, where a symbol which has recently seen, appears again.
Mtf instead of outputting the symbol (byte), outputs a code which refers to the position of the symbol in a table with all the symbols, thus the length of the code is the same as the length of the symbol. (when using bytes as symbols, byte based output can be performed) Both encoder and decoder should initialize the table with the same symbols in the same positions. Once a symbol is being processed the encoder outputs its position in the table and then put it in the top of the table, position 0, All the codes from the position till the position of the symbol being coded are moved to the next position. This simple scheme assign codes with lower values for more redundant symbols. (symbols which appear more frequently) Let's say we have an string like the following: "abaacabad" Mtf will process it in this way:
"011021213" As you can see this is easier to entropy code. In runs of
the same byte, where mtf can transform "aaaaabbbb" to "000001000" an entropy
encoder (I mean order-0) will assign less bits to 0 which now occurs more
often than "a" or "b" did. Don't try to compress the output of mtf with
a sensitive context compressor, results will be poor, because this transformation,
though it's reversible, completely eliminates the contexts.
The first thing to do is having an array of 256 entries (in the case we have an alphabet with 256 entries, like when coding bytes), unsigned char list;
Once you have it, you have to initialize it, the code is very easy:
Then you start with the main loop, get a byte, and search it in the array:
for (index=0;;++index) scan the whole array, no need of end condition
if (_byte_ == bytes_order[index]) till we found it
break; //then, break the for
Once you are here, you output the byte, and then update the array, moving
all the bytes till the position of our match, and then putting in the position
0 the byte:
for ( ; index!=0 ; --index) scan from the position of the byte to 0
list[index] = list[index-1]; move all of them to the right
list=_byte_ update the order 0
This is the end of the main loop, do that for all the symbols to code.
Well, once you have done the encoder, the decoder is very easy. Like the encoder does, initialize the table, then the main loop, get a code, this is the position in the table, and then get the symbol from the table:
code = getc(file);
_byte_ = list[code]; get the byte
Now update the table, like the encoder does:
index = position; the position to start
list[index] = list[index-1];
This is everything you need for a mtf codec.
If you are a beginner I recommend you to learn lz77 and static huffman. If you find any mistake in this text, or have any idea, email me.
You can reach me via email at: firstname.lastname@example.org Also don't forget to visit my home page http://www.arturocampos.com where you can find more and better info. See you in the next article!
This article comes from Arturo Campos home page at http://www.arturocampos.com Visit again soon for new and updated compression articles and software.