Arturo Campos home page


- Compression programming
- Old articles
- Abstracts
- Links

Contact information


Here are the abstracts of every article, appart from an explanation it also has links to the online version in html format, and to a zipped version of the article which has the article, images, and other information needed (like source), at the right you also have the date of the last update of this article.

Arithmetic coding, entropy coder
Bwt, block reduction
Canonical huffman
Crc-32, the standard Crc
Finite context modeling
Flexible parsing, improvement of lz 
Lz77, also called lzss
Lzw, for gif decoding
Mtf, a transformation scheme
Implementing ppmc with hash tables
Quasi Static model
Range coder
Rle, Basic scheme

Title: Arithmetic coding, entropy coder Online: 20k Zip: 7k Date: 22-Jul-1999
Arithmetic coding is the only entropy coder which achieves the entropy, that's why it's widely used, it's only back draw is speed, though. In this article I talk about a simple implementation of arithmetic coding. Any programmer interested in compression can understand this article.

Title: Bwt, block reduction Online: 21k  Zip: 8k Date: 22-Jul-1999
Burrows and Wheeler transformation, an algorithm which transforms the input by sorting the suffixes and outputting an string which contains the order-(n-1) symbol, thus introducing a kind of redundancy which is then exploited with mtf and a final coder. This algorithm has proved to achieve high compression rates at high speeds. Probably the best compression algorithm invented in the 90s.

Title: Canonical huffman. Online: 15k  Zip: 5k Date: 23-Jul-1999
Canonical huffman is an improvement of huffman, which defines a set of rules for the codes based on the lengths and number of codes, thus it can avoid passing the codes themselfes to the decoder.

Title: Crc-32, the standard Crc Online: 6k Zip: 7k Date: 10-Aug-1999
An algorithm widely used for error detection, based on the input it creates a 32 bits number which identifies it, this number is transmitted along with the compressed data, so the decoder can check if there was any error at the compressed data.

Title: Finite context modeling Online: 37k Zip: 12k Date: 16-Nov-1999
This is an introduction to finite context modeling covering any basic topic, which in my opinion everybody interested in data compression should read, and specially those interested in ppm implementations and theory. It also discuss some basic topics. That article is of an introductory level so any begginer can read it, though it can also teach new things to programmers who already know compression but not finite context modeling.

Title: Flexible parsing, improvement of lz  Online: 10k Zip: 4k Date: 24-Jul-1999
Flexible is approach of optimal parsing in real time, it tries to achieve this by examing all the matches of suffixes in a match, therefore it lacks of a big worst case in very repetitive data. It assumes knowledge of any lz variant.

Title: Lz77, also called lzss Online: 40k Zip: 14k Date: 23-Jul-1999
Lz77 compresses the input file by substituting phrases of a given length with references in the form offset-length. This presents lz77 and a few improvements. Recommened for begginers.

Title: Lzw, for gif decoding Online: 27k Zip: 9k Date: 23-Jul-1999
Lempel-Ziv-Welch, the algorithm patented by Unisys, which is used in the widely used file format Gif, in this article not only the lzw is teached but also how to decompress a simple gif file. Remember that lzw is patented.

Title: Mtf, a transformation scheme Online: 9k Zip: 3k Date: 24-Jul-1999
Move To Front, is a simple scheme which gives more importance to the most frequent symbols by recoding the input, the algorithm itself doesn't compress, but like in the case of bwt it allows to more compression. That's easy to learn.

Title: Implementing ppmc with hash tables Online: 111k Zip: 327k Date: 21-March-2000
"Implementing ppmc with hash tables" explains ppmc and how to implement it without forgetting any aspect of it. It explains three different kinds of ppmc, using lazy exclusion, full exclusions (2.34bpb, high compression) and order-3h (fast with relative good compression, 2.66bpb).

Title: Quasi Static model Online: 19k Zip: 6k Date: 13-Aug-1999
Quasi Static Model, is a model used for arithmetic coding or with the range coder, it's an approach between an static model and an adaptative one, and it results in a combination of both, which is useful. Recomened for programmers who want a fast model for their order-0 arithmetic coder.

Title: Range coder Online: 24k Zip: 8k Date: 15-Aug-1999
This article talks about a version of arithmetic coding which is almost twice faster, but loses a little bit of compression, however it's too litle (0.01%) that you don't notice it. Recommened for programmers who already know arithmetic coding.

Title: Rle, Basic scheme Online: 7k Zip: 3k Date: 24-Jul-1999
Rle is a very simple scheme which tries to code runs of the same byte with the number of bytes that are equal. This is used in the pcx file format, though I don't talk about it. This is recommened for programmers with no experience about compression.

As you can read in the index I'm going to publish as soon as possible new articles, if you want to get an email when this happens, please fill this form, moreover you can use it to tell me which file format do you prefer. If you have any question about the articles feel free to email me.

My goal is to offer the best articles about compression for free. Thus I need your feedback on the articles to further improve them: What you don't know. What was confusing or unclear. What difficulties you found while implementing it. Or what you miss. Teeling me so is the only way of further improving articles for you, and depends only of you.

Welcome | Compression | Contact information
Copyright (c) Arturo San Emeterio Campos 2000. All rights reserved.