Introduction to lzp
by
Arturo San Emeterio Campos








Disclaimer
Copyright (c) Arturo San Emeterio Campos 2000. All rights reserved. Permission is granted to make verbatim copies of this
document for private use only.

Note
"Compression programming" is on a early stage. This text is not finished.
 

Table of contents
What is lzp?
In what do lzp variations differ, and which one should I choose?
 
 

What is lzp?

Lzp is a variation of lz77. Which is a compression algorithm invented by Lempel and Ziv. Lz is based on the fact that the same phrase tend to occur again. It then replaces the whole phrase with a code. That code represents a position in the text seen till the date. The code is generally much smaller than the phrase and thus the process compress the file.

Lzp is directly influenced by finite context models. Instead of choosing from all the positions available in the input file, it selects a single one. It chooses the last occurrence in the current context.
 

In what do lzp variations differ, and which one should I choose?

The main difference is the order of the context and the data structure used to access it. For order 2 and 3 we use a hash table. The hash value is made with the context, and in the entry we store an offset to the previous data. For higher orders such as 4 or 5 hash tables are too big and we have to use linked lists in every hash entry. That is instead of storing the pointer of one context, we store the pointers of all the contexts who share the same hash entry.

Order-3 with hash tables is used if we want to compress the files down to 3.4bpb to 6bpb at high speeds. We use order-4 or higher with linked lists for compression of 3.4bpb to 2.3bpb at lower speeds. The compression that it achieves depends mainly on another coder like huffman or ppmc. Also depends on other aspects as window size or dictionary updating.

If you want to implement order-3 hashed, read only Implementing lzp. If you want to use order-4 or 5, read it too, and after it go to Using linked lists to access higher orders
 


First version of this article: 15-Sept-2000 Last update of this article: 25-Sept-2000
This is part of Compression programming by Arturo Campos (email address: arturo@arturocampos.com).
http://www.arturocampos.com Visit again soon for new and updated compression articles and software.