"When Fibbonaci and Huffman met"
by
Arturo San Emeterio Campos








Disclaimer
Copyright (c) Arturo San Emeterio Campos 1999. All rights reserved. Permission is granted to make verbatim copies of this document for private use only.
 

Download
Download the whole article, also with the source code.
 

Table of contents

  • Introduction
  • Fibbonaci series
  • Fibbonaci and Huffman
  • Solution
  • My implementation
  • Closing words
  • Contacting the author

  •  

     
     
     
     
     
     

    Introduction
    This article is aimed to a programmer who has implemented Huffman, and wants it to be bug free. If you still don't know huffman learn it.
     

    Fibbonaci series
    I'll only explain the basics of it, if you want to know how to do an efficient implementation on it, look for an article about it in my h-page. (others) Fibbonaci series are defined that way:

  • Fib(0)=0
  • Fib(1)=1
  • Fib(n)=Fib(n-2)+Fib(n-1)

  • So, calculating it till fib(7):  0,1,1,2,3,5,8,13
     

    Fibbonaci and Huffman
    The problem comes when we have a set of probabilities, which is a fibbonaci serie. Look at this probabilities:
     
    Symbol 
     Probability 
    a
    1/19
    b
    2/19
    c
    3/19
    d
    5/19
    e
    8/19

    Those probabilities are Fib(6). (the 0 probability is not used) This will make the following huffman tree (when n means node):

      Root
       /  \
     e     n
          /   \
        d     n
             /   \
           c     n
                /   \
              b     a

    With the following codes:
     
    Symbol 
     Probability  Code
    a
    1/19 0000
    b
    2/19 0001
    c
    3/19 001
    d
    5/19 01
    e
    8/19 1

    This kind of tree is called lopsided, or unbalanced. As we can see we the maximum length is alphabetsize-1, where alphabetsize=5. (the different symbols) And here it is the problem, we only had 5 symbols, but if we had 256 symbols our maximum code could be 255! and usually we design our codecs to work with codes of 16 bits long, and even 32 bits long, but never 255 bits long.
     

    Solution
    Most implementations use 16 bits code -- no need of more -- so I'll deal with how to avoid it with 16 bits codes. The maximum code length is -- obviously -- 16, so we'll have a problem if we have a set of probabilities till fib(17), but for getting probabilities till fib(x) we need to get bytes:
      x
     S fib(i)
     i=0
    So for getting the fibbonaci probabilities for fib(6), we'll need to get at least: 0+1+1+2+3+5+8=20 bytes. And for Fib(16+1) we'll need 4180. So what we have to do? we'll change the probabilities, so they don't make an unbalanced tree, we'll do that dividing the probabilities by a factor, we'll choose 2. (just a shift right) First you gather the probabilities, if all the probabilities (added) make 4180, then you change the probabilities, we divide every probability by 2. But be careful, no probability should go beyond 1, or then you'll not assign any code to that probability. This operation is usually done shifting the value to the right one position, and perform an or with the value 1. Once it's less than 4180 you can construct the tree.
     

    My implementation
    In fact, all this theory is ok, but this wasn't how I did it. Because I wanted teh maximum length of a code to be 16, I kept a flag, if once any code went above 16, then I put this flag to 1, when finished doing the codes, I checked if the flag was set, if it was, then I divided all the probabilities by 2. Remember to store the original probabilities apart (so they aren't changed while building the tree) and to copy the new probabilies in the old ones, so you don't enter an infinite loop in the case that the first division can't make your maximum code length 16.
     

    Closing words
    In the zip there's also a little program in C which prints fib(x) and also computes this:
      x
     S fib(i)
     i=0
    Well, if ever your huffman code hanged, and you didn't knew why, this is -- probably -- the solution for it. If you've implemented huffman you should also do canonical huffman. If you found any errors, and have any idea about how to improve this article, email-me.
     

    Contacting the author
    You can reach me via email at: arturo@arturocampos.com  Also don't forget to visit my home page http://www.arturocampos.com where you can find more and better info. See you in the next article!
     
     

    Arturo San Emeterio Campos, Barcelona 23-Jul-1999


    This article comes from Arturo Campos home page at http://www.arturocampos.com Visit again soon for new and updated compression articles and software.