Bit input/output
by
Arturo San Emeterio Campos







Disclaimer
Copyright (c) Arturo San Emeterio Campos 2000. All rights reserved. Permission is granted to make verbatim copies of this
document for private use only.

Note
"Compression programming" is on a early stage. This text is not finished.
 

Table of contents
What is bitio?
Bit operations: or, and, shift
Writing a bit
Pseudo code for writing a bit
Reading a bit
Pseudo code for reading a bit
Writing a variable length value
Reading a variable length value and an example
Closing words
 
 

What is bitio?

Bit input/output is a collection of functions that perform input and output. While usual io is done with bytes, bitio works with values of variable integer length. That means that with bitio we can write a value of 1 bit or 4 bits. We don't need however to output all the codes with the same lengths.

There are different ways to do bitio. I'll explain the simplest and slowest one. This one outputs the bit value by calling a function which outputs a single bit a time. The fastest one instead writes the whole value at the same time. But it's conceptually difficult. My advice is that you implement the simple one with a clear interface. Then when you need a fast one, change the code but keep the interface.

The functions I'll explain are also huffman "compliant". That is, they have the right order for huffman codes. Still they can be used for anything else. It works for huffman because we output the codes left to right. That means that if we have a four bits value like "1010" first we output "1", then "0", "1" and "0". Making them usable for huffman isn't much work and it's worth it.

For the programmers that don't know yet, or don't remember quite well, the bit operations, the next section is a review of the instructions used for bitio.
 

Bit operations: or, and, shift

OR sets the bits that are 0 from the first parameter to 1 if they are 1 in the second parameter. We'll use it to set single bits. In a table like form (where the first row is the first parameter):
 
0 1
0 0 1
1 1 1

For example if we have the binary value "0010" and OR it with "1110" we get "1110".

First parameter  0110
Second parameter 1010
Result           1110

AND, if the bit from the second parameter is 1, the bit from the first one is not modified. If it's 0, it will be changed to 0. We use to clear bits we don't care about. Table:
 
0 1
0 0 0
1 0 1

For example if we only care about the rightmost bit, but the other ones could be non zero, we use AND. "011010" AND "000001", result: "000001". "101" AND "110", result: "100".

Shift. There are two variations, to the left and to the right. The first parameters is the value to modify. The second is the number of times to shift the value to the left or the right. We use it to move a bit from any position to the right most part, where it becomes of use. For example if we have "101001" and we want the fourth bit (right to left), we shift the value 3 times to the right "101". This combined with AND is used to isolate a bit.
 

Writing a bit

We'll use the bit operations that we have seen to put bits. We'll store them in a double word (32 bits, 4 bytes). There we put the bits right to left: if we put "0", "1" and "1", the output would be xxxx xx011 (where x means unknown, not written yet).

To put current bit in its position we have to shift it to the left. We have to keep a counter which has the position in which the bit has to go. We'll call that variable "bitio_used_bits" because its value is equal to the number of bits written to the double word, which we'll call "bitio_value". "bitio_value" must be initialized to 0, because we'll or bits there. Every time we write another bit, we increment "bitio_used_bits". For example, if we have to write "1","0" and "1":

-Let's write "1". "bitio_value"="0", "bitio_used_bits"=0
 - shift "1" 0 times to the left, result "1"
 - or "0" with "1", result "1"
 - increment "bitio_used_bits"

-Let's write "0". "bitio_value"="1", "bitio_used_bits"=1
 - shift "0" 1 times to the left, result "00"
 - or "01" with "00", result "01"
 - increment "bitio_used_bits"

-Let's write "1". "bitio_value"="01", "bitio_used_bits"=2
 - shift "1" 2 times to the left, result "100"
 - or "001" with "100", result "101"
 - increment "bitio_used_bits"

The last thing to care about is writing out the double word when it's full. To do so we check the counter and when it reaches 32 (or whatever we have as maximum number of bits), output the value and reinit the variables.

When the compression process is done, we have to care to flush the last bits written. In our case that's just a matter of writing "bitio_value" as usual.
 

Pseudo code for writing a bit

Putting the ideas together, we have the following code:

function "init_output"
  bitio_value=0
  bitio_used_bits=0

function "put_bit" accepts as a parameter "bit"
  bitio_value = bitio_value OR ( bit shift_left bitio_used_bits )
  bitio_used_bits = bitio_used_bits+1
  if( bitio_used_bits equals 32 )
    write out bitio_value
    bitio_value=0
    bitio_used_bits=0

function "end_bit_output"
  write out bitio_value
 

Reading a bit

Now that we know how to write a bit, reading one will be easy. We'll use the same variables as in output because output and input wouldn't be done at the same time. The process is more or less the opposite to output. We just have to read the bit from the right position, that we keep track of with "bitio_used_bits". Let's say we have "xxxxx011" and we read three bits:

-Let's read a bit. "bitio_value"="xxxxx101", "bitio_used_bits"=0
 - shift "xxxxx101" 0 times to the left, result "xxxxx101"
 - and "xxxxx101" with 1, result "00000001"
 - increment "bitio_used_bits"
 - return "00000001"

-Let's read a bit. "bitio_value"="xxxxx101", "bitio_used_bits"=1
 - shift "xxxxx101" 1 time to the left, result "xxxxxx10"
 - and "xxxxxx10" with 1, result "00000000"
 - increment "bitio_used_bits"
 - return "00000000"

-Let's read a bit. "bitio_value"="xxxxx101", "bitio_used_bits"=2
 - shift "xxxxx101" 2 times to the left, result "xxxxxxx1"
 - and "xxxxxxx1" with 1, result "00000001"
 - increment "bitio_used_bits"
 - return "00000001"

As while outputting we have to care to init the variables, and read a new double word when we have already used all the bits from the current one.
 

Pseudo code for reading a bit

This is the code which reads a bit a time from the input:

function "init_input"
  read bitio_value
  bitio_used_bits=0

function "get_bit" returns "bit"
  bit = ( bitio_value shift_right bitio_used_bits ) AND 1
  bitio_used_bits = bitio_used_bits+1
  if( bitio_used_bits equals 32 )
    read bitio_value
    bitio_used_bits=0
 

Writing a variable length value

Again for this process we'll use bit operations. Remember that we have to output the bits left to right if we want out bitio functions to be of use for huffman. Now we'll make a function which writes a value of variable integer length. It will call the routines we have done for outputting bits, thus its only job is to write the bits out properly.

If we have a four bits value like "1010" we output "1", "0", "1" and "0". Left to right. Our function will be called with the bits to write and the number of bits. We loop for every bit, get it and output it. Let's say our loop is a while based on "bits_to_put". It would look like the following:

while( bits_to_put doesn't equal 0 )
  select bit to output
  output bit
  decrement bits_to_put

The first time we loop bits_to_put equals 4. And to get the fourth bit we have to shift the value 3 times to the right. It's clear to see that we need to shift the value "bits_to_put-1" times to the right.

But that's not all, imagine we shift it one times, the result is "101". But we have to output a bit. Thus we clear all the bits but the rightmost one with an AND 1. Now it's your time, go and implement it.
 

Reading a variable length value and an example

We just have to read a bits from the input and place them in the right place. To select them we shift them to left "bits_to_read-1" for the same reason we did while putting them. To put them in the result value we have to or them, so start by setting this value to 0. This function accepts as a parameter the number of bits to read and returns the value. Instead of more pseudo code, let's see an example. The input is "01110101", and we need to read a three bits value:

-Read a value. "bitio_value"="01010011", "bits_to_read"=3, "result"="0"
 - get a bit, "1"
 - shift "1" "bits_to_read-1" times to the left (2), "100"
 - OR "result" with previous value, "100"
 - decrement "bits_to_read"

-Read a value. "bitio_value"="01010011", "bits_to_read"=2, "result"="100"
 - get a bit, "0"
 - shift "0" "bits_to_read-1" times to the left (1), "00"
 - OR "result" with previous value, "100"
 - decrement "bits_to_read"

-Read a value. "bitio_value"="01010011", "bits_to_read"=1, "result"="100"
 - get a bit, "1"
 - shift "1" "bits_to_read-1" times to the left (0, "1"
 - OR "result" with previous value, "101"
 - decrement "bits_to_read"
 

Closing words

All the above ideas, algorithms, examples and pseudo code make together our bitio library. Which we'll later use for many compression projects. Thus try to make it reusable, portable and that stuff good programmers should care about.

For clarity, the examples use 8 bits value. but in practice you should use 32 bits or whatever is the natural bit size for your platform.
 



First version of this article: 25-Sept-2000 Last update of this article: 25-Sept-2000
This is part of Compression programming by Arturo Campos (email address: arturo@arturocampos.com).
http://www.arturocampos.com Visit again soon for new and updated compression articles and software.