Bit input/output
by
Arturo San Emeterio Campos
Disclaimer
Copyright (c) Arturo San Emeterio Campos 2000. All rights reserved.
Permission is granted to make verbatim copies of this
document for private use only.
Note
"Compression programming" is on a early stage. This text is not finished.
Table of contents
What is bitio?
Bit operations: or, and,
shift
Writing a bit
Pseudo code for writing a
bit
Reading a bit
Pseudo code for reading a
bit
Writing a variable length
value
Reading a
variable length value and an example
Closing words
Bit input/output is a collection of functions that perform input and output. While usual io is done with bytes, bitio works with values of variable integer length. That means that with bitio we can write a value of 1 bit or 4 bits. We don't need however to output all the codes with the same lengths.
There are different ways to do bitio. I'll explain the simplest and slowest one. This one outputs the bit value by calling a function which outputs a single bit a time. The fastest one instead writes the whole value at the same time. But it's conceptually difficult. My advice is that you implement the simple one with a clear interface. Then when you need a fast one, change the code but keep the interface.
The functions I'll explain are also huffman "compliant". That is, they have the right order for huffman codes. Still they can be used for anything else. It works for huffman because we output the codes left to right. That means that if we have a four bits value like "1010" first we output "1", then "0", "1" and "0". Making them usable for huffman isn't much work and it's worth it.
For the programmers that don't know yet, or don't remember quite well,
the bit operations, the next section is a review of the instructions used
for bitio.
Bit operations: or, and, shift
OR sets the bits that are 0 from the first parameter to 1 if they are
1 in the second parameter. We'll use it to set single bits. In a table
like form (where the first row is the first parameter):
| 0 | 1 | |
| 0 | 0 | 1 |
| 1 | 1 | 1 |
For example if we have the binary value "0010" and OR it with "1110" we get "1110".
First parameter 0110
Second parameter 1010
Result
1110
AND, if the bit from the second parameter is 1, the bit from the first
one is not modified. If it's 0, it will be changed to 0. We use to clear
bits we don't care about. Table:
| 0 | 1 | |
| 0 | 0 | 0 |
| 1 | 0 | 1 |
For example if we only care about the rightmost bit, but the other ones could be non zero, we use AND. "011010" AND "000001", result: "000001". "101" AND "110", result: "100".
Shift. There are two variations, to the left and to the right. The first
parameters is the value to modify. The second is the number of times to
shift the value to the left or the right. We use it to move a bit from
any position to the right most part, where it becomes of use. For example
if we have "101001" and we want the fourth bit (right to left), we shift
the value 3 times to the right "101". This combined with AND is used to
isolate a bit.
We'll use the bit operations that we have seen to put bits. We'll store them in a double word (32 bits, 4 bytes). There we put the bits right to left: if we put "0", "1" and "1", the output would be xxxx xx011 (where x means unknown, not written yet).
To put current bit in its position we have to shift it to the left. We have to keep a counter which has the position in which the bit has to go. We'll call that variable "bitio_used_bits" because its value is equal to the number of bits written to the double word, which we'll call "bitio_value". "bitio_value" must be initialized to 0, because we'll or bits there. Every time we write another bit, we increment "bitio_used_bits". For example, if we have to write "1","0" and "1":
-Let's write "1". "bitio_value"="0", "bitio_used_bits"=0
- shift "1" 0 times to the left, result "1"
- or "0" with "1", result "1"
- increment "bitio_used_bits"
-Let's write "0". "bitio_value"="1", "bitio_used_bits"=1
- shift "0" 1 times to the left, result "00"
- or "01" with "00", result "01"
- increment "bitio_used_bits"
-Let's write "1". "bitio_value"="01", "bitio_used_bits"=2
- shift "1" 2 times to the left, result "100"
- or "001" with "100", result "101"
- increment "bitio_used_bits"
The last thing to care about is writing out the double word when it's full. To do so we check the counter and when it reaches 32 (or whatever we have as maximum number of bits), output the value and reinit the variables.
When the compression process is done, we have to care to flush the last
bits written. In our case that's just a matter of writing "bitio_value"
as usual.
Putting the ideas together, we have the following code:
function "init_output"
bitio_value=0
bitio_used_bits=0
function "put_bit" accepts as a parameter "bit"
bitio_value = bitio_value OR ( bit shift_left bitio_used_bits
)
bitio_used_bits = bitio_used_bits+1
if( bitio_used_bits equals 32 )
write out bitio_value
bitio_value=0
bitio_used_bits=0
function "end_bit_output"
write out bitio_value
Now that we know how to write a bit, reading one will be easy. We'll use the same variables as in output because output and input wouldn't be done at the same time. The process is more or less the opposite to output. We just have to read the bit from the right position, that we keep track of with "bitio_used_bits". Let's say we have "xxxxx011" and we read three bits:
-Let's read a bit. "bitio_value"="xxxxx101", "bitio_used_bits"=0
- shift "xxxxx101" 0 times to the left, result "xxxxx101"
- and "xxxxx101" with 1, result "00000001"
- increment "bitio_used_bits"
- return "00000001"
-Let's read a bit. "bitio_value"="xxxxx101", "bitio_used_bits"=1
- shift "xxxxx101" 1 time to the left, result "xxxxxx10"
- and "xxxxxx10" with 1, result "00000000"
- increment "bitio_used_bits"
- return "00000000"
-Let's read a bit. "bitio_value"="xxxxx101", "bitio_used_bits"=2
- shift "xxxxx101" 2 times to the left, result "xxxxxxx1"
- and "xxxxxxx1" with 1, result "00000001"
- increment "bitio_used_bits"
- return "00000001"
As while outputting we have to care to init the variables, and read
a new double word when we have already used all the bits from the current
one.
This is the code which reads a bit a time from the input:
function "init_input"
read bitio_value
bitio_used_bits=0
function "get_bit" returns "bit"
bit = ( bitio_value shift_right bitio_used_bits ) AND 1
bitio_used_bits = bitio_used_bits+1
if( bitio_used_bits equals 32 )
read bitio_value
bitio_used_bits=0
Writing a variable length value
Again for this process we'll use bit operations. Remember that we have to output the bits left to right if we want out bitio functions to be of use for huffman. Now we'll make a function which writes a value of variable integer length. It will call the routines we have done for outputting bits, thus its only job is to write the bits out properly.
If we have a four bits value like "1010" we output "1", "0", "1" and "0". Left to right. Our function will be called with the bits to write and the number of bits. We loop for every bit, get it and output it. Let's say our loop is a while based on "bits_to_put". It would look like the following:
while( bits_to_put doesn't equal 0 )
select bit to output
output bit
decrement bits_to_put
The first time we loop bits_to_put equals 4. And to get the fourth bit we have to shift the value 3 times to the right. It's clear to see that we need to shift the value "bits_to_put-1" times to the right.
But that's not all, imagine we shift it one times, the result is "101".
But we have to output a bit. Thus we clear all the bits but the rightmost
one with an AND 1. Now it's your time, go and implement it.
Reading a variable length value and an example
We just have to read a bits from the input and place them in the right place. To select them we shift them to left "bits_to_read-1" for the same reason we did while putting them. To put them in the result value we have to or them, so start by setting this value to 0. This function accepts as a parameter the number of bits to read and returns the value. Instead of more pseudo code, let's see an example. The input is "01110101", and we need to read a three bits value:
-Read a value. "bitio_value"="01010011", "bits_to_read"=3, "result"="0"
- get a bit, "1"
- shift "1" "bits_to_read-1" times to the left (2), "100"
- OR "result" with previous value, "100"
- decrement "bits_to_read"
-Read a value. "bitio_value"="01010011", "bits_to_read"=2, "result"="100"
- get a bit, "0"
- shift "0" "bits_to_read-1" times to the left (1), "00"
- OR "result" with previous value, "100"
- decrement "bits_to_read"
-Read a value. "bitio_value"="01010011", "bits_to_read"=1, "result"="100"
- get a bit, "1"
- shift "1" "bits_to_read-1" times to the left (0, "1"
- OR "result" with previous value, "101"
- decrement "bits_to_read"
All the above ideas, algorithms, examples and pseudo code make together our bitio library. Which we'll later use for many compression projects. Thus try to make it reusable, portable and that stuff good programmers should care about.
For clarity, the examples use 8 bits value. but in practice you should
use 32 bits or whatever is the natural bit size for your platform.