Disclaimer
Copyright (c) Arturo San Emeterio Campos 1999. All rights reserved.
Permission is granted to make verbatim copies of this document for private
use only.
Download
Download the whole article zipped.
Table of contents
Introduction
Here we are again, today let's talk about an interesting topic: Crc.
I'll not deal with all the theory behind it, I'll only present the way
of calculating the standard 'Crc 32 bits' that so many progams use (pkzip,
arj...) If you want to learn why and how it works, have a look at one article
by Ross Williams that is available at his
home page . What's the use of a Crc? just to check if a file has been
modified, it's used by most compressors and also by some anti-virus, it's
used for error detection.
Crc-32
As we don't want to intrude a lot of redundancy (usually we don't want
bigger files, but rather smaller ones) we keep a number and change it with
every new byte processed, and luckily we do not produce two equal crc's
(while this is possible, the probability is too low) so this value can
represent the file.
Crc, Cyclic Redundancy Code. In this case we use a 32 bits value, a double word, also named unsigned long (in C). What we'll do is to keep a table of double words with 256 entries (exactly, one for every input byte), and xor the output value (the crc) with the value in the table, choosed xoring the input byte and Crc value, before xoring them you should shift to the right 8 times the Crc value. If the definition isn't clearer enough, look at the pseudo code.
Attached in the zip file there's the table for assembler (pc), and for C. In case you need, I can send it to you in binary form.
Before starting doing the whole process you should init the Crc value
to 0FFFFFFFFh (the value used in this Crc) and when it's finished xor it
with 0FFFFFFFFh. Once this is over this value is ready to be saved, along
with the compressed data (in case of a compressor) then the decompressor
decompress the data, calculates the Crc32 for this data, and if it's not
equal to the Crc saved, then the Data is corrupted. (probably media error)
Pseudo Code
Here it is the pseudo code for doing it:
Checking the check
The example from Ross Williams's guide: the string "123456789" should
produce a Crc value of: CBF43926h
The include (asm version) with the table has a Crc value of: 4611B565h.
If you want more examples just get winzip and read the Crc of some files
and compare it with the result of your implementation.
Closing words
If you can't get it to work, you can get a C implementation at Crblib,
by Charles Bloom. And if you want to
learn more about crc's, you are encouraged to read "A PAINLESS GUIDE TO
CRC ERROR DETECTION ALGORITHMS" by Ross N. Williams, available at his home
page.
Contacting the
author
You can reach me via email at: arturo@arturocampos.com
Also don't forget to visit my home page http://www.arturocampos.com
where you can find more and better info. See you in the next article!
This article comes from Arturo Campos home page at http://www.arturocampos.com Visit again soon for new and updated compression articles and software.