Optimizing Huffman, part 3

So far, we’ve found a number of fairly low-cost (in terms of code and complexity) ways to tune the Huffman program for better performance.

Still, after thinking about the problem for a bit, I did see some other potential areas for improvement.  In particular, the fact that the encoding process was being done essentially bit-by-bit (appending 1 bit at a time to the buffer) seemed inefficient.  If there was only some way to append a whole bunch of bits at once (i.e. the entire encoding for a given character)…

simplehuffman4.c does precisely this.

Continue reading

Optimizing Huffman, part 2

In spite of the improvements made to the Huffman program’s use of I/O functions in the previous post, it’s fairly clear that more can be done.  In particular, looking at the profiles for encoding and decoding, the fgetc/fputc pair still account for a significant amount of the program’s runtime.  While we succeeded in getting rid of them for reading and writing normal (unencoded) data, we still used them for the encoded data.  It turns out that this isn’t too hard to fix either.

simplehuffman3.c implements this idea.

Continue reading

Profiling and optimizing Huffman

The Huffman example I posted last month has the virtue of being quite simple.  On the flip-side, it has some room for improvement, particularly when it comes to performance.

Of course no less a computer scientist than Don Knuth once said that “Premature Optimization is the root of all evil.”  So it makes sense to look for the most obvious bottlenecks first.

simplehuffman2.c is the optimized version.  A description of the approach follows.

Continue reading