
It’s been a busy few days.
Thursday at 5:30PM I took my final exam for my summer algorithms course. Then I went back to my grandparents’ place and packed. I went to bed around midnight.

It’s been a busy few days.
Thursday at 5:30PM I took my final exam for my summer algorithms course. Then I went back to my grandparents’ place and packed. I went to bed around midnight.
So far, we’ve found a number of fairly low-cost (in terms of code and complexity) ways to tune the Huffman program for better performance.
Still, after thinking about the problem for a bit, I did see some other potential areas for improvement. In particular, the fact that the encoding process was being done essentially bit-by-bit (appending 1 bit at a time to the buffer) seemed inefficient. If there was only some way to append a whole bunch of bits at once (i.e. the entire encoding for a given character)…
simplehuffman4.c does precisely this.
In spite of the improvements made to the Huffman program’s use of I/O functions in the previous post, it’s fairly clear that more can be done. In particular, looking at the profiles for encoding and decoding, the fgetc/fputc pair still account for a significant amount of the program’s runtime. While we succeeded in getting rid of them for reading and writing normal (unencoded) data, we still used them for the encoded data. It turns out that this isn’t too hard to fix either.
simplehuffman3.c implements this idea.
The Huffman example I posted last month has the virtue of being quite simple. On the flip-side, it has some room for improvement, particularly when it comes to performance.
Of course no less a computer scientist than Don Knuth once said that “Premature Optimization is the root of all evil.” So it makes sense to look for the most obvious bottlenecks first.
simplehuffman2.c is the optimized version. A description of the approach follows.