Optimizing Huffman, part 2

In spite of the improvements made to the Huffman program’s use of I/O functions in the previous post, it’s fairly clear that more can be done.  In particular, looking at the profiles for encoding and decoding, the fgetc/fputc pair still account for a significant amount of the program’s runtime.  While we succeeded in getting rid of them for reading and writing normal (unencoded) data, we still used them for the encoded data.  It turns out that this isn’t too hard to fix either.

simplehuffman3.c implements this idea.

Continue reading

Profiling and optimizing Huffman

The Huffman example I posted last month has the virtue of being quite simple.  On the flip-side, it has some room for improvement, particularly when it comes to performance.

Of course no less a computer scientist than Don Knuth once said that “Premature Optimization is the root of all evil.”  So it makes sense to look for the most obvious bottlenecks first.

simplehuffman2.c is the optimized version.  A description of the approach follows.

Continue reading

A simple Huffman implementation

Huffman encoding uses some interesting insights about data to achieve compression.  It’s reasonably simple, and exposes some interesting questions about data representation and optimization.

simplehuffman.c – (300 lines of C) is a program I wrote in an afternoon to do Huffman encoding and decoding of files.  It will work on any type of file and can compress and decompress a 15MB text file in under 2 seconds on current hardware.

Continue reading

The Coast to Coast Walk

P6207470 Across England – Walking Wainwright’s Coast to Coast route

Preparations – What we did to prepare for the walk

Day 0 – Getting to Kirkby Stephen – In which we all somehow manage to arrive in Kirkby Stephen in time for dinner.

Day 1 – St. Bees to Ennerdale Bridge – In which we set off on our journey only to immediately get lost. Also: an introduction to two English institutions: the pub and the weather, and a lesson in country living.

Day 2 – Ennerdale Bridge to Longthwaite – In which we continue our way across the Lake District and make a worthwhile detour up into the hills in search of adventure. Also: getting an understanding for the youth hostel system.

Day 3 – Longthwaite to Grasmere – In which we are thoroughly drenched en route to Wordsworth’s home and celebrate the birthday of our guides in style.

Day 4 – Grasmere to Patterdale – In which a complete absence of rain provides the perfect excuse to visit England’s third highest mountain and descend via the aptly named ‘Striding Edge.’ Also: experiencing a proper evening in the pub.

Day 5 – Patterdale to Shap – In which we ascend Kidsty Pike without being able to see it (or much of anything else) before leaving the hills of the Lake District for good. Also: admiring the handiwork of King Henry VIII and learning a valuable lesson about sheep.

Day 6 – Shap to Kirkby Stephen – In which we enjoy the benefits of a cream tea in the morning and a massive downpour in the evening crossing the middle of England.

Day 7 – Kirkby Stephen to Keld – In which a wet day is made considerably soggier thanks to the presence of numerous bogs and we encounter the happiest farm in England. Also: a brief stop at Nine Standard Riggs.

Day 8 – Keld to Reeth – In which we cross the Pennines and pass numerous relics of Britain’s mining past. Also: what to do when the afternoon proves rain-free.

Day 9 – Reeth to Richmond – In which a short leg ends in our first encounter with a real city. Also: trying out England’s new national dish at dinner.

Day 10 – Richmond to Oaktree Hill – In which we miss the rainy weather and sweat our way across the lowlands. Also: learning how to fill Sunday afternoon in a really small town.

Day 11 – Oaktree Hill to Clay Bank Top (Chop Gate) – In which we continue to bake even while leaving the lowlands for the moors. Also: the incredible friendliness of folks in the countryside and the case for leaving the urban grind.

Day 12 – Clay Bank Top to Blakey Ridge (Rosedale Abbey) – In which the weather mercifully cools off allowing us to traverse the Yorkshire Moors in relative comfort. Also: the peculiar story of the grouse, the gentry and the moors.

Day 13 – Blakey Ridge to Grosmont – In which we catch numerous glimpses of the ocean descending off the moors and are treated to some of England’s finest villages. Also: learning a bit about British railroads.

Day 14 – Grosmont to Robin’s Hood Bay (Kirkby Stephen) – In which we conclude our walk with a stretch along the rugged coast and help move England slightly eastward. Also: the best vegetarian dinner in Cumbria and swapping stories with 6 other hikers on the same hike.

Day 14 + 1 – Conclusion – In which we find ourselves in York and discover the difference between being a tourist and being a hiker. Also: the nuisances of luggage.

Lightroom oddity

Lightroom CPU usageAdobe Lightroom is pretty much the de-facto standard for DAM (digital asset management) and bulk editing among professional and serious amateur digital photographers. As a heavy user since the beta of version 1 in mid-2006 (we are currently at version 3.4), I can appreciate why that is. Lightroom offers a reasonably intuitive well-designed interface for editing and organizing large numbers of images. It has a number of minor flaws, but it compares favorably to pretty much every competing software package I’ve tried.

That said, from an architecture point of view, it seems that Lightroom could stand some improvement. My basic complaint is that the software is sluggish. Not tear-your-hair out slow mind you, but lethargic enough to be annoying. And this is when dealing with 12MP RAW image files, on a machine that has 4 2.8GHZ cores and 6GB RAM. That is to say image sizes aren’t that large, and the hardware is pretty current.

Continue reading

Leaker: a simple C leak detection example

Following up on my previous post, I’ve posted right below an actual working example of a simple C leak detector. It pretty much follows the outline given in that post. There is a global table to keep track of allocations and deallocations. Anything left in the table at the end of the program is known to have been leaked. By recording filenames and line numbers as well as memory addresses, the table makes it possible to know exactly where leaked allocations were made.

Leaker code. ~180 lines of ANSI C.

Continue reading

Olympux XZ-1 – A Serious Compact Camera?

Olympus XZ-1

I’ve been looking, on and off for a serious compact digital camera for quite some time. It’s not that I’m unhappy with my DSLR, but I have developed a certain lack of enthusiasm for carrying it everywhere. The DSLR is large, heavy and obtrusive. People notice me carrying it. I notice me carrying it. For quality, it can’t be beat, but for activities where photography is the primary goal, it often seems like overkill.

My search began in earnest after a couple of trips last summer where I was forced to put the camera away in the pack to keep it from interfering with climbing. While this helped with climbing, it also meant that I missed photographing a number of interesting moments.

Continue reading

Disk I/O in C – avoid fgetc/fputc

Most programs don’t do that much disk I/O. But for those that do, the disk I/O is often the bottleneck. I discovered this firsthand working on my Huffman tree encoding program. On small data files, it doesn’t matter much how you read or write your data, but when the filesizes get into hundreds of megabytes, implementation makes a big difference.

The 3 basic C functions for reading (and writing) unformatted text files are fgetc/fputc, fgets/fputs, and fread/fwrite. fgetc reads in one character at a time, fgets one line at a time and fread some number of ‘records’ at a time. As a record is just a sequence of bytes of a defined length, what fread really let you do is read in data in chunks whose size are recordsize * number of records. The respective put/write functions do the reverse.

Continue reading

Detecting Memory Leaks in C

Memory leaks are a common occurrence in C programs, particularly for novice programmers.  Because memory allocation and deallocation is left to the programmers, it’s not difficult to lose track of what was allocated where, and forget to deallocate it once it is no longer in use.  The is particularly true for more complex data structures which may have dynamically allocated blocks of memory that reference other dynamically allocated blocks of memory.

There are a fair number of tools to deal with this, like valgrind.  Still, building a simple detector is a relatively straightforward exercise, and can often reveal interesting things.

Continue reading

Installing Android x86 in a virtual machine

Ever wanted to play around with Android, without actually buying an Android phone? Thanks to the Android x86 project you can. While Google only officially supports the ARM platform, a group of volunteers has ported it to x86. What’s more, it runs more or less seamlessly inside a virtual machine, so there’s no danger of accidentally messing up your main system.

The process is pretty straightforward.

android.jpg

Continue reading