Simplecrypt: fun with encryption

The XXTEA encryption algorithm (a.k.a. Corrected Block TEA) is an update to the original Tiny Encryption Algorithm (TEA) introduced in 1994 by Roger Needham and David Wheeler.  Designed expressly for simplicity, the algorithm itself can be expressed in a mere 20 lines of C code.  Although it has been shown to be susceptible to a chosen-plaintext attack, it is reasonably secure for basic uses.
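To give a sense of how compact the cipher is, here is a sketch of XXTEA that follows the widely circulated reference code, spread out a little for readability.  The function names `xxtea_encrypt` and `xxtea_decrypt` are my own labels for illustration, and this is not necessarily how simplecrypt.c lays things out.  It works in place on a buffer of at least two 32-bit words with a 128-bit key.

```c
#include <stddef.h>
#include <stdint.h>

#define XXTEA_DELTA 0x9e3779b9u

/* Mixing function from the published XXTEA reference code. */
#define XXTEA_MX ((((z >> 5) ^ (y << 2)) + ((y >> 3) ^ (z << 4))) ^ \
                  ((sum ^ y) + (key[(p & 3) ^ e] ^ z)))

/* Encrypt n 32-bit words in place (n >= 2) using a 128-bit key. */
void xxtea_encrypt(uint32_t *v, size_t n, const uint32_t key[4])
{
    uint32_t y, z, sum = 0;
    size_t p;
    unsigned e, rounds;

    if (n < 2)
        return;                      /* a single word is not a valid block */
    rounds = (unsigned)(6 + 52 / n);
    z = v[n - 1];
    do {
        sum += XXTEA_DELTA;
        e = (sum >> 2) & 3;
        for (p = 0; p < n - 1; p++) {
            y = v[p + 1];
            z = v[p] += XXTEA_MX;
        }
        y = v[0];                    /* last word wraps around to the first */
        z = v[n - 1] += XXTEA_MX;
    } while (--rounds);
}

/* Decrypt n 32-bit words in place; exact inverse of xxtea_encrypt. */
void xxtea_decrypt(uint32_t *v, size_t n, const uint32_t key[4])
{
    uint32_t y, z, sum;
    size_t p;
    unsigned e, rounds;

    if (n < 2)
        return;
    rounds = (unsigned)(6 + 52 / n);
    sum = rounds * XXTEA_DELTA;
    y = v[0];
    do {
        e = (sum >> 2) & 3;
        for (p = n - 1; p > 0; p--) {
            z = v[p - 1];
            y = v[p] -= XXTEA_MX;
        }
        z = v[n - 1];                /* first word wraps around to the last */
        y = v[0] -= XXTEA_MX;
        sum -= XXTEA_DELTA;
    } while (--rounds);
}
```

Calling `xxtea_encrypt(buf, n, key)` followed by `xxtea_decrypt(buf, n, key)` should hand back the original words.  Padding, chaining and key derivation are all left out here; those are the "several techniques" a real file encryptor has to add on top.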

Simplecrypt is a bare-bones implementation of file encryption using XXTEA as a block cipher in conjunction with several techniques to allow it to work with reasonable security and efficiency on standard user files.  The program is written in ANSI C and depends on a few POSIX library calls.

simplecrypt.c (~7k)

Continue reading


The last part of the project for my database class involves building a web front-end for the database we’ve designed and built (information on music, like iTunes).  Fair enough.

Unfortunately, the choice of implementation languages was Java (which I do know) and PHP (which I don’t).  So I was more or less obliged to build the thing using Java servlets.

Mind you, I have no doubt that servlets are quite indispensable in many places.  Creating an HTML front-end for SQL queries is not one of those places though.  Hopefully it works for the presentation tomorrow, but in its current state the code ranks as some of the ugliest I’ve ever written.

Guess that’s good motivation to learn PHP before the next time…

Installing xv6 on MacOS X


For the OS course this semester, we’re using xv6, a simple operating system based upon Sixth Edition Unix (Unix V6), but rewritten from scratch for modern hardware and compilers.

The process for building xv6 on MacOS X is slightly more involved than on other systems.  It took me a few tries to get it right, so here it is.  These instructions are for MacOS X 10.7 (Lion), although they should be similar for other versions.  They’re based on the original MIT instructions.

Continue reading


strlang – a simple language for string manipulation

strlang is a programming language I created with the goal of making string manipulation simple and straightforward.  It is an imperative language with a minimalist syntax.  The language and its compiler were written as part of the Programming Languages and Translators (COMS 4115) course at Columbia University in Fall 2011.


  1. Basic data types are strings, numbers and maps (sets of key-value pairs)
  2. Full set of operators for arithmetic, string manipulation (including basic regular expressions) and map construction
  3. C-like structure including functions, loops, conditionals and expressions
  4. No keywords


Continue reading


While computers have changed a lot in the last 20 years, we still use clock speed (MHz or GHz) as the primary metric for describing speed.  Unfortunately, the manufacturers don’t make it easy.  First AMD and then Intel switched away from labeling their processors by clock.  Thus if you purchase a new machine today, the processor is likely to be a ‘Core i3 2100’ or a ‘Phenom X4 2200’.  The numbers that they use after the processor type aren’t the clock speed; rather, they’re some sort of internally designated model number.

Continue reading

MacOS X 10.7 compilers

As part of a little project to make detecting memory/pointer errors easier for beginning C/C++ programmers, I’ve installed a number of different compilers on my system.  I wanted to make sure that my approach was widely applicable.

At this point, there are 4 (3.5 really) major C/C++ compilers available for MacOS X 10.7.  What follows is a brief description of each, and some background as to how we got here.

Continue reading

Building gcc on MacOS X

Apple has never included a stock version of GCC with their development tools, but now with the current version of Xcode, they don’t even include a modified version.  Seeing as their plan going forward is to move entirely to Clang/LLVM, if you intend to use GCC on OS X, you’ll have to build it yourself.  It’s not a terribly difficult process, but it can be a bit tricky the first few times around, particularly when it comes to configuring the build.

Instructions follow.

Continue reading

Notes on Building a C/C++ Leak Detector

Building a simple C memory leak detector is not too difficult.  I did it in a previous post in less than 200 lines.  But once you add features, it becomes a lot more complicated.

What features am I talking about?

1) Allow use in programs with more than 1 source file.
2) Make it fast enough to use in reasonably large programs.
3) C++ support (track allocations made with new and delete).
4) Detect use of mismatched allocation/deallocation (e.g. new and free).
5) Detect simple buffer overflows (writing off the end of an array). 
6) Be generally helpful (errors should be handled gracefully). 
7) Test everything well.

And of course it goes without saying that I wanted to keep it portable, and as simple as possible.
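For context, the simple starting point might look roughly like the sketch below: wrap `malloc` and `free`, keep a record of every live allocation, and report whatever is left at the end.  All of the names here (`track_malloc`, `track_free`, `track_report` and the `TRACK_*` macros) are hypothetical and only for illustration; this is not the code from the earlier post.

```c
#include <stdio.h>
#include <stdlib.h>

/* One record per live allocation (hypothetical layout for illustration). */
struct alloc_rec {
    void *ptr;
    size_t size;
    const char *file;
    int line;
    struct alloc_rec *next;
};

static struct alloc_rec *live_allocs = NULL;

/* Allocate memory and remember where the request came from. */
void *track_malloc(size_t size, const char *file, int line)
{
    void *ptr = malloc(size);
    struct alloc_rec *rec;

    if (ptr == NULL)
        return NULL;
    rec = malloc(sizeof *rec);
    if (rec == NULL) {
        free(ptr);
        return NULL;
    }
    rec->ptr = ptr;
    rec->size = size;
    rec->file = file;
    rec->line = line;
    rec->next = live_allocs;
    live_allocs = rec;
    return ptr;
}

/* Free a tracked pointer.  Returns 0 on success, or -1 (without freeing)
   if the pointer is not currently tracked. */
int track_free(void *ptr, const char *file, int line)
{
    struct alloc_rec **link;

    if (ptr == NULL)
        return 0;
    for (link = &live_allocs; *link != NULL; link = &(*link)->next) {
        if ((*link)->ptr == ptr) {
            struct alloc_rec *rec = *link;
            *link = rec->next;
            free(rec);
            free(ptr);
            return 0;
        }
    }
    fprintf(stderr, "%s:%d: free of untracked pointer %p\n", file, line, ptr);
    return -1;
}

/* Print every allocation that is still live; returns how many there are. */
size_t track_report(void)
{
    size_t count = 0;
    const struct alloc_rec *rec;

    for (rec = live_allocs; rec != NULL; rec = rec->next) {
        fprintf(stderr, "leak: %lu bytes allocated at %s:%d\n",
                (unsigned long)rec->size, rec->file, rec->line);
        count++;
    }
    return count;
}

#define TRACK_MALLOC(n) track_malloc((n), __FILE__, __LINE__)
#define TRACK_FREE(p)   track_free((p), __FILE__, __LINE__)
```

Even this small version hints at why the feature list gets complicated.  The linked list makes every free a linear search, which runs straight into feature 2, and nothing here knows about `new`/`delete` or writes past the end of a block.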

Continue reading

Optimizing Huffman, part 3

So far, we’ve found a number of fairly low-cost (in terms of code and complexity) ways to tune the Huffman program for better performance.

Still, after thinking about the problem for a bit, I did see some other potential areas for improvement.  In particular, the fact that the encoding process was being done essentially bit-by-bit (appending 1 bit at a time to the buffer) seemed inefficient.  If only there were some way to append a whole bunch of bits at once (i.e. the entire encoding for a given character)…

simplehuffman4.c does precisely this.
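As a rough illustration of the idea (not the actual code from simplehuffman4.c), one way to append a whole code at once is to shift it into an accumulator and write out complete bytes as they fill up.  The `struct bitbuf` layout and the `bitbuf_append`/`bitbuf_flush` names are assumptions made for this sketch, and it assumes no code is longer than 32 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical output buffer: bits are packed MSB-first into bytes. */
struct bitbuf {
    uint8_t *bytes;   /* caller-provided storage */
    size_t cap;       /* capacity of bytes[] */
    size_t len;       /* complete bytes written so far */
    uint64_t acc;     /* pending bits, right-aligned */
    unsigned nacc;    /* number of pending bits (under 8 between calls) */
};

/* Append the low nbits bits of code (1 <= nbits <= 32) in one step.
   Returns 0 on success, -1 on a bad length or a full buffer. */
int bitbuf_append(struct bitbuf *b, uint32_t code, unsigned nbits)
{
    if (nbits == 0 || nbits > 32)
        return -1;
    if (nbits < 32)
        code &= ((uint32_t)1 << nbits) - 1;   /* drop any stray high bits */

    b->acc = (b->acc << nbits) | code;
    b->nacc += nbits;

    /* Emit every complete byte, most significant bits first. */
    while (b->nacc >= 8) {
        if (b->len >= b->cap)
            return -1;
        b->nacc -= 8;
        b->bytes[b->len++] = (uint8_t)(b->acc >> b->nacc);
    }
    b->acc &= ((uint64_t)1 << b->nacc) - 1;   /* keep only the pending bits */
    return 0;
}

/* Write out any leftover bits, zero-padded on the right to a full byte. */
int bitbuf_flush(struct bitbuf *b)
{
    if (b->nacc > 0) {
        if (b->len >= b->cap)
            return -1;
        b->bytes[b->len++] = (uint8_t)(b->acc << (8 - b->nacc));
        b->acc = 0;
        b->nacc = 0;
    }
    return 0;
}
```

With a per-character table of (code, length) pairs built from the Huffman tree, encoding a character becomes a single `bitbuf_append` call instead of one call per bit.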

Continue reading