The XXTEA encryption algorithm (a.k.a. Correct Block TEA) is an update to the original Tiny Encryption Algorithm (TEA) introduced in 1988 by Roger Needham and David Wheeler. Designed expressly for simplicity, the algorithm itself can be expressed in a mere 20 lines of C code. In spite of having been shown susceptible to a chosen-plaintext-attack, for basic uses it is actually reasonably secure.
Simplecrypt is a bare-bones implementation of file encryption using XXTEA as a block cipher in conjunction with several techniques to allow it work with reasonable security and efficiency on standard user files. The program is written in ANSI C and depends on a few POSIX library calls.
simplecrypt.c (~7k)
To use XXTEA securely as a block cipher for file encryption, simplecrypt addresses several issues:
- Randomness. Applying XXTEA to the same block twice with the same key will yield the same result. In practice this can let somebody intercepting your communications know a great deal, even if the communication itself is encrypted. The fix is to ensure that the same block almost never gets encrypted. This is done by applying a transformation to the original data block, usually by combining it with a randomly generated ‘initialization vector.’ The IV need not be secret – indeed in practice it almost never is – but it does greatly improve security.
- Padding. Normal user data isn’t always a convenient length for applying cryptographic primitives to. Padding it is therefore necessary, but to ensure proper decryption, it is also necessary to be able to know where the real data ends and the padding begins. It is advisable to use random padding to prevent leaking information (as a known padding value would). As a result, it’s helpful to include the length of the file in a predefined position (I choose the first word of the data block for this).
- Multiple blocks. It’s not particularly convenient to encrypt large amounts of data as a single block. XXTEA is actually flexible in this regard, but it’s still not ideal. As a result, the data is broken up into equal-sized blocks. Once again, the issue of randomness comes into play. While we could simply use a new IV for each block, that would require transmitting considerably more data (1 IV per block). A clever approach known as cipher-block-chaining avoids this. For the IV for the second block, we simply use the ciphertext of the first block. This ensures randomness and avoids transmitting anything extra beyond an IV for the first block.