I’ve always been predominantly a C programmer, mostly out of habit. However, recent projects using Perl have me thinking about the tradeoffs involved. The big issue seems to be time: both the programmer’s and the computer’s.
Compared to languages like Java or Perl, C generally offers greater flexibility and performance. On the flip side, it makes programming tasks more complex and debugging more difficult. C demands more time from the programmer in order to minimize time spent running on the computer. Higher-level languages, on the other hand, require less from the programmer at the expense of longer run time.
Just how big is the difference? Depending on the task, it seems that it can vary quite a bit. To satisfy my curiosity about that difference, I recently wrote a small text analysis program in both C and Perl. The basic idea was to count characters and words in a text file, and output a sorted report of the frequency of the various characters (and words). Nothing terribly fancy.
I was a bit surprised by the results. The Perl program was a svelte 40 lines. I dashed it off in less than 10 minutes, and aside from one silly typo, it needed only negligible debugging. The C program, by contrast, required more than 400 lines of code, and the data structures in particular required substantial time to debug.
On the flip side, the C program proved nearly 10x faster than the Perl one, which was interesting, since I had assumed that built-in data types and string processing were Perl’s strong suits.
I guess arguments can be made either way, but for my purposes, I think I’ll be using a lot more Perl in the future, so long as speed is not of the utmost concern.
Below is a sample of the Perl code.
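#!/usr/bin/perl
use strict;
use warnings;

die("Usage: $0 infile outfile\n") if @ARGV != 2;

# Three-argument open with lexical filehandles; include the OS error on failure.
open(my $in, '<', $ARGV[0]) or die("Can't open $ARGV[0] for reading: $!\n");
open(my $out, '>', $ARGV[1]) or die("Can't open $ARGV[1] for writing: $!\n");

# Hash references mapping each lowercased character or word to its count.
my ($wordFreqs, $charFreqs) = ({}, {});

while (<$in>)
{
    # Every character on the line, including whitespace and the newline.
    my @chars = split //;

    # Strip leading non-word characters so split doesn't yield an empty first field.
    s/^[^a-zA-Z']+//;
    my @words = split /[^a-zA-Z']+/;

    foreach (@words) { $wordFreqs->{lc($_)}++; }
    foreach (@chars) { $charFreqs->{lc($_)}++; }
}
close($in);

# Report characters, most frequent first.
print $out "Characters\n\n";
foreach (sort { $charFreqs->{$b} <=> $charFreqs->{$a} } keys %$charFreqs)
{
    printf $out "%-5s %10s %5d\n", $_, " ", $charFreqs->{$_};
}

# Report words, most frequent first.
print $out "\n\n\nWords\n\n";
foreach (sort { $wordFreqs->{$b} <=> $wordFreqs->{$a} } keys %$wordFreqs)
{
    printf $out "%-20s %10s %5d\n", $_, " ", $wordFreqs->{$_};
}
close($out);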
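
For comparison, here is a rough sketch of what just the character-counting half might look like in C. This is not the original 400-line program. The names (char_count, count_chars, sorted_counts, print_char_report) are made up for illustration, and it assumes single-byte characters. The word-counting half is the harder part in C, since the standard library has no hash table and you have to build one yourself.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_CHARS 256

/* One row of the report: a character and how often it occurred. */
struct char_count {
    unsigned char ch;
    unsigned long count;
};

/* Tally each byte of text into counts, folding letters to lowercase
 * (the Perl version does this with lc()). The caller zeroes counts
 * once up front, so this can be called once per input line. */
void count_chars(const char *text, unsigned long counts[NUM_CHARS])
{
    assert(text != NULL);
    for (const unsigned char *p = (const unsigned char *)text; *p != '\0'; p++)
        counts[tolower(*p)]++;
}

/* qsort comparator: highest count first, ties broken by character code. */
int by_count_desc(const void *a, const void *b)
{
    const struct char_count *x = a;
    const struct char_count *y = b;
    if (x->count != y->count)
        return (x->count < y->count) ? 1 : -1;
    return (int)x->ch - (int)y->ch;
}

/* Copy the characters that actually occurred into rows, sorted by
 * frequency. Returns the number of rows filled in. */
size_t sorted_counts(const unsigned long counts[NUM_CHARS],
                     struct char_count rows[NUM_CHARS])
{
    size_t n = 0;
    for (int c = 0; c < NUM_CHARS; c++) {
        if (counts[c] > 0) {
            rows[n].ch = (unsigned char)c;
            rows[n].count = counts[c];
            n++;
        }
    }
    qsort(rows, n, sizeof rows[0], by_count_desc);
    return n;
}

/* Print the rows in the same layout as the Perl report. */
void print_char_report(FILE *out, const struct char_count *rows, size_t n)
{
    fprintf(out, "Characters\n\n");
    for (size_t i = 0; i < n; i++)
        fprintf(out, "%-5c %10s %5lu\n", rows[i].ch, " ", rows[i].count);
}
```

To use it, a main function would zero an array of NUM_CHARS counters, read the input file line by line with fgets, pass each line to count_chars, and then call sorted_counts and print_char_report. Even this easy half takes more code than the entire Perl loop, which gives some sense of where the extra lines in a C version come from.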