One for the mathmos: an interesting article on hash functions by Bob Jenkins, formerly of Oracle, now at Microsoft. Jenkins looks at various hash functions in use at the time (1996, updated 2005), then proposes his own, along with a confident promise:
I offer you a new hash function for hash table lookup that is faster and more thorough than the one you are using now. I also give you a way to verify that it is more thorough.
Wikipedia’s interpretation of the function shows it to be the same function as the one Perl 5.8.8 uses for its built-in hashes. (Search for #define PERL_HASH).
Browsing Bob Jenkins’ site turns up many other gems, including this:
Little League did not produce a great baseball player, but Bob did acquire a knack for spotting four-leaf clovers.
An attitude to sport which is achingly familiar to me. My advice to Microsoft, Oracle et al: if you’re looking for the nerds of tomorrow, you could do a lot worse than start your search on the perimeters of the world’s school sports fields.
Perl allows you to do all kinds of weird things to the semantics of the language. One such weird thing is the facility to override the default die handler in perl with something of your own design. The way it works is like so:
sub some_func {
    local $SIG{__DIE__} = sub {
        my $error = shift;
        # do weird and wonderful stuff with $error here
    };
    # your code here
    die("there's nothing to live for");
}
Pretty cool, huh? Well, yes, but there are a couple of quirks to self-rolled die handlers that can cause you no end of pain.
Firstly, the keyword local should already have brought you out in a cold sweat. local gives a variable dynamic scope, which means that the temporary value stays in force until the block that contains it exits, and is seen by everything that runs in the meantime. And that can really hurt if your code block calls other functions.
Consider a simple scenario: func1 defines a local die handler, then calls func2…
sub func1 {
    local $SIG{__DIE__} = sub {
        print STDERR $_[0]; exit(1)
    };
    # do something that might die here
    func2();
}
The die handler doesn’t just affect the stuff you do in func1, it affects anything that happens in func2, too. And if func2 calls functions itself, the die handler will apply while those functions are running as well, and the functions they call, and the functions they call…
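To see the dynamic scoping in action without actually killing anything, here is a minimal sketch. It assumes nothing else in the program has installed a die handler, and func2 here is a made-up stand-in that simply reports whether a handler is in force when it runs.

```perl
use strict;
use warnings;

# Hypothetical stand-in for func2: reports whether a custom die handler
# is in force at the moment it is called.
sub func2 {
    return ref($SIG{__DIE__}) eq 'CODE' ? 'handler installed' : 'no handler';
}

sub func1 {
    local $SIG{__DIE__} = sub { print STDERR $_[0]; exit(1) };
    return func2();    # func2 runs while func1's handler is still active
}

my $via_func1 = func1();    # func2 sees the handler set by its caller
my $direct    = func2();    # local was undone when func1 returned

print "called from func1: $via_func1\n";    # handler installed
print "called directly:   $direct\n";       # no handler
```

Nothing in func2 mentions a handler, yet its behaviour on die depends entirely on who called it.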
Well that’s OK, you might think - I want my die handler to be called if anything dies. That’s why I wrote it! But that brings us to the second quirk.
You may have come across the use of eval blocks in perl as a means of emulating the try/catch blocks seen in some other languages, such as Java. Here’s a simple example:
eval {
    # do something that might die
};
if ($@) {
    # this is your catch block
    print STDERR "Something horrible happened: $@";
}
Using an eval block, then testing the value of $@ afterwards, lets you catch instances of die and handle them gracefully. Your code continues to run, unless you specifically state that it doesn’t inside the “catch” block. It’s a nifty trick, and it’s used a lot. For example, the XML::Simple CPAN module uses it to decide which XML parser to use:
eval { require XML::SAX; };      # We didn't need it until now
if($@) {                         # No XML::SAX - fall back to XML::Parser
    if($preferred_parser) {      # unless a SAX parser was expressly requested
        croak "XMLin() could not load XML::SAX";
    }
    return($self->build_tree_xml_parser($filename, $string));
}
Very cool.
Until, that is, you mix eval try/catch blocks with custom die handlers.
Why? Because custom die handlers run even if the die happened within an eval block. The perlvar perldoc calls this “an implementation glitch”. Yeah. What this means is that if you use a custom die handler in your code, and it contains a die or exit, you’ll trample all over any attempt to use eval for try/catch semantics anywhere else downstream of the block that declares the die handler! (I found this out the hard way - after an hour in the perl debugger trying to figure out why XML::Simple was crashing my code.)
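A small sketch makes the glitch visible. The handler below is deliberately harmless (it only counts its invocations, and the names are made up for illustration), so the eval still gets to do its job. Swap the counter for an exit(1) and the program would stop dead inside the eval instead.

```perl
use strict;
use warnings;

my $handler_calls = 0;    # how many times the custom handler fired

# Hypothetical function: installs a handler, then uses eval as try/catch.
sub risky_with_handler {
    local $SIG{__DIE__} = sub { $handler_calls++ };
    eval { die "caught downstream\n" };
    return $@;            # the error the eval trapped
}

my $err = risky_with_handler();

print "eval trapped: $err";                       # eval trapped: caught downstream
print "handler fired $handler_calls time(s)\n";   # handler fired 1 time(s)
```

The die was trapped by the eval, and yet the handler ran first anyway.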
Fortunately there is a workaround, albeit not a very elegant one: the variable $^S indicates whether code was called from within an eval block or not. So if you write a custom die handler, you should always check $^S, and die again if it’s set. (You can die within a die handler - the die handler doesn’t get called recursively.)
local $SIG{__DIE__} = sub {
    die @_ if $^S;
    # rest of the die handler goes here
};
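Here is the same guard dropped into a runnable sketch, so you can check that a die inside an eval sails straight past the body of the handler. The function name and the @handled array are invented for the example.

```perl
use strict;
use warnings;

my @handled;    # errors that reached the body of the custom handler

# Hypothetical function: guarded handler plus an eval used as try/catch.
sub run_guarded {
    local $SIG{__DIE__} = sub {
        die @_ if $^S;            # inside an eval: rethrow and let the eval trap it
        push @handled, $_[0];     # only reached for a die outside any eval
    };
    eval { die "recoverable\n" };
    return $@;
}

my $err = run_guarded();

print "eval trapped: $err";                               # eval trapped: recoverable
print "handler body ran ", scalar(@handled), " time(s)\n";   # handler body ran 0 time(s)
```

One caveat worth knowing: perlvar says $^S is undefined while code is still being parsed, so some people write the guard as die @_ if !defined $^S || $^S; to cover that case too.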
Better still, don’t define your own die handlers! They’re rarely needed. Use an eval block wherever possible to trap errors when they surface, rather than trying to affect the way they surface.