Chris Bentzel

Thursday, July 06, 2006

Rediscovering Knuth.

As I'm currently between jobs, I have lots more time than I used to. Mainly I've been enjoying myself outside. However, the weather changed for the worse today, so I spent some time going over some of my technical books.

I started going back over Introduction to Algorithms to refresh myself. Pretty soon, I followed a reference to generating good hash functions in Knuth's Art of Computer Progamming Volume 3. I had purchased these books, but never really read them. I had previously started reading Volume 1, but got pretty bored reading about MIX and my eyes glazed over on the seemingly endless list of summation analysis. I put it down and never looked at them again.

What a mistake that was. I started reading Chapter 6, about searching. Many of the ideas are still super relavent today. What's more, Knuth leads you through different search algorithms like he is telling a story, complete with historical basis for each of the approaches (many originating significantly before electronic computers), and full of jokes and asides which are actually funny, unlike the ones in SICP.

I've found the following steps make TAOCP a joy instead of a pain.
  • Read the main commentary and understand the algorithms.
  • Read the comments in the MIX pseudocode, but don't worry too much about remembering all the instructions.
  • Get a grasp of what the math indicates (i.e. where the tradeoffs lie), but don't get too tangled in the derivations.
  • Do a few of the exercises, at least partially.
To give an idea on how good a writer Knuth is, he manages to make reading about searching for an element in a flat, unsorted array interesting. I thought that this could be summed up in one paragraph, but he continues to find ways to twist the problem around to improve performance under different circumstances.

First, he illustrates how adding a sentinel to the end reduces comparisons compared to comparing an index counter against a max value. Next, he does a little bit of loop unrolling to further reduce increments. Finally, he shows a number of approaches where the list mutates in response to search queries, which helps reduce comparisons assuming queries are not uniform in probability. A great paper which goes further into different approaches and experimental results is at http://delivery.acm.org/10.1145/320000/314180/p53-bachrach.pdf?key1=314180&key2=3212322511&coll=GUIDE&dl=GUIDE&CFID=15151515&CFTOKEN=6184618

Wednesday, July 05, 2006

Testing, testing. Is this thing on?