# Comparison

We have seen several ways to construct dictionaries: hash tables, unbalanced binary search trees, red-black trees, and skip lists. There are several factors that influence the choice of an algorithm:

• Sorted output. If sorted output is required, then hash tables are not a viable alternative. Entries are stored in the table based on their hashed value, with no other ordering. For binary trees, the story is different. An in-order tree walk will produce a sorted list. For example:

```void WalkTree(Node *P) {
if (P == NIL) return;
WalkTree(P->Left);

/* examine P->Data here */

WalkTree(P->Right);
}
WalkTree(Root);
```

To examine skip list nodes in order, simply chain through the level-0 pointers. For example:

```Node *P = List.Hdr->Forward;
while (P != NIL) {

/* examine P->Data here */

P = P->Forward;
}
```

• Space. The amount of memory required to store a value should be minimized. This is especially true if many small nodes are to be allocated.

• For hash tables, only one forward pointer per node is required. In addition, the hash table itself must be allocated.

• For red-black trees, each node has a left, right, and parent pointer. In addition, the color of each node must be recorded. Although this requires only one bit, more space may be allocated to ensure that the size of the structure is properly aligned. Therefore each node in a red-black tree requires enough space for 3-4 pointers.

• For skip lists, each node has a level-0 forward pointer. The probability of having a level-1 pointer is ½. The probability of having a level-2 pointer is ¼. In general, the number of forward pointers per node is

n= 1 + ½ + ¼ + ... = 2.

• Time. The algorithm should be efficient. This is especially true if a large dataset is expected. Table 3-2 compares the search time for each algorithm. Note that worst-case behavior for hash tables and skip lists is extremely unlikely. Actual timing tests are described below.

• Simplicity. If the algorithm is short and easy to understand, fewer mistakes may be made. This not only makes your life easy, but the maintenance programmer entrusted with the task of making repairs will appreciate any efforts you make in this area. The number of statements required for each algorithm is listed in Table 3-2.
Table 3-2: Comparison of Dictionaries
method statements average time worst-case time
hash table 26 O(1) O(n)
unbalanced tree 41 O(lgn) O(n)
red-black tree 120 O(lgn) O(lgn)
skip list 55 O(lgn) O(n)

Average time for insert, search, and delete operations on a database of 65,536 (216) randomly input items may be found in Table 3-3. For this test the hash table size was 10,009 and 16 index levels were allowed for the skip list. Although there is some variation in the timings for the four methods, they are close enough so that other considerations should come into play when selecting an algorithm.

Table 3-3: Average Time (µs), 65536 Items, Random Input
method insert search delete
hash table 18 8 10
unbalanced tree 37 17 26
red-black tree 40 16 37
skip list 48 31 35

Table 3-4 shows the average search time for two sets of data: a random set, where all values are unique, and an ordered set, where values are in ascending order. Ordered input creates a worst-case scenario for unbalanced tree algorithms, as the tree ends up being a simple linked list. The times shown are for a single search operation. If we were to search for all items in a database of 65,536 values, a red-black tree algorithm would take .6 seconds, while an unbalanced tree algorithm would take 1 hour.

Table 3-4: Average Search Time (µs)
random
input
count hash table unbalanced tree red-black tree skip list
16 4 3 2 5
256 3 4 4 9
4,096 3 7 6 12
65,536 8 17 16 31
ordered
input
16 3 4 2 4
256 3 47 4 7
4,096 3 1,033 6 11
65,536 7 55,019 9 15