Monday, September 28, 2009

Hashtable

Download

This is a simple C++ hashtable implementation that I am using in the Puppeteer Runtime. It is doing coalesced hashing, which is a combination of open addressing and separate chaining.

Hashtable is using "const char*" strings as for the keys. I didn't implement a general keys to keep the code simple. It is not a big deal to change it to using a template parameter for the keys, but the code becomes harder to read. Anyway, the Puppeteer always uses the strings, so keeping it simple makes more sense to me.

Hashtable does not copy the keys, so if you are going to use it, make sure their lifetime covers the lifetime of the hashtable. The reason is obvious - performance. In fact, the performance is the highest priority for this hashtable. This is the main reason why I decided to make my own implementation. Most of the existing hashtables are making a copies of the keys, and this produces a lot of unnecessary memory allocations.

Hashtable is also supplied with 11 different hash functions. Here is the code that finds the best hash function for a given set of strings:

using namespace puppeteer;

template <HashFunction Hash>
int TestHashtable(const char** aStrings, unsigned aStringCount)
{
Hashtable<int, Hash> hashtable(aStringCount);
for (unsigned i = 0; i < aStringCount; ++i)
{
hashtable.Add(aStrings[i], i);
}
return hashtable.GetCollisionCount();
}

int main()
{
const char* strings[] =
{
"aaa",
"bbb",
"ccc",
"1",
"2",
"3",
"sgfdsgfsd",
"vcxvx",
"dc dcd ",
" ",
"4456"
};

printf("AdditiveHash: %i collisions\n", TestHashtable<AdditiveHash>(strings, sizeof(strings) / sizeof(*strings)));
printf("XorHash: %i collisions\n", TestHashtable<XorHash>(strings, sizeof(strings) / sizeof(*strings)));
printf("RotatingHash: %i collisions\n", TestHashtable<RotatingHash>(strings, sizeof(strings) / sizeof(*strings)));
printf("ShiftAddXorHash: %i collisions\n", TestHashtable<ShiftAddXorHash>(strings, sizeof(strings) / sizeof(*strings)));
printf("OneAtATimeHash: %i collisions\n", TestHashtable<OneAtATimeHash>(strings, sizeof(strings) / sizeof(*strings)));
printf("BernsteinHash: %i collisions\n", TestHashtable<BernsteinHash>(strings, sizeof(strings) / sizeof(*strings)));
printf("XorBernsteinHash: %i collisions\n", TestHashtable<XorBernsteinHash>(strings, sizeof(strings) / sizeof(*strings)));
printf("SdbmHash: %i collisions\n", TestHashtable<SdbmHash>(strings, sizeof(strings) / sizeof(*strings)));
printf("FnvHash: %i collisions\n", TestHashtable<FnvHash>(strings, sizeof(strings) / sizeof(*strings)));
printf("ElfHash: %i collisions\n", TestHashtable<ElfHash>(strings, sizeof(strings) / sizeof(*strings)));
printf("JenkinsHash: %i collisions\n", TestHashtable<JenkinsHash>(strings, sizeof(strings) / sizeof(*strings)));

return 0;
}

No comments:

Post a Comment