

Hash Function Speed


 
# 1  
Old 10-22-2009

I have created my own hash table class, but am looking to speed it up. My current hash function is:

Code:
 int HashTable::hashFunc(const string &key) const
        {
		int tableSize = theLists.size();
            int hashVal = 0;
			for(int i = 0; i<key.length();  i++)
			hashVal = 37*hashVal+key[i];
			hashVal %= tableSize;
			if(hashVal<0)
			hashVal += tableSize;
			return hashVal;
        }

I am looking for an alternative function that can hash a string and will probably yield similar results.
# 2  
Old 10-22-2009
Your hash function is, um, unusual. Your for loop does nothing. Assuming that is what you really wanted, this is what your function is actually doing:
Code:
int HashTable::hashFunc(const string &key) const
{
  int tableSize = theLists.size();
  int hashVal = 0;
      			
  hashVal = 37*hashVal+key[key.length()];
  return hashVal % tableSize;
}

which is considerably faster, since it does not iterate i over the length of the string.
# 3  
Old 10-22-2009
Quote:
Originally Posted by jim mcnamara
Your hash function is, um, unusual. Your for loop does nothing.
Are you sure of that? The indenting looks almost random, but if you ignore it, the loop looks like it should iterate over the line below it, modifying the value of hashVal further for every character in the key string.
# 4  
Old 10-22-2009
It is missing { }, IMO.

---------- Post updated at 09:09 ---------- Previous update was at 09:07 ----------

Code:
int HashTable::hashFunc(const string &key) const
{
    int tableSize = theLists.size();
    int hashVal = 0;
    for(int i = 0; i<key.length();  i++) {
        hashVal = 37*hashVal+key[i];
        hashVal %= tableSize;
        if(hashVal<0)
            hashVal += tableSize;
    }

    return hashVal;
}

# 5  
Old 10-22-2009
Quote:
Originally Posted by jim mcnamara
It is missing { }, IMO.
Without { }, the loop will operate on the following statement instead of the following code block. If the loop had a semicolon on the end, then it would be truly pointless. Try the following code:

Code:
int n;
for(n=0; n<10; n++)
  printf("loop A %d\n", n);

for(n=0; n<10; n++);
  printf("loop B %d\n", n);

It looks to me like the intent was to work only on the following line (why reduce the value to the table size on every iteration instead of just once?), so the braces were left out. Better indenting would have shown the intent. For maximum clarity, they could have put braces around the single line.
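
For reference, here is a minimal sketch of what that intent might look like with braces and consistent indenting. It is not the OP's class: hashString is a hypothetical free function, tableSize is passed in instead of being read from theLists.size(), and the accumulator is changed to unsigned so that overflow wraps around instead of going negative. For long keys that last change can give different bucket numbers than the int version.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical free-function version of the hash under discussion.
// Assumption: the caller supplies the table size.
std::size_t hashString(const std::string &key, std::size_t tableSize)
{
    assert(tableSize > 0);      // modulo by zero is undefined

    unsigned int hashVal = 0;   // unsigned arithmetic wraps on overflow
    for (std::size_t i = 0; i < key.length(); ++i) {
        // Go through unsigned char so bytes above 127 are not sign-extended.
        hashVal = 37 * hashVal + static_cast<unsigned char>(key[i]);
    }

    return hashVal % tableSize; // reduce once, after the loop
}
```

Because hashVal can no longer be negative, the if(hashVal<0) fix-up is not needed in this version.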

---------- Post updated at 09:34 AM ---------- Previous update was at 09:10 AM ----------

As for improving the hash function, there's not a lot to it as is. Slowdowns may be coming from other things. How full are your hash tables, how many collisions are you getting?
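
One rough way to answer those two questions is to run your keys through the hash and count what lands in each bucket. The sketch below assumes separate chaining and uses made-up names (BucketStats, hash37, measureBuckets); it only counts, it does not touch a real table.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical summary of how a set of keys spreads over the buckets.
struct BucketStats {
    double loadFactor;         // number of keys / number of buckets
    std::size_t collisions;    // keys that hit an already occupied bucket
    std::size_t longestChain;  // most keys found in any single bucket
};

// The multiply-by-37 hash from this thread, with an unsigned accumulator.
static std::size_t hash37(const std::string &key, std::size_t tableSize)
{
    unsigned int h = 0;
    for (unsigned char c : key)
        h = 37 * h + c;
    return h % tableSize;
}

BucketStats measureBuckets(const std::vector<std::string> &keys,
                           std::size_t tableSize)
{
    assert(tableSize > 0);

    std::vector<std::size_t> counts(tableSize, 0);
    BucketStats stats{0.0, 0, 0};

    for (const std::string &key : keys) {
        std::size_t &n = counts[hash37(key, tableSize)];
        if (n > 0)
            ++stats.collisions;        // bucket was already in use
        ++n;
        if (n > stats.longestChain)
            stats.longestChain = n;
    }

    stats.loadFactor = static_cast<double>(keys.size()) / tableSize;
    return stats;
}
```

A load factor well above 1 or a long longestChain would point at the table size or the key distribution rather than at the cost of the hash loop itself.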

Last edited by Corona688; 10-22-2009 at 12:16 PM..
# 6  
Old 10-22-2009
As to hash improvement in general - test avalanche/distribution for your data sets on these:

Code:
// XOR-Bernstein hash

unsigned xor_b_hash ( const void *key, 
                      const int len,
                      const unsigned tablesize )
{
  const unsigned char *p = (const unsigned char *)key;
  unsigned hval = 0;
  int i=0;

  for ( i = 0; i < len; i++, p++ )
    hval = 33 * hval ^ *p;

  return hval % tablesize;
}

// Fowler/Noll/Vo (FNV-1) hash

unsigned fnv_hash ( const void *key, 
                    const int len,
                    const unsigned tablesize )
{
  const unsigned char *p = (const unsigned char *)key;
  unsigned hval = 2166136261U;
  int i=0;

  for ( i = 0; i < len; i++, p++ )
    hval = ( hval * 16777619 ) ^ *p;

  return hval % tablesize;
}

OP's additive hash fails to treat permutations, i.e., “xyz”, “zyx”, and “xzy” all result in the same hash value.

And if the original hash is "slow", then so will these be. Did you try instrumenting your code or using a profiler before you decided the hash algorithm was the bottleneck?
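
If no profiler is handy, a crude timing harness is one way to instrument just the hash. This is only a sketch: HashFn, Timing, mul37 and timeHash are names invented here, and the checksum exists solely so the compiler cannot throw the loop away. Numbers from something this simple are noisy, so treat them as a rough guide.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <vector>

// Hypothetical signature for the hash being timed (no table-size reduction).
using HashFn = unsigned (*)(const std::string &);

// The multiply-by-37 loop from this thread, unsigned accumulator.
unsigned mul37(const std::string &key)
{
    unsigned h = 0;
    for (unsigned char c : key)
        h = 37 * h + c;
    return h;
}

struct Timing {
    double seconds;               // wall-clock time spent hashing
    unsigned long long checksum;  // sum of all hash values
};

// Hash every key `rounds` times and report how long it took.
Timing timeHash(HashFn fn, const std::vector<std::string> &keys, int rounds)
{
    assert(fn != nullptr);

    unsigned long long checksum = 0;
    auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < rounds; ++r)
        for (const std::string &key : keys)
            checksum += fn(key);
    auto stop = std::chrono::steady_clock::now();

    return Timing{std::chrono::duration<double>(stop - start).count(),
                  checksum};
}
```

Comparing this figure with the time for a full insert or lookup on the same keys gives a first idea of how much of the total the hash function actually accounts for.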
# 7  
Old 10-22-2009
Quote:
Originally Posted by jim mcnamara
OP's additive hash fails to treat permutations, i.e., “xyz”, “zyx”, and “xzy” all result in the same hash value.
Once again, sorry, but look closer: It's not an additive hash. Order alters the value because it multiplies hashVal by 37 each iteration before adding the next character. With a table size of 2048, it gives me:
Code:
hash("xyz") == 943
hash("zyx") == 1631
hash("xzy") == 979
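
To make the difference concrete, here is a small sketch that puts a truly additive hash next to the multiply-by-37 one. Both functions are illustrative stand-ins with made-up names, not the OP's member function.

```cpp
#include <cassert>
#include <string>

// Purely additive: just sums the character values, so order is lost.
unsigned additiveHash(const std::string &key, unsigned tableSize)
{
    assert(tableSize > 0);
    unsigned h = 0;
    for (unsigned char c : key)
        h += c;
    return h % tableSize;
}

// Multiply-by-37: earlier characters get scaled by higher powers of 37,
// so the position of each character affects the result.
unsigned mul37Hash(const std::string &key, unsigned tableSize)
{
    assert(tableSize > 0);
    unsigned h = 0;
    for (unsigned char c : key)
        h = 37 * h + c;
    return h % tableSize;
}
```

With a table size of 2048, additiveHash gives 363 for all three of "xyz", "zyx" and "xzy" (120 + 121 + 122), while mul37Hash gives the three different values listed above.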

