The UNIX and Linux Forums  
Hello and Welcome from United States to the UNIX and Linux Forums! Thank You for Visiting and Joining Our Global Community.

Go Back   The UNIX and Linux Forums > Top Forums > High Level Programming
.
google unix.com



High Level Programming Post questions about C, C++, Java, SQL, and other programming languages here.

More UNIX and Linux Forum Topics You Might Find Helpful
Thread Thread Starter Forum Replies Last Post
Print Entire hash list (hash of hashes) Alalush Shell Programming and Scripting 1 08-06-2008 08:40 AM
Awk Hash Function. dinjo_jo Shell Programming and Scripting 2 07-16-2008 06:30 PM
Locality Sensitive Hash Function Based on Concomitant Rand Order Statistics iBot UNIX and Linux RSS News 0 07-07-2008 05:10 AM
Passing global variable to a function which is called by another function sars Shell Programming and Scripting 4 06-30-2008 11:39 AM
dmidecode, RAM speed = "Current Speed: Unknown" Santi Filesystems, Disks and Memory 0 02-16-2006 06:16 AM

Reply
English Japanese Spanish French German Portuguese Italian Dutch Swedish Russian Norwegian Hungarian Hebrew Danish Powered by Powered by Google
 
LinkBack Thread Tools Search this Thread Rate Thread Display Modes
  #1 (permalink)  
Old 4 Weeks Ago
killerqb killerqb is offline
Registered User
  
 

Join Date: Oct 2009
Posts: 6
Hash Function Speed

I have created my own hash table class, but am looking to speed it up. My current hash function is:

Code:
 int HashTable::hashFunc(const string &key) const
        {
		int tableSize = theLists.size();
            int hashVal = 0;
			for(int i = 0; i<key.length();  i++)
			hashVal = 37*hashVal+key[i];
			hashVal %= tableSize;
			if(hashVal<0)
			hashVal += tableSize;
			return hashVal;
        }
I am looking for an alternative function that can hash a string. That will probably yield similar results.
  #2 (permalink)  
Old 4 Weeks Ago
jim mcnamara jim mcnamara is offline Forum Staff  
...@...
  
 

Join Date: Feb 2004
Location: NM
Posts: 5,717
Your hash function is, um, unusual. Your for loop does nothing. Assuming that is what you really wanted, this is what you function is actually doing:
Code:
int HashTable::hashFunc(const string &key) const
{
  int tableSize = theLists.size();
  int hashVal = 0;
      			
  hashVal = 37*hashVal+key[key.length()];
  return hashVal % tableSize;
}
which is consierably faster - not iterating i over the length of a string.
  #3 (permalink)  
Old 4 Weeks Ago
Corona688 Corona688 is offline
Registered User
  
 

Join Date: Aug 2005
Location: Saskatchewan
Posts: 1,929
Quote:
Originally Posted by jim mcnamara View Post
Your hash function is, um, unusual. Your for loop does nothing.
Are you sure of that? The indenting looks almost random, but if you ignore it, the loop looks like it should iterate over the line below it, modifying the value of hashVal further for every character in the key string.
  #4 (permalink)  
Old 4 Weeks Ago
jim mcnamara jim mcnamara is offline Forum Staff  
...@...
  
 

Join Date: Feb 2004
Location: NM
Posts: 5,717
It is missing { }, IMO.

---------- Post updated at 09:09 ---------- Previous update was at 09:07 ----------

Code:
int HashTable::hashFunc(const string &key) const
        {
		int tableSize = theLists.size();
            int hashVal = 0;
			for(int i = 0; i<key.length();  i++) {
			hashVal = 37*hashVal+key[i];
			hashVal %= tableSize;
			if(hashVal<0)
			hashVal += tableSize;
                                      }

			return hashVal;
        }
  #5 (permalink)  
Old 4 Weeks Ago
Corona688 Corona688 is offline
Registered User
  
 

Join Date: Aug 2005
Location: Saskatchewan
Posts: 1,929
Quote:
Originally Posted by jim mcnamara View Post
It is missing { }, IMO.
Without { }, the loop will operate on the following statement instead of the following code block. If the loop had a semicolon on the end, then it would be truly pointless. Try the following code:

Code:
int n;
for(n=0; n<10; n++)
  printf("loop A %d\n", n);

for(n=0; n<10; n++);
  printf("loop B %d\n", n);
It looks to me like the intent was to only work on the following line -- why strip the value to table size every loop instead of just once -- so code blocks were left out. Better indenting would have showed the intent. For maximum clarity they could've surrounded the single line.

---------- Post updated at 09:34 AM ---------- Previous update was at 09:10 AM ----------

As for improving the hash function, there's not a lot to it as is. Slowdowns may be coming from other things. How full are your hash tables, how many collisions are you getting?

Last edited by Corona688; 4 Weeks Ago at 11:16 AM..
  #6 (permalink)  
Old 4 Weeks Ago
jim mcnamara jim mcnamara is offline Forum Staff  
...@...
  
 

Join Date: Feb 2004
Location: NM
Posts: 5,717
As to hash improvement in general - test avalanche/distribution for your data sets on these:

Code:
// XOR-Bernstein hash

unsigned xor_b_hash ( const void *key, 
                      const int len,
                      const unsigned tablesize )
{
  const unsigned char *p = (const unsigned char *)key;
  unsigned hval = 0;
  int i=0;

  for ( i = 0; i < len; i++, p++ )
    hval = 33 * hval ^ *p;

  return hval % tablesize;
}

// fowler/nol/vo hash

unsigned fnv_hash ( const void *key, 
                    const int len,
                    const unsigned tablesize )
{
  const unsigned char *p = (const unsigned char *)key;
  unsigned hval = 2166136261U;
  int i=0;

  for ( i = 0; i < len; i++, p++ )
    hval = ( hval * 16777619 ) ^ *p;

  return hval % tablesize;
}
OP's additive hash fails to treat permutations, i.e., “xyz”, “zyx”, and “xzy” all result in the same hash value.

And if the original hash is "slow", then so will these be. Did you try instrumtenting your code, or using a profiler? ...before you decided the hash algorithm was the bottleneck.
  #7 (permalink)  
Old 4 Weeks Ago
Corona688 Corona688 is offline
Registered User
  
 

Join Date: Aug 2005
Location: Saskatchewan
Posts: 1,929
Quote:
Originally Posted by jim mcnamara View Post
OP's additive hash fails to treat permutations, i.e., “xyz”, “zyx”, and “xzy” all result in the same hash value.
Once again, sorry, but look closer: It's not an additive hash. Order alters the value because it multiplies hashval by 37 each iteration before adding the next character. With a table size of 2048, it gives me:
Code:
hash("xyz") == 943
hash("zyx") == 1631
hash("xzy") == 979
Reply

Bookmarks

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes Rate This Thread
Rate This Thread:

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On




All times are GMT -4. The time now is 09:00 PM.


Powered by: vBulletin, Copyright ©2000 - 2006, Jelsoft Enterprises Limited. Language Translations Powered by .
vBCredits v1.4 Copyright ©2007 - 2008, PixelFX Studios
The UNIX and Linux Forums Content Copyright ©1993-2009. All Rights Reserved.Ad Management by RedTyger

Content Relevant URLs by vBSEO 3.2.0