#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2012
    Posts
    38
    Rep Power
    2

    Arrow Hash Table Direct Chaining


    I am attempting to create a hash table for words from text file with direct chaining (linked list) for collisions.This is my first time doing hash tables, so forgive me for the wealth of bad code.
    (The hash function is just temporary, I want a working hash table before I create a better function.)

    To give you an outline of my thought process:
    The program opens the file that the user specifies, reads it word by word until EOF.
    For each word, it creates a hashtable entry.
    If an entry for the key already exists, check if the words match; if they do, update 'count', and if they don't, move onto the next node (repeat).
    If you find a NULL pointer for the key, create a node in that place.

    I get a segmentation fault for the following code:
    Code:
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    
    /* Create the node structure */
    typedef struct node {
    	char word[30]; 
    	struct node * next; 
    	int count; 
    } node;
    
    #define HASH_SIZE 211
    node *hashTable[HASH_SIZE];
    
    int hash ( char * string ) {
    	int i, key;
    	int sum = 0;
    	for( i=0; i<strlen(string); i++ ) {
    		sum = sum + string[i];
    	}
    	key = sum % HASH_SIZE;
    	return key;
    }
    
    void insertword ( char * string ) {
    	// convert string to lower case 
    	char lower[30];
    	int i;
    	for( i=0; i<=strlen(string); i++ ) {
    		lower[i] = tolower(string[i]);
    	}
    	// retrieve key from hash function
    	int key = hash( lower );
    	// if there is an existing pointer with that key in the table
    	if( hashTable[key] != NULL ) {
    		while( hashTable[key]->next != NULL ) {
    			// if the words match, update count
    			if( strcmp( hashTable[key]->word, lower ) == 0 ) {
    				hashTable[key]->count = hashTable[key]->count + 1;
    			//otherwise, move to the next pointer
    			} else {
    				hashTable[key] = hashTable[key]->next;
    			}
    		}
    		// when hashTable[key]->next == NULL, do the same test
    		if( strcmp( hashTable[key]->word, lower ) != 0 ) {
    			// exact word isn't in the table
    			node * new = calloc(1, sizeof(hashTable[key]));
    			new->next = NULL;
    			strcpy(new->word, lower);
    			new->count = 1;			
    			hashTable[key]->next = new;
    		}
    	} else {
    		node * new = calloc(1, sizeof(hashTable[key]));
    		new->next = NULL;
    		strcpy(new->word, lower);
    		new->count = 1;	
    	}
    }
    
    void makehashtable ( FILE *file ) {
    	char string[30];
    	int i;
    	printf("Creating hash table...\n");
    	// copy words from text file into 'string'.
    	while( fscanf(file, "%s", string) != EOF ) {
    		insertword( string );
    	}
    	printf("Hash table created.\n");
    }	
    
    int main ( int argcount, char *argvalue[] )
    {	
    	if ( argcount != 2 ) {
    		printf("Please specify a file to open\n");
    		return 0;
    	} else {
    		//assume second argument is filename and open it
    		FILE *file = fopen( argvalue[1], "r" );
    
    		// fopen returns NULL is file can't be found
            	if ( file == NULL ) {
    			printf( "Could not open file\n" );
    		} else {
    			makehashtable( file );
    			fclose( file );
    		}
    	}
    	printf("hashTable[1]: %s", hashTable[1]->word);	
    	return 0;
    }
  2. #2
  3. Contributed User
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Jun 2005
    Posts
    4,396
    Rep Power
    1871
    > node * new = calloc(1, sizeof(hashTable[key]));
    These should be
    node * new = calloc(1, sizeof(*hashTable[key]));

    Or better yet,
    node * new = calloc(1, sizeof(*node));

    You're not allocating enough memory.

    Also, don't rely on calloc to correctly initialise the members of your struct.
    It's fine for integer data, but pointers and floats are not restricted to 'all-bits-zero' meaning a NULL pointer or 0.0.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2012
    Posts
    38
    Rep Power
    2
    Ok thanks, but now none of the words are being entered into the hash table.

    I imagined it was because I didn't add this line:

    Code:
    hashTable[key] = new;
    in the 'else' section, but that seems to just get paused at 'Creating hash table...'
  6. #4
  7. Contributed User
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Jun 2005
    Posts
    4,396
    Rep Power
    1871
    > hashTable[key] = hashTable[key]->next;
    Perhaps you should use a temporary pointer to traverse your list, rather than trashing the root pointer of each list.

    Seriously, start using a debugger and single-stepping the code to follow what it actually does.

    Then create a really simple word list, containing say
    hello
    hello
    world

    where you can reasonably predict the paths through the code in advance.

    > for( i=0; i<strlen(string); i++ )
    You realise you're calling strlen() on every iteration of the loop?
    This saves work here, and in the caller.
    Code:
    	for( i=0; i<string[i] != '\0'; i++ ) {
    		sum = sum + tolower(string[i]);
    	}
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper

IMN logo majestic logo threadwatch logo seochat tools logo