C Programming
 
Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support | 
User Name:
Password:
Remember me
Go Back   Dev Shed ForumsProgramming LanguagesC Programming

Reply
Add This Thread To:
  Del.icio.us   Digg   Google   Spurl   Blink   Furl   Simpy   Y! MyWeb 
Thread Tools Search this Thread Rate Thread Display Modes
 
Unread Dev Shed Forums Sponsor:
  #1  
Old August 12th, 2003, 11:23 PM
ramakrishna ramakrishna is offline
Junior Member
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Jul 2003
Location: india
Posts: 8 ramakrishna User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
hash table size

hi,
i have large number of elements (each element is a structure)
each element is having a key value which is an integer 1,2,3,4,5.........maximum.
The maximum may be 10000, 100000 or even higher..

i want to have a hash table for these number of items.
i want to know what will be the best or optimal hash table size for this problem?

Say if the maximum is 100000 and the hash table size is 1000 ,
then i can store the elements in linked lists using hash key index.
the hash key index is (element key value % 1000) then
the elements will be equally distributed in the hash table i.,e.,
each hash key index has equal number of nodes (here 100)

i 'm not sure about the optimality of this size.
if the size is 10000 then there will be only 10 nodes in each list.

can anyone tel me about the optimal size of the hash table?

thank you,

Reply With Quote
  #2  
Old August 13th, 2003, 09:01 AM
mitakeet's Avatar
mitakeet mitakeet is offline
Last Day: May 28, 2005
Dev Shed Demi-God (4500 - 4999 posts)
 
Join Date: Jul 2003
Location: Maryland
Posts: 4,575 mitakeet User rank is Sergeant (500 - 2000 Reputation Level)mitakeet User rank is Sergeant (500 - 2000 Reputation Level)mitakeet User rank is Sergeant (500 - 2000 Reputation Level)mitakeet User rank is Sergeant (500 - 2000 Reputation Level)mitakeet User rank is Sergeant (500 - 2000 Reputation Level) 
Time spent in forums: 1 Week 2 Days 9 h 57 m 17 sec
Reputation Power: 21
If you are using C++ look up 'map'.
__________________

Left DevShed May 28, 2005. Reason: Unresponsive administrators.
Free code: http://sol-biotech.com/code/.
Secure Programming: http://sol-biotech.com/code/SecProgFAQ.html.
Performance Programming: http://sol-biotech.com/code/PerformanceProgramming.html.

It is not that old programmers are any smarter or code better, it is just that they have made the same stupid mistake so many times that it is second nature to fix it.
--Me, I just made it up

The reasonable man adapts himself to the world; the unreasonable one persists in trying to adapt the world to himself. Therefore, all progress depends on the unreasonable man.
--George Bernard Shaw

Reply With Quote
  #3  
Old August 14th, 2003, 03:02 AM
ramakrishna ramakrishna is offline
Junior Member
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Jul 2003
Location: india
Posts: 8 ramakrishna User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
Smile hash table size

no i'm using C language only..

like i want to replace my large linked list into a hash table .
the maximum elements may be 10000, 100000 or even higher..

i will be using all the numbers 1,2,3...up to a maximum .(this number is a key value in my node)

so i need to know wat wil be the optimum hash table size ?

thank you,

Reply With Quote
  #4  
Old August 14th, 2003, 06:15 AM
mitakeet's Avatar
mitakeet mitakeet is offline
Last Day: May 28, 2005
Dev Shed Demi-God (4500 - 4999 posts)
 
Join Date: Jul 2003
Location: Maryland
Posts: 4,575 mitakeet User rank is Sergeant (500 - 2000 Reputation Level)mitakeet User rank is Sergeant (500 - 2000 Reputation Level)mitakeet User rank is Sergeant (500 - 2000 Reputation Level)mitakeet User rank is Sergeant (500 - 2000 Reputation Level)mitakeet User rank is Sergeant (500 - 2000 Reputation Level) 
Time spent in forums: 1 Week 2 Days 9 h 57 m 17 sec
Reputation Power: 21
If you are using sequential whole numbers as your key, why not simply allocate memory and then index into it? If you know that object 15 is located at array address 14 (remember that C starts at zero) then you can simply index into your array. Do something like this:

Code:
struct MyStruct ms = (MyStruct *) malloc(10000 * sizeof(MyStruct));
/* error checking deleted */

struct MyStruct *tmpStr = &ms[14];

tmpStr->element1 = "whatever";


The hash table is used for rapid access of information, you already seem to me to have the golden standard for one-step lookup.

If you insist on the hash table, then you need to do research on the underlying theory behind them, try googling on "hash", "C", and "Theory".

Reply With Quote
  #5  
Old August 14th, 2003, 07:59 AM
ramakrishna ramakrishna is offline
Junior Member
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Jul 2003
Location: india
Posts: 8 ramakrishna User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
hash table size

hi,

u r right!
i can use simple arrays of structures.

one thing here i dont know the maximum number of jobs(structures) in advance. so for this i have to allocate memory
to each job structure when a job is actually coming( this is based on random times).

and more ever i'm deleting the job which is no longer needed , by freeing that pointer.

if i use 10000 nodes in linked list or an array of 10000 structures,
what would be the optimum choice in between these?
( in memory and execution time wise to search a key job).

explain this case clearly.. as i know little about the performance issues in the code...

thank you.

Reply With Quote
  #6  
Old August 14th, 2003, 08:15 AM
mitakeet's Avatar
mitakeet mitakeet is offline
Last Day: May 28, 2005
Dev Shed Demi-God (4500 - 4999 posts)
 
Join Date: Jul 2003
Location: Maryland
Posts: 4,575 mitakeet User rank is Sergeant (500 - 2000 Reputation Level)mitakeet User rank is Sergeant (500 - 2000 Reputation Level)mitakeet User rank is Sergeant (500 - 2000 Reputation Level)mitakeet User rank is Sergeant (500 - 2000 Reputation Level)mitakeet User rank is Sergeant (500 - 2000 Reputation Level) 
Time spent in forums: 1 Week 2 Days 9 h 57 m 17 sec
Reputation Power: 21
Why do you need to dealocate the memory of each node? Simply reuse the memory or wait until the program exits. The code I gave you is for dynamic memory, just substitute a variable for the value '10000' (but be sure to check to see if the memory was allocated properly first, check your C documentation). As for memory utilization, you need to know what the size of your structure is (that is what the sizeof operator is for). Simply take that value (a simple one line program will tell you) and multiply it by the number of elements in your array. If that number is close to the amount of true RAM (do NOT count virtual memory) then you are risking paging (which will slow your program down by as much as 1000 times). However, as is likely on today's machines, you will be using just a tiny fraction of the available RAM and it won't matter.

Without knowing more about your program I can only speculate, but I would suggest that you simply reuse your existing memory for each job as it comes in. If you only have one job at a time, then why do you need any linked list or array? If you know they only come in a few at a time (how long does it take to run a job?), just allocate enough memory to handle those jobs, plus say 20% (be sure to add error checking code for the case where too many jobs do come in). Linked lists are typically poor choices for compute bound programs, you can't get good localization of memory and wind up with a poor cache hit ratio (if this doesn't mean anything to you, don't worry, if you have a need for speed you will learn it soon).

As for searching, there are lots of algorithms available but in C you either use third party libraries or roll your own. In C++ you get lots of great stuff in the STL that is highly optimized and all free. I did a roll my own once and indexed some 27 million values in memory (only 256 MB at that day, had to use lots of compression tricks) and the program ran so fast it was actually I/O bound.

Unless you want to detail your requirments to me, my suggestion is to just allocate as much memory as you think you will need and use simple array indexing to do the 'lookups'. You will find nothing faster and most of the other tricks will also use lots of memory (i.e., the 'wasted' memory you allocate that doesn't get used by your program could just as easily be 'wasted' inside the hash structures).

Reply With Quote
  #7  
Old August 14th, 2003, 08:38 AM
ramakrishna ramakrishna is offline
Junior Member
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Jul 2003
Location: india
Posts: 8 ramakrishna User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
re-using memory

hi,

i did n't get you regarding re-usage of the memory?

some details in my program:

a job (structure) comes randomly say at times t1,t2,t3..

so at time t1 , i'm allocating memory using malloc to the job and appending at the end of the job linked lsit( or wil be the first one if the list is empty).

before t2 occurs.
i am searching a key job , to know the information of the job( one of the jobs which is in the list already). i mean at time t1 there may be any number of jobs. this i can know by a varaible A(as whenever a job comes this varaiable A is incremented)

in my program i/m having lot of search calls..

Instead of using linked lists, i will try with simple dynamic arrays as u said. ( in this there is no deletion is required)

thank you,

Reply With Quote
  #8  
Old August 14th, 2003, 10:02 AM
mitakeet's Avatar
mitakeet mitakeet is offline
Last Day: May 28, 2005
Dev Shed Demi-God (4500 - 4999 posts)
 
Join Date: Jul 2003
Location: Maryland
Posts: 4,575 mitakeet User rank is Sergeant (500 - 2000 Reputation Level)mitakeet User rank is Sergeant (500 - 2000 Reputation Level)mitakeet User rank is Sergeant (500 - 2000 Reputation Level)mitakeet User rank is Sergeant (500 - 2000 Reputation Level)mitakeet User rank is Sergeant (500 - 2000 Reputation Level) 
Time spent in forums: 1 Week 2 Days 9 h 57 m 17 sec
Reputation Power: 21
Once a job is done, reuse that memory for a different incomming job. If you never have more than a few dozen jobs at any given instance, a simple scroll through an array of job IDs will be fast. Allocating and deallocating memory is an expensive process, it is best to minimize the number of times you have to do so in code that is performance oriented.

Reply With Quote
  #9  
Old August 14th, 2003, 10:16 PM
ramakrishna ramakrishna is offline
Junior Member
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Jul 2003
Location: india
Posts: 8 ramakrishna User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
dynamic arrays of pointers

hi,

i wuld like to know how to use dynamic memory allocation ,

as i said earlier the job(structure) comes randomly in the program,
so i 've to allocate memory to each one.

and i have to store them in an array of pointers?

is there any limit for the size ? as if i use dynamically array of pointers to structure? and is deletion is required?

as the array size may go upto 10000000..or even higher?

thank you,

Reply With Quote
  #10  
Old August 15th, 2003, 05:50 AM
mitakeet's Avatar
mitakeet mitakeet is offline
Last Day: May 28, 2005
Dev Shed Demi-God (4500 - 4999 posts)
 
Join Date: Jul 2003
Location: Maryland
Posts: 4,575 mitakeet User rank is Sergeant (500 - 2000 Reputation Level)mitakeet User rank is Sergeant (500 - 2000 Reputation Level)mitakeet User rank is Sergeant (500 - 2000 Reputation Level)mitakeet User rank is Sergeant (500 - 2000 Reputation Level)mitakeet User rank is Sergeant (500 - 2000 Reputation Level) 
Time spent in forums: 1 Week 2 Days 9 h 57 m 17 sec
Reputation Power: 21
This is dynamic memory allocation with error and bounds checking:

Code:

#include <stdio.h>
#include <stdlib.h>

/* you can have other structs in your structs */
struct MyStruct{
    int element1;
    char element2[10];
    float element3; 
};


int main(){
    struct MyStruct *ms, *tmpStr;
    int size = 1000; /*this would be filled by your code */
    int memsize, whichStruct = 14;

    fprintf(stderr, "sizeof(struct MyStruct): %d bytes\n", sizeof(struct MyStruct));
    memsize = size * sizeof(struct MyStruct); /* gets the total number of bytes needed */
    fprintf(stderr, "size (%d) * sizeof(struct MyStruct) = %d bytes\n\n", size, memsize);

    memsize = size * sizeof(struct MyStruct); /* gets the total number of bytes needed */
    ms = (struct MyStruct *) malloc(memsize);
    /* if you want to be sure the memory is clean (some OSs will do it for you)
       use calloc(1, memsize) which will set all values to binary zero (NULL).
       you can also clear the memory with memset */
    if (ms ==NULL){
        fprintf(stderr, "Unable to allocate %d bytes on the heap for pointer 'ms'\n", memsize);
        exit(1);
    }

    if (whichStruct < 0 || whichStruct >= size){
        fprintf(stderr, "whichStruct (%d) is out of bounds!\n", whichStruct);
        exit(1);
    }

    tmpStr = &ms[whichStruct]; /* point at a 'random' location in array */

    tmpStr->element1 = 1;
    strcpy(tmpStr->element2, "bob");
    tmpStr->element3 = 1.901f;

    printf("before:\n");
    printf("\telement1: %d\n", tmpStr->element1);
    printf("\telement2: %s\n", tmpStr->element2);
    printf("\telement3: %f\n", tmpStr->element3);

    /* this is what I mean by reusing the same memory */
    tmpStr->element1 = 2;
    strcpy(tmpStr->element2, "fred");
    tmpStr->element3 = 9.021f;

    printf("after:\n");
    printf("\telement1: %d\n", tmpStr->element1);
    printf("\telement2: %s\n", tmpStr->element2);
    printf("\telement3: %f\n", tmpStr->element3);

    free(ms); /* deallocate memory (exiting the program will also do this, but it is bad form to rely on that) */

    return 0;
}



Unless you have to preserve your job structures for some reason for the duration of the program (if it is just for auditing, write the data to a file before you are finished), you should not have to keep allocating memory. However, if you know you will not have more than, say, 10,000 jobs before the program exits, you can save yourself a lot of headache (presuming you have the memory) by just allocating memory for the lot and not reusing it. If this is a prototype program, I would not worry about memory efficiency now, get the rest of it working flawlessly and then come back and attempt to iron out the details.

Reply With Quote
Reply

Viewing: Dev Shed ForumsProgramming LanguagesC Programming > hash table size


Thread Tools  Search this Thread 
Search this Thread:

Advanced Search
Display Modes  Rate This Thread 
Rate This Thread:


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
View Your Warnings | New Posts | Latest News | Latest Threads | Shoutbox
Forum Jump



 Free IT White Papers!
 
How to Present Effectively Online
This white paper offers practical and actionable advice on the key steps that any presenter should consider as they plan and execute a Webinar or online meeting.

 
Open Source Security Myths
Open Source Software (OSS) is computer software whose source code is available to the general public with relaxed or non-existent intellectual property restrictions (or arrangement such as the public domain), and is usually developed with the input of many contributors.

 
Power and Cooling Capacity Management for Data Centers
This paper describes the principles for achieving power and cooling capacity management.

 
Scalable, Fault-Tolerant NAS for Oracle - The Next Generation
For several years NAS has been evolving as a storage alternative for Oracle databases, and for good reason: NAS is quite often the simplest, most cost-effective storage approach for Oracle. Learn about the benefits that HP's approach to scalable NAS brings to Oracle environments in this comprehensive white paper.

 
Understanding Web Application Security Challenges
This white paper discusses many common threats and preventive measures for Web application security, and explains what you can do to help protect your organization.

 

Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support | 
  
 





© 2003-2009 by Developer Shed. All rights reserved. DS Cluster 6 hosted by Hostway
Stay green...Green IT