The Shed is going Social! Join us on FaceBook and Twitter and chime in on the conversation.
|
 |
|
Dev Shed Forums
> Programming Languages
> C Programming
|
on-disc binary tree?
Discuss on-disc binary tree? in the C Programming forum on Dev Shed. on-disc binary tree? C programming forum discussing all C derivatives, including C#, C++, Object-C, and even plain old vanilla C. These languages are low level languages, and used on projects such as device drivers, compilers, and even whole computer operating systems.
|
|
 |
|
|
|
|

Dev Shed Forums Sponsor:
|
|
|

February 18th, 2003, 12:42 PM
|
|
matthewdoucette.com
|
|
Join Date: May 2002
Posts: 635

Time spent in forums: 10 h 8 m 16 sec
Reputation Power: 11
|
|
|
on-disc binary tree?
(in regards to my previous thread on storing millions of IPs to disc:
http://forums.devshed.com/showthrea...&threadid=50109)
Can I create a binary tree that is stored on disc? Then only modify the section(s) of the file to modify the binary tree as each node is added? I always thought you had to read in the entire file and re-write the entire file to disk whenver a change is made, but this is not the case, is it?
__________________
Matthew Doucette / Xona.com
|

February 19th, 2003, 03:07 PM
|
|
Contributing User
|
|
Join Date: Oct 2000
Location: Back in the real world.
|
|
IMHO
you would probably want to convert all pointers in your structs to relative pointers (= "-5 records from here" or "+2 records from here") and write it to disk like that.
on reading the tree you would back-convert.
but for sure, there is no way to insert data in the middle of a file, so it would need a re-organization then-and-when ("defrag"  ) or some care on loading the file.
never tried this though.... but for my recent project, 3D worlds, i'll need binary trees too in the near future. This will be my first approach until i get my a** up on reading about "BSP trees" that from my knowledge right now do take care of exactly this problem...
... just 2ect for my thoughts on this ...
|

February 20th, 2003, 08:37 AM
|
|
matthewdoucette.com
|
|
Join Date: May 2002
Posts: 635

Time spent in forums: 10 h 8 m 16 sec
Reputation Power: 11
|
|
Quote: Originally posted by M.Hirsch
IMHOthere is no way to insert data in the middle of a file, so it would need a re-organization then-and-when ("defrag" ) or some care on loading the file. | I am only storing IPs, so there will only be nodes added and nodes modified (increment counters on each IP), no nodes deleted. This removes the need for defrag. This tree will also be completely reset each day, even though it may grow to be 1 or 2 million nodes in size.
|

February 20th, 2003, 01:24 PM
|
 |
Banned ;)
|
|
Join Date: Nov 2001
Location: Woodland Hills, Los Angeles County, California, USA
|
|
|
You could also allocate a file that is large enough to hold all the possible entries. Then all you need to do is compute the offset for an entry and increment the counter. This may take up a lot more space (32 MB), but if you have plenty of hard disk space, this isn't such a bad plan. Besides, you don't need to implement a binary tree at all with this method. All you need to do is compute the offset for a given IP address and fseek(3) to that area of the file. It's quick, simple and gets the job done!
|

February 20th, 2003, 02:07 PM
|
|
matthewdoucette.com
|
|
Join Date: May 2002
Posts: 635

Time spent in forums: 10 h 8 m 16 sec
Reputation Power: 11
|
|
|
Good idea, but it does not allow me to store anything but an increment counter. I may be storing additional information with each IP, like the referrer perhaps.
Interesting idea though. Assuming I wanted to store only an increment, and only increment up to 255, it would take 256^4 * 1 byte / 1024 / 1024 / 1024 = 4 GB discspace, that is quite a lot.
There are probably IP ranges that do not exist however. Anyone know what they are?
|

February 20th, 2003, 02:35 PM
|
|
matthewdoucette.com
|
|
Join Date: May 2002
Posts: 635

Time spent in forums: 10 h 8 m 16 sec
Reputation Power: 11
|
|
|
Another idea:
A 256 branch, 4-Level, b-tree. The IPs would be "stored" in the location of their data. Each IP is four nodes of one byte each. The left nodes are the only nodes which store the IP's data.
Last edited by Doucette : February 20th, 2003 at 02:40 PM.
|

February 23rd, 2003, 12:04 PM
|
|
Registered User
|
|
Join Date: Feb 2003
Location: Right Coast
Posts: 25
Time spent in forums: 21 m 34 sec
Reputation Power: 0
|
|
You should be able to use fixed-length records (esp with IP addresses) to make a random access file?
J 
|

February 26th, 2003, 01:16 PM
|
|
matthewdoucette.com
|
|
Join Date: May 2002
Posts: 635

Time spent in forums: 10 h 8 m 16 sec
Reputation Power: 11
|
|
Quote: Originally posted by ahuimanu
You should be able to use fixed-length records (esp with IP addresses) to make a random access file?
J | This is exactly what I plan on doing. I just have never coded anything in any language where I do not write out an entire file. Is there anything special in coding a program that modifies only the middle of a file, for example?
|

February 26th, 2003, 01:36 PM
|
 |
Banned ;)
|
|
Join Date: Nov 2001
Location: Woodland Hills, Los Angeles County, California, USA
|
|
Nothing to it really. Just open the file in read-write mode, seek to the offset and write your data. Here's some sample code for you to peruse:
Code:
FILE *fp;
if ((fp = fopen("test.txt", "r+w")) == NULL) {
fprintf(stderr, "Could not open test.txt\n");
return(1);
}
/* Compute where we want to put data */
offset = ComputeOffset();
/* Seek to offset bytes from beginning of file */
fseek(fp, offset, SEEK_SET);
/* Write to the file */
fputs(data, fp);
fclose(fp);
The only thing you need to worry about is the size of the file. Like you said, it's going to be huge 
|
Developer Shed Advertisers and Affiliates
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Rate This Thread |
Linear Mode
|
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|