Software Design
 
Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support | 
User Name:
Password:
Remember me
Go Back   Dev Shed ForumsProgramming Languages - MoreSoftware Design

Reply
Add This Thread To:
  Del.icio.us   Digg   Google   Spurl   Blink   Furl   Simpy   Y! MyWeb 
Thread Tools Search this Thread Rate Thread Display Modes
 
Unread Dev Shed Forums Sponsor:
  #1  
Old January 21st, 2002, 08:34 PM
Sepodati's Avatar
Sepodati Sepodati is offline
Banned
Dev Shed God 19th Plane (14000 - 14499 posts)
 
Join Date: Dec 1999
Location: Afghanistan
Posts: 14,385 Sepodati User rank is General 12nd Grade (Above 100000 Reputation Level)Sepodati User rank is General 12nd Grade (Above 100000 Reputation Level)Sepodati User rank is General 12nd Grade (Above 100000 Reputation Level)Sepodati User rank is General 12nd Grade (Above 100000 Reputation Level)Sepodati User rank is General 12nd Grade (Above 100000 Reputation Level)Sepodati User rank is General 12nd Grade (Above 100000 Reputation Level)Sepodati User rank is General 12nd Grade (Above 100000 Reputation Level)Sepodati User rank is General 12nd Grade (Above 100000 Reputation Level)Sepodati User rank is General 12nd Grade (Above 100000 Reputation Level)Sepodati User rank is General 12nd Grade (Above 100000 Reputation Level)Sepodati User rank is General 12nd Grade (Above 100000 Reputation Level)Sepodati User rank is General 12nd Grade (Above 100000 Reputation Level)Sepodati User rank is General 12nd Grade (Above 100000 Reputation Level)Sepodati User rank is General 12nd Grade (Above 100000 Reputation Level)Sepodati User rank is General 12nd Grade (Above 100000 Reputation Level)Sepodati User rank is General 12nd Grade (Above 100000 Reputation Level)  Folding Points: 72870 Folding Title: Intermediate FolderFolding Points: 72870 Folding Title: Intermediate FolderFolding Points: 72870 Folding Title: Intermediate FolderFolding Points: 72870 Folding Title: Intermediate Folder
Time spent in forums: 2 Months 4 Weeks 20 h 34 m 57 sec
Reputation Power: 1784
Send a message via ICQ to Sepodati Send a message via Yahoo to Sepodati
Best method for a parent/child type relationship

Just curious what everything thinks is the best way to organize parent/child relationships.

Let's say we have something like this:
Code:
  101
  |  102
  |  |   105
  |  103
  |  104
  |      106
  |      107
  |          108
  |              109
  201
     202
     203
     204
         206
         207
             208

There can be an unlimited number of children and parents, but it's all organized as a tree.

The most popular method is probably to have two fields, parent and id. id would be the current record and parent would be the record right above it.

Since I'm mainly interested in a database solution, I'll discuss it in those terms. The problem with this layout is the gathering of all the records down to X children, where the user could set how far down to go. Once you get past the first two layers, there are a lot of additional queries to execute based on the previous queries to get the rest of the children.

Hopefully I have explained this well enough, but what I'm basically looking for is the best way to store and retrieve records organized like as a tree like this. What's the best way to pull the records out to X children and display it?

Thanks for any help.

---John Holmes...

Reply With Quote
  #2  
Old January 21st, 2002, 09:36 PM
dcaillouet's Avatar
dcaillouet dcaillouet is offline
Big Endian
Dev Shed Beginner (1000 - 1499 posts)
 
Join Date: May 2001
Location: Fly-over country
Posts: 1,173 dcaillouet User rank is Sergeant (500 - 2000 Reputation Level)dcaillouet User rank is Sergeant (500 - 2000 Reputation Level)dcaillouet User rank is Sergeant (500 - 2000 Reputation Level)dcaillouet User rank is Sergeant (500 - 2000 Reputation Level)dcaillouet User rank is Sergeant (500 - 2000 Reputation Level) 
Time spent in forums: 16 h 29 m 5 sec
Reputation Power: 24
I encountered a situation where I had to represent a hierachy (an org chart) in a database and it was tricky. Every "solution" had a down side. Fortunately for me, I was able to come up with a client-side solution because I was dealing with so few employees (several hundred). My solution, would have stunk for thousands of records (it involved loading the records into a client-side tree and creating a temporary table that related all sub-nodes to a top node in a branch of the tree).

Last week, the lady who works in the cube next to me brought in a book call the Guru's Guide to Transact-SQL. The author devotes the 12th chapter to different methods for storing hierachies in a database. The next time you're in a bookstore you might want to take a look at it and see if it gives you any ideas. Because the book was written for SQL Server, some of his solutions use server-side cursors which would not be viable for all databases.
http://www.amazon.com/exec/obidos/A...1159580-6675255

Reply With Quote
  #3  
Old January 22nd, 2002, 03:26 AM
Piper Piper is offline
To understand recursion, you have to understand recursion first!
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Jul 2001
Location: Germany
Posts: 174 Piper User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 8
Hi

I suppose it is impossible to create ONE query which can drill down to the last level of children.
This is IMHO a question of the unpredictable count/depth of parent/child levels in the tree.

So I suppose one has to deal with some kind of loop-mechanism (i.e. cursors, which, of course, are not implemented in mysql ;( , YET).

Well, that wasnt very helpful, though, but I wanted to contribute to a topic which interests me personally...
Comments on this post
JimmyGosling agrees!
__________________
---BP-------------------------------------------
"Life is what happens, while you're makin' other plans..."

Reply With Quote
  #4  
Old January 22nd, 2002, 04:26 AM
Thrasher Thrasher is offline
Canta como rafaé
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Feb 2001
Location: Barcelona
Posts: 74 Thrasher User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 8
Send a message via ICQ to Thrasher
n-ary trees

I had the same problem some time ago and I found a solution for me, but it might not be userful for you.

This solution involves limiting the number of children per each node (so getting an n-ary tree). According on your applicattion you must set the 'n' to whichever number you want. This is some sort of heap.

Instead of storing the pair id/parent_id, you only need an id. The type of the id (integer, long integer) also depends on the 'n' and the depth of the tree.

This id is not taken from any sequence. The id assigned to a node is as follows.

Code:
if node is root
then id = 1

else if node is not root and parent = p
and node is child number c (c = 0 ... n-1)
then id = (n * (p - 1)) + 2 + c


With this you are implementing a full n-ary tree. The good thing is that you are able to know the children of a node with ease.

But a better thing is to know mathematically which is the parent node of a given one.

Code:
(parent of node i) = ceil ((i - 1) / n)


This way, you only have to compute the list of nodes up to the root, and put them into a single database query

Code:
(i is the node we start at, n is the n-ary degree)
list = {}
while (i <> root) do
  insert_list (i);
  parent = ceil ((i - 1) / n)
  i = parent
end;

insert_list (i);    // do not forget to insert the root node


Now, you have in list the list of nodes of the path. And another pretty good thing is that, as the nodes are numbered, and due to the implementation, doing

Code:
select from db where id in list order by id desc


would ensure you the path from the root to the node, and

Code:
select from db where id in list order by id asc


would give you from node to root.


Hope it helped. Tell me for more info.

NOTE: Edited to correct a mistake. Used floor instead of ceil
__________________
Thrasher



'Y se ahogaron los dooos
No eran duros pa pagar, cuñaaoo !!'
El vagamundo - El risitas y su cuñao

Last edited by Thrasher : January 22nd, 2002 at 11:49 AM.

Reply With Quote
  #5  
Old January 22nd, 2002, 06:13 AM
andnaess andnaess is offline
Contributing User
Dev Shed Intermediate (1500 - 1999 posts)
 
Join Date: Jul 2001
Location: Oslo
Posts: 1,516 andnaess User rank is Private First Class (20 - 50 Reputation Level)andnaess User rank is Private First Class (20 - 50 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 9
Well, I just wrote a small class for handling menus based on categories of arbitrary depth. In the database I simply store a parent_id in each category, then I fetch all categories and build the tree using a very efficient algorithm in PHP. Of course, the downside is that you have to fetch all the entries, so it only works well for smaller structures (maybe a few thousand records). The algorithm that builds the tree is however very fast, and if there are 10 out of 200 categories that will actually display (for example the 10 categories under the root), then the algorithm will only need 10 operations, so it follows the number of categories to show, which normally is fairly small.

The advantage of this is that once the tree has been built, I can store my internal representation (an array), and then simply rebuild the tree from this array every time the user accesses the tree in some new way. Typically the structure changes very rarely compared to the number of times it's fetched, so if you can store it in memory it will be *very* fast, even for large structures (albeit memory consuming).
__________________
--
Regards
André Næss

Puritanism: The haunting fear that someone, somewhere may be having fun

Reply With Quote
  #6  
Old January 22nd, 2002, 06:49 AM
Thrasher Thrasher is offline
Canta como rafaé
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Feb 2001
Location: Barcelona
Posts: 74 Thrasher User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 8
Send a message via ICQ to Thrasher
andnaess:

How is that efficient algorithm in PHP done? I couldn't find a single one really efficient ...

Reply With Quote
  #7  
Old January 22nd, 2002, 07:39 AM
andnaess andnaess is offline
Contributing User
Dev Shed Intermediate (1500 - 1999 posts)
 
Join Date: Jul 2001
Location: Oslo
Posts: 1,516 andnaess User rank is Private First Class (20 - 50 Reputation Level)andnaess User rank is Private First Class (20 - 50 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 9
Well, like so many other efficient algorithms, it's speed is largely due to the underlying datastructure. What I do is keep a 2 dimensional array of id values (category id's). An entry might look like this:
$arr[1] = Array(1,3,5,7)

This entry means that the category with id 1 has children 3,5 and 7. Notice how the id is the key of the first dimension, and that it's also the first element in the list of children (the latter is an optimization trick.). The reason why the algorithm is fast is that it can jump from category to category directly using the array keys. In most cases a category tree (or menu) is kept rather small (otherwise people just get lost), for example, you normally just show the 1st. level, like this:

Code:
Main dishes
Side dishes
Poultry
Meat
Pasta

Then the user clicks on "Poultry" and you expand that category to get:
Code:
Main dishes
Side dishes
Poultry
  Chinese
  Thai
  Tex-mex
Meat
Pasta

My algorithm is written so that it will never try to make any recursive calls for categories that are not in this first level or in the expanded part, so I use "complete pruning" and thus the number of "iterations" is kept small even though the complete tree if fully expanded is huge (the number of "iterations" is actually equal to the number of categories that will be viewed).

But again, I must stress that this algorithm is the last in a series of three operations; first there's the complete fetch from the db, then this result is turned into the array representation, then finally, the algorithm is used to create a "view" of the tree. It is this latter part which changes all the time in the applications I need this for, and thus this approach made sense. So far I've only used it with very small structures, so I haven't worried to store the intermediate array physically (the whole process of building and displaying the tree takes 0,05 seconds or something, so there's really no point... ) But if I need to use this for something bigger, that would be an acceptable solution. Typically you could have the category manager rewrite the array representation every time changes were made to the categories, and store it serialized (in shared memory if you're *really* into performance.

Whether this approach is the right one in this case I don't know, but it has the advantage of keeping the db structure very simple, the number of db accesses to a minimum, and the code short. (The algorithm itself is something like 15 lines, and building the intermediate array is a while loop of some 20 lines).

Reply With Quote
  #8  
Old January 22nd, 2002, 08:26 AM
Onslaught's Avatar
Onslaught Onslaught is offline
/(bb|[^b]{2})/
Dev Shed Demi-God (4500 - 4999 posts)
 
Join Date: Nov 2001
Location: Somewhere in the great unknown
Posts: 4,840 Onslaught User rank is Second Lieutenant (5000 - 10000 Reputation Level)Onslaught User rank is Second Lieutenant (5000 - 10000 Reputation Level)Onslaught User rank is Second Lieutenant (5000 - 10000 Reputation Level)Onslaught User rank is Second Lieutenant (5000 - 10000 Reputation Level)Onslaught User rank is Second Lieutenant (5000 - 10000 Reputation Level)Onslaught User rank is Second Lieutenant (5000 - 10000 Reputation Level)Onslaught User rank is Second Lieutenant (5000 - 10000 Reputation Level) 
Time spent in forums: 2 Days 36 m 16 sec
Reputation Power: 88
Send a message via ICQ to Onslaught
Ok, here is my shot at this.
How about, instead of having a parent/child structure, have a root/parent/child structure.
Take for instance:
Code:
root
  +-parent level
          +-child level I
          +-child level I
                 +-child level II
                 +-child level II
          +-child level I

would translate to:
Code:
1
  +-1.1
          +-1.1.1
          +-1.1.2
                 +-1.1.2.1
                 +-1.1.2.2
          +-1.1.3
So, if you wanted to get everything that was from root you could go:
select * from tree_table where root = '1'
If you wanted everything from the first parent level:
select * from tree_table where root = '1' and parent = '1'
If you wanted just the children from level 2:
select * from tree_table where root = '1' and parent = '1' and child like '2.%'
Pretty much what I am saying is keep the fields in a step structure. They could be all in one field like I listed above, they could all be in on field as the unique id and the calls could be like:
select * from tree_table where id = '1' or like '1.%' <-- for all root 1
select * from tree_table where id like '1.1%' <-- for all of parent level
select * from tree_table where id like '1.1.2%' <-- for all of child level 2

Reply With Quote
  #9  
Old January 22nd, 2002, 09:58 AM
dcaillouet's Avatar
dcaillouet dcaillouet is offline
Big Endian
Dev Shed Beginner (1000 - 1499 posts)
 
Join Date: May 2001
Location: Fly-over country
Posts: 1,173 dcaillouet User rank is Sergeant (500 - 2000 Reputation Level)dcaillouet User rank is Sergeant (500 - 2000 Reputation Level)dcaillouet User rank is Sergeant (500 - 2000 Reputation Level)dcaillouet User rank is Sergeant (500 - 2000 Reputation Level)dcaillouet User rank is Sergeant (500 - 2000 Reputation Level) 
Time spent in forums: 16 h 29 m 5 sec
Reputation Power: 24
I used a solution similar to Onslaught's in a program I worked on once. However, when I was trying to do the employee org chart program I mentioned in my previous post, this type of solution didn't work because the relationship of the hierachy changed a lot. People would move to different management / supervisor positions causing the relationship between the nodes to change. Occasionally a new level of management would be stuck in the middle of the tree causing everyone below it to move down a level.

If your data is fairly static, Onslaught's solution might have a lot of potential. If like mine it was very fluid, managing changes in the level indicators can be a problem.

Reply With Quote
  #10  
Old January 22nd, 2002, 10:43 AM
andnaess andnaess is offline
Contributing User
Dev Shed Intermediate (1500 - 1999 posts)
 
Join Date: Jul 2001
Location: Oslo
Posts: 1,516 andnaess User rank is Private First Class (20 - 50 Reputation Level)andnaess User rank is Private First Class (20 - 50 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 9
It seems like a very good idea, but when I started playing around with it (on paper) I got into problems. For example, given this data:
Code:
category
id       name             path        depth
===========================================
1    |   Processors     | 1         | 1
2    |   Motherboards   | 2         | 1
3    |   Socket A       | 2.3       | 2
4    |   Intel          | 1.4       | 2
5    |   AMD            | 1.5       | 2
6    |   Celeron        | 1.4.6     | 3
7    |   Pentium 4      | 1.4.7     | 3
8    |   Mobile         | 1.4.6.8   | 4
9    |   OEM            | 1.4.6.9   | 4
10   |   Thunderbird    | 1.5.10    | 3
11   |   Mobile         | 1.5.10.11 | 4


I run into problems when trying to compose queries, that is, if the user clicks Processors->Intel->Celeron->Mobile
I can't see how to query this data to make sure that what I get back is the data for this tree (i.e. the tree expanded to suit the users clicks.
Code:
*Processors
  *Intel
    *Celeron
      *Mobile
      *OEM
    *Pentium 4
  *AMD
*Motherboards


It seems like an interesting approach, but I find querying hard...

Reply With Quote
  #11  
Old January 22nd, 2002, 11:11 AM
Onslaught's Avatar
Onslaught Onslaught is offline
/(bb|[^b]{2})/
Dev Shed Demi-God (4500 - 4999 posts)
 
Join Date: Nov 2001
Location: Somewhere in the great unknown
Posts: 4,840 Onslaught User rank is Second Lieutenant (5000 - 10000 Reputation Level)Onslaught User rank is Second Lieutenant (5000 - 10000 Reputation Level)Onslaught User rank is Second Lieutenant (5000 - 10000 Reputation Level)Onslaught User rank is Second Lieutenant (5000 - 10000 Reputation Level)Onslaught User rank is Second Lieutenant (5000 - 10000 Reputation Level)Onslaught User rank is Second Lieutenant (5000 - 10000 Reputation Level)Onslaught User rank is Second Lieutenant (5000 - 10000 Reputation Level) 
Time spent in forums: 2 Days 36 m 16 sec
Reputation Power: 88
Send a message via ICQ to Onslaught
I actually see the table structure a little different, if it makes any different. More like this:

Code:
id       name             path        depth
===========================================
1    |   Processors     | 1         | 1
2    |   Motherboards   | 2         | 1
3    |   Socket A       | 2.1       | 2
4    |   Intel          | 1.1       | 2
5    |   AMD            | 1.2       | 2
6    |   Celeron        | 1.1.1     | 3
7    |   Pentium 4      | 1.1.2     | 3
8    |   Mobile         | 1.1.1.1   | 4
9    |   OEM            | 1.1.1.2   | 4
10   |   Thunderbird    | 1.2.1    | 3
11   |   Mobile         | 1.2.1.1 | 4


so, to expand this level of the tree
Code:
*Processors
  *Intel
    *Celeron
      *Mobile
      *OEM
    *Pentium 4
  *AMD
*Motherboards


is would be something like:
1) use expands processors:
get the path for processors and expand it:
select path from tree_table where name = 'Processors'
This would return '1'
get all first level children for this path:
select name from tree_table where path like '1.%' and strlen(path) = 3
This would return:
Intel & AMD
(I say only get the first level children for memory and speed)
show updated tree
2) user expands Celeron
get the path for Celeron and expand it:
select path from tree_table where name = 'Celeron'
This would return '1.1.1'
get all first level children for this path:
select name from tree_table where path like '1.1.1.%' and strlen(path) = 7
This would return:
Mobile, OEM
show updated tree

Now I realize that the sql statement with the length checking would limit to a small tree, 9 elements a peice (unless you use 0 also). in the case of trees larger than this, it would be easier to pull all the children in and sort up to the level you need with the scripting language.

As far as moving children around, the only tie it has with the parent would be the path. If you wanted to move OEM(1.1.1.2) from being a child of Celeron (1.1.1) to a child of Pentium 4 (1.1.2) you would just change the path on OEM to 1.1.2.1 and it would be the first child of Pentium 4.

But, this is all theory, I plan on actually building this concept into a formal class structure, but I am too tapped for time to be able to tackel this project as of yet.

Reply With Quote
  #12  
Old January 22nd, 2002, 11:21 AM
andnaess andnaess is offline
Contributing User
Dev Shed Intermediate (1500 - 1999 posts)
 
Join Date: Jul 2001
Location: Oslo
Posts: 1,516 andnaess User rank is Private First Class (20 - 50 Reputation Level)andnaess User rank is Private First Class (20 - 50 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 9
Actually I was expecting to do this in a single query, typically the user will click "Celeron" and the id of the "Celeron" will be passed as a variable (look at the data, name is not unique).

The real problem is getting the siblings of nodes on the expanded path without expanding the siblings (this was something I struggled with for quite a while in my class), to do it here you will need lots of statements OR-ed together, and as you go deeper it gets worse, I fear. Doing this in script is what I do, but then there is no need for storing the entire path, parent_id is enough. (And the algorithm will be O(N) anyway)

What I was hoping with this sort of structure was that it would enable me to extract the complete tree as outlined, without resorting to any scripting. If I end up fetching all the nodes and doing the rest in PHP then I already have a good solution...

It was looking so promising Anyone else?

Reply With Quote
  #13  
Old January 22nd, 2002, 11:27 AM
Onslaught's Avatar
Onslaught Onslaught is offline
/(bb|[^b]{2})/
Dev Shed Demi-God (4500 - 4999 posts)
 
Join Date: Nov 2001
Location: Somewhere in the great unknown
Posts: 4,840 Onslaught User rank is Second Lieutenant (5000 - 10000 Reputation Level)Onslaught User rank is Second Lieutenant (5000 - 10000 Reputation Level)