Page 1 of 2 12 Last
  • Jump to page:
    #1
  1. Banned (not really)
    Devshed Supreme Being (6500+ posts)

    Join Date
    Dec 1999
    Location
    Brussels, Belgium
    Posts
    14,646
    Rep Power
    4492

    Best method for a parent/child type relationship


    Just curious what everything thinks is the best way to organize parent/child relationships.

    Let's say we have something like this:
    Code:
      101
      |  102
      |  |   105
      |  103
      |  104
      |      106
      |      107
      |          108
      |              109
      201
         202
         203
         204
             206
             207
                 208
    There can be an unlimited number of children and parents, but it's all organized as a tree.

    The most popular method is probably to have two fields, parent and id. id would be the current record and parent would be the record right above it.

    Since I'm mainly interested in a database solution, I'll discuss it in those terms. The problem with this layout is the gathering of all the records down to X children, where the user could set how far down to go. Once you get past the first two layers, there are a lot of additional queries to execute based on the previous queries to get the rest of the children.

    Hopefully I have explained this well enough, but what I'm basically looking for is the best way to store and retrieve records organized like as a tree like this. What's the best way to pull the records out to X children and display it?

    Thanks for any help.

    ---John Holmes...
  2. #2
  3. Big Endian
    Devshed Beginner (1000 - 1499 posts)

    Join Date
    May 2001
    Location
    Fly-over country
    Posts
    1,172
    Rep Power
    30
    I encountered a situation where I had to represent a hierachy (an org chart) in a database and it was tricky. Every "solution" had a down side. Fortunately for me, I was able to come up with a client-side solution because I was dealing with so few employees (several hundred). My solution, would have stunk for thousands of records (it involved loading the records into a client-side tree and creating a temporary table that related all sub-nodes to a top node in a branch of the tree).

    Last week, the lady who works in the cube next to me brought in a book call the Guru's Guide to Transact-SQL. The author devotes the 12th chapter to different methods for storing hierachies in a database. The next time you're in a bookstore you might want to take a look at it and see if it gives you any ideas. Because the book was written for SQL Server, some of his solutions use server-side cursors which would not be viable for all databases.
    http://www.amazon.com/exec/obidos/AS...159580-6675255
  4. #3
  5. No Profile Picture
    To understand recursion, you have to understand recursion first!
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2001
    Location
    Germany
    Posts
    175
    Rep Power
    14
    Hi

    I suppose it is impossible to create ONE query which can drill down to the last level of children.
    This is IMHO a question of the unpredictable count/depth of parent/child levels in the tree.

    So I suppose one has to deal with some kind of loop-mechanism (i.e. cursors, which, of course, are not implemented in mysql ;( , YET).

    Well, that wasnt very helpful, though, but I wanted to contribute to a topic which interests me personally...

    Comments on this post

    • JimmyGosling agrees
    ---BP-------------------------------------------
    "Life is what happens, while you're makin' other plans..."
  6. #4
  7. No Profile Picture
    Canta como rafaé
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2001
    Location
    Barcelona
    Posts
    74
    Rep Power
    14

    n-ary trees


    I had the same problem some time ago and I found a solution for me, but it might not be userful for you.

    This solution involves limiting the number of children per each node (so getting an n-ary tree). According on your applicattion you must set the 'n' to whichever number you want. This is some sort of heap.

    Instead of storing the pair id/parent_id, you only need an id. The type of the id (integer, long integer) also depends on the 'n' and the depth of the tree.

    This id is not taken from any sequence. The id assigned to a node is as follows.

    Code:
    if node is root
    then id = 1
    
    else if node is not root and parent = p
    and node is child number c (c = 0 ... n-1)
    then id = (n * (p - 1)) + 2 + c
    With this you are implementing a full n-ary tree. The good thing is that you are able to know the children of a node with ease.

    But a better thing is to know mathematically which is the parent node of a given one.

    Code:
    (parent of node i) = ceil ((i - 1) / n)
    This way, you only have to compute the list of nodes up to the root, and put them into a single database query

    Code:
    (i is the node we start at, n is the n-ary degree)
    list = {}
    while (i <> root) do
      insert_list (i);
      parent = ceil ((i - 1) / n)
      i = parent
    end;
    
    insert_list (i);    // do not forget to insert the root node
    Now, you have in list the list of nodes of the path. And another pretty good thing is that, as the nodes are numbered, and due to the implementation, doing

    Code:
    select from db where id in list order by id desc
    would ensure you the path from the root to the node, and

    Code:
    select from db where id in list order by id asc
    would give you from node to root.


    Hope it helped. Tell me for more info.

    NOTE: Edited to correct a mistake. Used floor instead of ceil
    Last edited by Thrasher; January 22nd, 2002 at 11:49 AM.
    Thrasher



    'Y se ahogaron los dooos
    No eran duros pa pagar, cuñaaoo !!'
    El vagamundo - El risitas y su cuñao
  8. #5
  9. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Jul 2001
    Location
    Oslo
    Posts
    1,516
    Rep Power
    15
    Well, I just wrote a small class for handling menus based on categories of arbitrary depth. In the database I simply store a parent_id in each category, then I fetch all categories and build the tree using a very efficient algorithm in PHP. Of course, the downside is that you have to fetch all the entries, so it only works well for smaller structures (maybe a few thousand records). The algorithm that builds the tree is however very fast, and if there are 10 out of 200 categories that will actually display (for example the 10 categories under the root), then the algorithm will only need 10 operations, so it follows the number of categories to show, which normally is fairly small.

    The advantage of this is that once the tree has been built, I can store my internal representation (an array), and then simply rebuild the tree from this array every time the user accesses the tree in some new way. Typically the structure changes very rarely compared to the number of times it's fetched, so if you can store it in memory it will be *very* fast, even for large structures (albeit memory consuming).
    --
    Regards
    André Næss

    Puritanism: The haunting fear that someone, somewhere may be having fun
  10. #6
  11. No Profile Picture
    Canta como rafaé
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2001
    Location
    Barcelona
    Posts
    74
    Rep Power
    14
    andnaess:

    How is that efficient algorithm in PHP done? I couldn't find a single one really efficient ...
    Thrasher



    'Y se ahogaron los dooos
    No eran duros pa pagar, cuñaaoo !!'
    El vagamundo - El risitas y su cuñao
  12. #7
  13. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Jul 2001
    Location
    Oslo
    Posts
    1,516
    Rep Power
    15
    Well, like so many other efficient algorithms, it's speed is largely due to the underlying datastructure. What I do is keep a 2 dimensional array of id values (category id's). An entry might look like this:
    $arr[1] = Array(1,3,5,7)

    This entry means that the category with id 1 has children 3,5 and 7. Notice how the id is the key of the first dimension, and that it's also the first element in the list of children (the latter is an optimization trick.). The reason why the algorithm is fast is that it can jump from category to category directly using the array keys. In most cases a category tree (or menu) is kept rather small (otherwise people just get lost), for example, you normally just show the 1st. level, like this:

    Code:
    Main dishes
    Side dishes
    Poultry
    Meat
    Pasta
    Then the user clicks on "Poultry" and you expand that category to get:
    Code:
    Main dishes
    Side dishes
    Poultry
      Chinese
      Thai
      Tex-mex
    Meat
    Pasta
    My algorithm is written so that it will never try to make any recursive calls for categories that are not in this first level or in the expanded part, so I use "complete pruning" and thus the number of "iterations" is kept small even though the complete tree if fully expanded is huge (the number of "iterations" is actually equal to the number of categories that will be viewed).

    But again, I must stress that this algorithm is the last in a series of three operations; first there's the complete fetch from the db, then this result is turned into the array representation, then finally, the algorithm is used to create a "view" of the tree. It is this latter part which changes all the time in the applications I need this for, and thus this approach made sense. So far I've only used it with very small structures, so I haven't worried to store the intermediate array physically (the whole process of building and displaying the tree takes 0,05 seconds or something, so there's really no point... ) But if I need to use this for something bigger, that would be an acceptable solution. Typically you could have the category manager rewrite the array representation every time changes were made to the categories, and store it serialized (in shared memory if you're *really* into performance.

    Whether this approach is the right one in this case I don't know, but it has the advantage of keeping the db structure very simple, the number of db accesses to a minimum, and the code short. (The algorithm itself is something like 15 lines, and building the intermediate array is a while loop of some 20 lines).
    --
    Regards
    André Næss

    Puritanism: The haunting fear that someone, somewhere may be having fun
  14. #8
  15. /(bb|[^b]{2})/

    Join Date
    Nov 2001
    Location
    Somewhere in the great unknown
    Posts
    5,163
    Rep Power
    792
    Ok, here is my shot at this.
    How about, instead of having a parent/child structure, have a root/parent/child structure.
    Take for instance:
    Code:
    root
      +-parent level
              +-child level I
              +-child level I
                     +-child level II
                     +-child level II
              +-child level I
    would translate to:
    Code:
    1
      +-1.1
              +-1.1.1
              +-1.1.2
                     +-1.1.2.1
                     +-1.1.2.2
              +-1.1.3
    So, if you wanted to get everything that was from root you could go:
    select * from tree_table where root = '1'
    If you wanted everything from the first parent level:
    select * from tree_table where root = '1' and parent = '1'
    If you wanted just the children from level 2:
    select * from tree_table where root = '1' and parent = '1' and child like '2.%'
    Pretty much what I am saying is keep the fields in a step structure. They could be all in one field like I listed above, they could all be in on field as the unique id and the calls could be like:
    select * from tree_table where id = '1' or like '1.%' <-- for all root 1
    select * from tree_table where id like '1.1%' <-- for all of parent level
    select * from tree_table where id like '1.1.2%' <-- for all of child level 2
  16. #9
  17. Big Endian
    Devshed Beginner (1000 - 1499 posts)

    Join Date
    May 2001
    Location
    Fly-over country
    Posts
    1,172
    Rep Power
    30
    I used a solution similar to Onslaught's in a program I worked on once. However, when I was trying to do the employee org chart program I mentioned in my previous post, this type of solution didn't work because the relationship of the hierachy changed a lot. People would move to different management / supervisor positions causing the relationship between the nodes to change. Occasionally a new level of management would be stuck in the middle of the tree causing everyone below it to move down a level.

    If your data is fairly static, Onslaught's solution might have a lot of potential. If like mine it was very fluid, managing changes in the level indicators can be a problem.
  18. #10
  19. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Jul 2001
    Location
    Oslo
    Posts
    1,516
    Rep Power
    15
    It seems like a very good idea, but when I started playing around with it (on paper) I got into problems. For example, given this data:
    Code:
    category
    id       name             path        depth
    ===========================================
    1    |   Processors     | 1         | 1
    2    |   Motherboards   | 2         | 1
    3    |   Socket A       | 2.3       | 2
    4    |   Intel          | 1.4       | 2
    5    |   AMD            | 1.5       | 2
    6    |   Celeron        | 1.4.6     | 3
    7    |   Pentium 4      | 1.4.7     | 3
    8    |   Mobile         | 1.4.6.8   | 4
    9    |   OEM            | 1.4.6.9   | 4
    10   |   Thunderbird    | 1.5.10    | 3
    11   |   Mobile         | 1.5.10.11 | 4
    I run into problems when trying to compose queries, that is, if the user clicks Processors->Intel->Celeron->Mobile
    I can't see how to query this data to make sure that what I get back is the data for this tree (i.e. the tree expanded to suit the users clicks.
    Code:
    *Processors
      *Intel
        *Celeron
          *Mobile
          *OEM
        *Pentium 4
      *AMD
    *Motherboards
    It seems like an interesting approach, but I find querying hard...
    --
    Regards
    André Næss

    Puritanism: The haunting fear that someone, somewhere may be having fun
  20. #11
  21. /(bb|[^b]{2})/

    Join Date
    Nov 2001
    Location
    Somewhere in the great unknown
    Posts
    5,163
    Rep Power
    792
    I actually see the table structure a little different, if it makes any different. More like this:

    Code:
    id       name             path        depth
    ===========================================
    1    |   Processors     | 1         | 1
    2    |   Motherboards   | 2         | 1
    3    |   Socket A       | 2.1       | 2
    4    |   Intel          | 1.1       | 2
    5    |   AMD            | 1.2       | 2
    6    |   Celeron        | 1.1.1     | 3
    7    |   Pentium 4      | 1.1.2     | 3
    8    |   Mobile         | 1.1.1.1   | 4
    9    |   OEM            | 1.1.1.2   | 4
    10   |   Thunderbird    | 1.2.1    | 3
    11   |   Mobile         | 1.2.1.1 | 4
    so, to expand this level of the tree
    Code:
    *Processors
      *Intel
        *Celeron
          *Mobile
          *OEM
        *Pentium 4
      *AMD
    *Motherboards
    is would be something like:
    1) use expands processors:
    get the path for processors and expand it:
    select path from tree_table where name = 'Processors'
    This would return '1'
    get all first level children for this path:
    select name from tree_table where path like '1.%' and strlen(path) = 3
    This would return:
    Intel & AMD
    (I say only get the first level children for memory and speed)
    show updated tree
    2) user expands Celeron
    get the path for Celeron and expand it:
    select path from tree_table where name = 'Celeron'
    This would return '1.1.1'
    get all first level children for this path:
    select name from tree_table where path like '1.1.1.%' and strlen(path) = 7
    This would return:
    Mobile, OEM
    show updated tree

    Now I realize that the sql statement with the length checking would limit to a small tree, 9 elements a peice (unless you use 0 also). in the case of trees larger than this, it would be easier to pull all the children in and sort up to the level you need with the scripting language.

    As far as moving children around, the only tie it has with the parent would be the path. If you wanted to move OEM(1.1.1.2) from being a child of Celeron (1.1.1) to a child of Pentium 4 (1.1.2) you would just change the path on OEM to 1.1.2.1 and it would be the first child of Pentium 4.

    But, this is all theory, I plan on actually building this concept into a formal class structure, but I am too tapped for time to be able to tackel this project as of yet.
  22. #12
  23. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Jul 2001
    Location
    Oslo
    Posts
    1,516
    Rep Power
    15
    Actually I was expecting to do this in a single query, typically the user will click "Celeron" and the id of the "Celeron" will be passed as a variable (look at the data, name is not unique).

    The real problem is getting the siblings of nodes on the expanded path without expanding the siblings (this was something I struggled with for quite a while in my class), to do it here you will need lots of statements OR-ed together, and as you go deeper it gets worse, I fear. Doing this in script is what I do, but then there is no need for storing the entire path, parent_id is enough. (And the algorithm will be O(N) anyway)

    What I was hoping with this sort of structure was that it would enable me to extract the complete tree as outlined, without resorting to any scripting. If I end up fetching all the nodes and doing the rest in PHP then I already have a good solution...

    It was looking so promising Anyone else?
    --
    Regards
    André Næss

    Puritanism: The haunting fear that someone, somewhere may be having fun
  24. #13
  25. /(bb|[^b]{2})/

    Join Date
    Nov 2001
    Location
    Somewhere in the great unknown
    Posts
    5,163
    Rep Power
    792
    Well, was worth a shot.
  26. #14
  27. Big Endian
    Devshed Beginner (1000 - 1499 posts)

    Join Date
    May 2001
    Location
    Fly-over country
    Posts
    1,172
    Rep Power
    30
    andnaess, wouldn't the following query get you the information you needed for a node click? Or am I missing something?

    SELECT *
    FROM category
    WHERE path LIKE path%
    AND depth = depth + 1;

    typically the user will click "Celeron" and the id of the "Celeron" will be passed as a variable
    I wouldn't pass the id, I would pass the path (which is also unique).
    Last edited by dcaillouet; January 22nd, 2002 at 01:25 PM.
  28. #15
  29. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Jul 2001
    Location
    Oslo
    Posts
    1,516
    Rep Power
    15
    Say the user clicked Celeron, the query would then be:
    SELECT name FROM category WHERE depth = 4 AND path like '1.1%'
    This would return "Mobile" and "OEM". That approach could work if you could rely on persistence (i.e. the current tree could be stored between each query), but in a web-context I can't do that, and I need a query which returns the entire tree. When the user clicks "Celeron" I want to get these categories
    Code:
    *Processors
      *Intel
        *Celeron
          *Mobile
          *OEM
        *Pentium 4
      *AMD
    *Motherboards
    Compared to the full tree which is:
    Code:
    *Processors
      *Intel
        *Celeron
          *Mobile
          *OEM
        *Pentium 4
      *AMD
        *Thunderbird
          *Mobile
    *Motherboards
      *Socket A
    So the clue here is that given a single node, it must be possible to build the entire tree, with the path to the given node expanded. This is what I've done using my PHP code, and for a while I thought Onslaughts idea would be able to greatly simplify that code at the expense of some extra bookkeeping.

    There's an example here:
    http://andre.switch.no/test/treetest2.php
    --
    Regards
    André Næss

    Puritanism: The haunting fear that someone, somewhere may be having fun
Page 1 of 2 12 Last
  • Jump to page:

IMN logo majestic logo threadwatch logo seochat tools logo