#1
  1. No Profile Picture
    Junior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2002
    Posts
    2
    Rep Power
    0

    Problem ANDing many to one relationship


    DB is MS SQL Server 2000

    I have one table that represents files on a shared filesystem and another table that represents labels that are applied to the file. Here's a text representation of the simplified table layout:

    |----------|
    | files |
    |----------|
    | file_id |
    | filename |
    |----------|

    |----------|
    | labels |
    |----------|
    | label_id |
    | file_id |
    | label |
    |----------|

    Any number of labels can be applied to a file and I need to be able to find out which files have certain labels. The problem I am having is finding files that have more than one label. For instance, "show me all the files that have 'labelone' and 'labeltwo'". I first tried representing the query like this:

    SELECT files.* FROM files
    INNER JOIN files ON files.file_id = labels.file_id
    WHERE labels.label = 'labelone' AND labels.label = 'labeltwo'

    I soon realized that this never returns any rows since a single record expressed by the join can never actually have two labels atttached to it. The only way I have been able to get the desired results is via a self join like this:

    SELECT files.*
    FROM file INNER JOIN
    labels L1 ON files.file_id = L1.file_id INNER JOIN
    labels L2 ON L1.file_id = L2.file_id
    WHERE (L1.assetlabel = 'labelone') AND (L2.assetlabel = 'labeltwo')

    This appears to work fine, but I'm wondering if there is a simpler way to express this query since this can quickly get out of hand when dealing many labels. For example, if I want to check for the existance of five different labels I have to self join the labels table as many times. Any ideas/suggestions on how to simplify this type of query? I am open to the idea of redesigning the data structure if necessary. Thanks.

    - Jamie
  2. #2
  3. Big Endian
    Devshed Beginner (1000 - 1499 posts)

    Join Date
    May 2001
    Location
    Fly-over country
    Posts
    1,172
    Rep Power
    30
    Try something like:

    SELECT files.file_id, files.filename, COUNT(*)
    FROM files INNER JOIN labels
    ON files.file_id = labels.file_id
    WHERE labels.label IN ('labelone', 'labeltwo')
    GROUP BY files.file_id, files.filename
    HAVING COUNT(*) = 2

    Change your count to equal the number of labels in your IN command: If 5 labels then COUNT(*) = 5
  4. #3
  5. Junior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2002
    Location
    Netherlands
    Posts
    4
    Rep Power
    0
    Break down your problem into relevant parts.

    1. You don't need 'files' except for the filename. So forget about the 'files' part for the first part.

    2. So now, you need to get all files which have a certain label. Looking at your code, you don't need to know which file has which label, you just need the files. So, you use an aggregate query.

    select
    lbl.file_Id
    from
    labels lbl (nolock)
    group by
    lbl.file_Id
    where
    label = 'labelOne'
    or label = 'labelTwo'

    This will give you the fileId. Now, you need to know the filename. So try this:

    select
    fil.fileName,
    fil.file_Id
    from
    files fil (nolock)
    inner join
    (
    select
    lbl.file_Id
    from
    labels lbl (nolock)
    group by
    lbl.file_Id
    where
    label = 'labelOne'
    or label = 'labelTwo'
    ) lId (nolock)
    on lId.file_Id = fil.file_Id

    In SQL 2000, you could also write a function for the label table, have it return a table datatype and then inner join on the new table. This is my preferred method, since it takes a whole lot of illegible coding out of the SP's.

IMN logo majestic logo threadwatch logo seochat tools logo