Thread: closest join

    #1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2012
    Posts
    109
    Rep Power
    6

    closest join


    Hi.
    This is a tough one.
    i have one table t1 (created through multiple joins, but let's not get into that) which contains
    user | pattern1 | cost

    i'm trying to join this with a different table t2 which contains
    group | pattern2 | user | priority

    the users are the same and i can join on t1.users=t2.users. the tricky part is joining on the pattern because some of the strings in pattern2 are longer than those in pattern1 and vice versa. I'm trying to match pattern2 to the closest string in pattern1 which is the same length or shorter.
    since t1 is the results of a pretty heavy query, i would very much like to run it only once
    i.e. i don't want to do anything like
    if(pattern2=(select pattern1 from t1),if(substring(pattern2,0,length(pattern2-1).....
    any ideas??

    Thanks in advance
  2. #2
  3. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,663
    Rep Power
    4288
    could you show us some represenative strings in both tables please, some that will match and some that won't

    having a hard time figuring out what you mean by "closest"
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2012
    Posts
    109
    Rep Power
    6
    sorry for being unclear.
    the patterns are numeric.
    here are some examples
    pattern1: 123, 456, 457, 45678, 4567890454, 90
    pattern2: 12345, 456789, 90

    ideally the join should match:
    12345 to 123
    456789 to 45678 only
    90 to 90
  6. #4
  7. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,663
    Rep Power
    4288
    Originally Posted by bobert123
    456789 to 45678 only
    why wouldn't 456789 match 4567890454?

    why wouldn't 456789 match 456?


    Originally Posted by bobert123
    90 to 90
    why wouldn't 90 also match 4567890454?
    Last edited by r937; July 24th, 2014 at 04:08 AM.
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  8. #5
  9. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2012
    Posts
    109
    Rep Power
    6
    Hi Rudy,
    As I wrote earlier
    I'm trying to match pattern2 to the closest string in pattern1 which is the same length or shorter.
  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2012
    Posts
    109
    Rep Power
    6
    and i'm trying to match left to right. sorry if that wasn't clear
  12. #7
  13. Lord of the Dance
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Oct 2003
    Posts
    4,109
    Rep Power
    2010
    Maybe you are looking for something like this:
    How to like two columns in one SQL statement - Stack Overflow

    Next step would then be to get the closest string. (Do you actually mean to match the shortest match?)

    Maybe an possible solution would be to sum the length of the two pattern, group by userid and then take the row for each userid with the row/record having the shortest (MIN) number of length?
    Last edited by MrFujin; July 24th, 2014 at 05:05 AM.
  14. #8
  15. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2012
    Posts
    109
    Rep Power
    6
    thanks but i am trying to find the longest match, not the shortest. i can try to use the max length for that, which is a good idea. do you have any idea what the syntax of that would look like.
  16. #9
  17. Lord of the Dance
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Oct 2003
    Posts
    4,109
    Rep Power
    2010
    You should really answer the questions by r937.

    You want to get pattern 1 and what it match in pattern 2, e.g.
    123 is contained/matched to 12345
    45678 is contained/matched into 456789
    90 is contained/matched into 90
    456 no match because of 45678 is a longer match to 456789
    457 no match to pattern 2
    4567890454 no match to pattern 2
  18. #10
  19. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2012
    Posts
    109
    Rep Power
    6
    Hi Mr Fujin, Apologies. I throught that my answer to R937 was clear.
    You want to get pattern 1 and what it match in pattern 2, e.g.
    123 is contained/matched to 12345
    45678 is contained/matched into 456789
    90 is contained/matched into 90
    456 no match because of 45678 is a longer match to 456789
    correct, this is what I am looking for. any suggestions for syntax?
  20. #11
  21. Lord of the Dance
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Oct 2003
    Posts
    4,109
    Rep Power
    2010
    The syntax for the two column match can be seen in the link I provided in previous post.
    This will give you a list of all match for each user and pattern2.

    If pattern 2 for one user match two pattern 1, you want to the one with MAX (CHAR_)LENGTH().
    This will need group by on the user and pattern 2.

    Try take a look at this site:
    How to select the first/least/max row per group in SQL | Xaprb

    I am not that skilled to give a combined valid SQL syntax for this.

    Comments on this post

    • r937 agrees : 3000 bonus points for xaprb link
  22. #12
  23. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2012
    Posts
    109
    Rep Power
    6
    thanks. this is very helpful

IMN logo majestic logo threadwatch logo seochat tools logo