#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2011
    Posts
    118
    Rep Power
    50

    Rule of Thumb? Multiple Joins vs. Multiple Queries?


    I have never seen any and a quick Google search didn't help me.

    Are there any rules of thumbs which help with making the decision when it is better to run a separate query vs. adding some extra tables via join to get the info in one big request. My guess is the answer is case-by-case via testing.

    I have a query where I need to join four tables for the basic information to display to the user. I have one more bit of information to grab which is two tables away. (An ID in the main query gets me the the row I need on another table which gets me the row I need on a third table which has the name I am looking up.)

    So, more or less:
    SELECT ID, other-stuff FROM a join b join c join d WHERE ...
    then
    SELECT name FROM e join f WHERE ID=id-from-before

    Vs.
    SELECT ID, name, other-stuff FROM a join b join c join d join e join f


    The tables are not enormous (yet). The place where the main information is - is growing quickly. The other lookup tables will only be a couple of hundred records.

    At this time, I can measure no significant performance difference in a big-join vs a second query.

    My preference is for the two queries because it keeps the purpose separate and easier to explain for maintenance purposes.

    But I'm wondering if there might be long term performance benefit of one big join as the tables and number of users grows.

    Thanks in advance.
  2. #2
  3. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,212
    Rep Power
    4279
    Originally Posted by EEsterling
    At this time, I can measure no significant performance difference in a big-join vs a second query.
    this is going to remain the case for the foreseeable future

    after that, you can tweak the indexes, and make it work for even longer

    go fight some other battles

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  4. #3
  5. No Profile Picture
    Lost in code
    Devshed Supreme Being (6500+ posts)

    Join Date
    Dec 2004
    Posts
    8,317
    Rep Power
    7170
    If you're running the queries from an application language like PHP, it's quite possible that the overhead involved in running the extra queries will make using joins significantly faster. There are exceptions, but no hard and fast rules that define them. This is one of the primary reasons why WordPress is so inefficient.
    PHP FAQ

    Originally Posted by Spad
    Ah USB, the only rectangular connector where you have to make 3 attempts before you get it the right way around
  6. #4
  7. Turn left at the third duck
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2011
    Location
    Nelson, NZ
    Posts
    112
    Rep Power
    93
    why WordPress is so inefficient.
    Ah, so nice to hear, thank you for the morsel of sanity.

    Regarding the original question, my knowledge is second-hand, by reading High-Performance MySQL, The Art of SQL and others. While everything in these books points to a nuanced, case-by-case look at each situation, my impression is that overall, as a fat rule of thumb, for four tables, the authors would often favor a single well-formulated query with JOINs rather than multiple queries, under the assumption that the optimizer should handle JOINs particularly well—plus the round-trip multi-layer latency story. In "well-formulated," there are a number of things, such as the question of using appropriate (possibly covering) indexes, and also the question of reducing the data set as fast as possible. Not an expert answer, just sharing the fruits of my reading.

    On a personal note, I had a case recently where the two-query approach was faster (benchmarking by looping 10 thousand times). That had to do with one of the two queries being cache-able, but not the big one; also, the big query was a contrived spaghetti with some left JOINs to retrieve disconnected data into a single row, probably not your case.
  8. #5
  9. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2011
    Posts
    118
    Rep Power
    50

    Thanks


    It is a PHP application.

    Originally Posted by ragax
    the big query was a contrived spaghetti with some left JOINs to retrieve disconnected data into a single row, probably not your case.
    That sounds exactly the same to me.

    Unfortunately, I realized I erred in thinking about the problem for the specific case that got me thinking about it.

    The big ugly query gets about 30 rows. (Probably will always be 20-50). PHP then lays out the data for each row.

    Where I erred is I forgot that I don't run the second simple query once. I run it for each row. So, I'm really making ~31 queries, not just two.

    So, maybe I should make my big ugly query bigger and uglier. But I won't bother now since I don't have a performance issue.
  10. #6
  11. Turn left at the third duck
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2011
    Location
    Nelson, NZ
    Posts
    112
    Rep Power
    93
    Sounds like a fun tweaking project for a winter evening.

    I should make my big ugly query bigger and uglier
    I've started adding copious comments to large queries, but definitely need to learn more about making "ugly" queries look more presentable. Indentation also helps. But I guess the core of the problem is going from the multi-dimensional nature of a query involving several tables, conditions etc to the linear nature of written code.

IMN logo majestic logo threadwatch logo seochat tools logo