Page 1 of 2 12 Last
  • Jump to page:
    #1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2014
    Posts
    14
    Rep Power
    0

    Query excludes results with partially empty entries


    Hello everyone,

    I have a problem with the correct query. Here is an excerpt from the ER-model and a few sample data entries in the tables concerned:


    I have that statement:
    Code:
    SELECT `prefix_questions`.`id`, `text`, ROUND(AVG(answer),2)
    FROM `prefix_questions`, `prefix_answers`, `prefix_process`
    WHERE `prefix_questions`.`id` <= 15
    AND `prefix_questions`.`id` = `prefix_answers`.`question_id`
    AND `prefix_answers`.`answer` <= 5
    AND `prefix_answers`.`answer` >= 1
    AND `prefix_process`.`lecture_id` = 86
    AND `prefix_process`.`id` = `prefix_answers`.`process_id`
    GROUP BY `prefix_answers`.`question_id`
    That gives me these results:

    for lecture_id = 85
    1 Question 1 Text? 3.00
    2 Question 2 Text? 1.33
    3 Question 3 Text? 2.66

    for lecture_id = 86
    1 Question 1 Text? 1.33
    3 Question 3 Text? 2.00

    But what I need to get is this:
    for lecture_id = 86
    1 Question 1 Text? 1.33
    2 Question 2 Text?
    3 Question 3 Text? 2.00

    That is, when no answers have been given to a question at all, the question shall still be displayed but just an empty space for the non-existent average. Am I completely gone astray? How could I achieve this?

    All help greatly appreciated.
    Thanks
    J.
  2. #2
  3. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,712
    Rep Power
    4288
    allow me to introduce you to explicit join syntax, in particular the LEFT OUTER JOIN
    Code:
    SELECT pq.id
         , pq.text
         , ROUND(AVG(pa.answer),2) AS avg_answer
      FROM prefix_questions AS pq
    LEFT OUTER
      JOIN prefix_answers AS pa
        ON pa.question_id = pq.id
       AND pa.answer BETWEEN 1 AND 5
    LEFT OUTER
      JOIN prefix_process AS pp
        ON pp.id = pa.process_id
       AND pp.lecture_id = 86
     WHERE pq.id <= 15
    GROUP 
        BY pq.id
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Sep 2015
    Location
    Canada
    Posts
    24
    Rep Power
    0
    Definitely need the explicit join syntax as above and the table aliases make queries so much easier to read.

    When I look at the model, is it correct?
    Surely the process is related to the questions directly and the answers are related to the questions...
    i.e. the process_id should be in the questions table.
  6. #4
  7. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,712
    Rep Power
    4288
    Originally Posted by element121
    Surely the process is related to the questions directly and the answers are related to the questions...
    i.e. the process_id should be in the questions table.
    the questions have different answers depending on which process it is

    if process_id were in the questions table, a question could belong to only one process
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2014
    Posts
    14
    Rep Power
    0
    Originally Posted by r937
    allow me to introduce you to explicit join syntax, in particular the LEFT OUTER JOIN
    Code:
    SELECT pq.id
         , pq.text
         , ROUND(AVG(pa.answer),2) AS avg_answer
      FROM prefix_questions AS pq
    LEFT OUTER
      JOIN prefix_answers AS pa
        ON pa.question_id = pq.id
       AND pa.answer BETWEEN 1 AND 5
    LEFT OUTER
      JOIN prefix_process AS pp
        ON pp.id = pa.process_id
       AND pp.lecture_id = 86
     WHERE pq.id <= 15
    GROUP 
        BY pq.id
    When I run your query (I have set up an excerpt db with the tables and entries as in my original post) then I get These results for lecture_id 86:

    1 Question 1 text? 2.17

    2 Question 2 text? 1.33

    3 Question 3 text? 2.33

    The shown averages are wrong. They are not the average answers given to question n but the average of the averages. And it also does not matter which lecture_id I choose, the result is always as above.

    What has to be altered? Thanks
    J.
  10. #6
  11. No Profile Picture
    Super Moderator
    Devshed Beginner (1000 - 1499 posts)

    Join Date
    Jun 2009
    Location
    Hartford, WI
    Posts
    1,465
    Rep Power
    70
    As far as the missing question 2 goes, with the manner of JOINs, it is because it would hold a value of NULL which IS NOT <= 15, so just have to add OR NULL to fix that part.

    Edit: I have to correct myself. This did bring back the 2nd question, and when trying to figure out the values, I made a full dump and noticed lecture_id 85 was then also being included. :-/ From what I am coming across so far, the WHILE is what's killing the full return of the table.

    Edit2: Is this required to be straight MySQL? If you are using another language to display the output, such as PHP, this would be extremely easier. How is the results intended to be displayed?
    Last edited by Triple_Nothing; January 12th, 2016 at 01:52 PM.
    He who knows not that he knows not is a fool, ignore him. He who knows that he knows not is ignorant, teach him. He who knows not that he knows is asleep, awaken him. He who knows that he knows is a leader, follow him.
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2014
    Posts
    14
    Rep Power
    0
    Originally Posted by Triple_Nothing

    Edit2: Is this required to be straight MySQL? If you are using another language to display the output, such as PHP, this would be extremely easier. How is the results intended to be displayed?
    I do use PHP to interface with MySQL, and yes, I have thought about using PHP to query the pieces and stitch them together. Just thought if MySQL can handle to let it handle it.

    J.
  14. #8
  15. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,712
    Rep Power
    4288
    Originally Posted by Triple_Nothing
    As far as the missing question 2 goes, with the manner of JOINs, it is because it would hold a value of NULL which IS NOT <= 15, so just have to add OR NULL to fix that part.
    nope

    it's pq LEFT pa LEFT pp and the WHERE condition is on pq so you don't test for NULL there

    Comments on this post

    • Triple_Nothing agrees : Ya, I've no clue where that came from... ;)
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  16. #9
  17. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,712
    Rep Power
    4288
    Originally Posted by JLSeagull
    Just thought if MySQL can handle to let it handle it.
    it surely can

    any chance you could create a dump of sample rows?

    i want to test but i'm too lazy to code it up
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  18. #10
  19. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2014
    Posts
    14
    Rep Power
    0
    Originally Posted by r937
    ... any chance you could create a dump of sample rows?
    sure can do...:
    Code:
    -- phpMyAdmin SQL Dump
    -- version 3.5.8.1
    -- http://www.phpmyadmin.net
    --
    -- Host: *******
    -- Erstellungszeit: 12. Jan 2016 um 23:41
    -- Server Version: 5.6.27-nmm1-log
    -- PHP-Version: 5.5.30-nmm1
    
    SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";
    SET time_zone = "+00:00";
    
    
    /*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
    /*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
    /*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
    /*!40101 SET NAMES utf8 */;
    
    --
    -- Datenbank: `*******`
    --
    
    -- --------------------------------------------------------
    
    --
    -- Tabellenstruktur für Tabelle `prefix_answers`
    --
    
    CREATE TABLE IF NOT EXISTS `prefix_answers` (
      `id` int(2) NOT NULL AUTO_INCREMENT,
      `question_id` int(2) DEFAULT NULL,
      `process_id` int(2) DEFAULT NULL,
      `answer` int(2) DEFAULT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=16 ;
    
    --
    -- Daten für Tabelle `prefix_answers`
    --
    
    INSERT INTO `prefix_answers` (`id`, `question_id`, `process_id`, `answer`) VALUES
    (1, 1, 1, 2),
    (2, 2, 1, 2),
    (3, 3, 1, 3),
    (4, 1, 2, 4),
    (5, 2, 2, 1),
    (6, 3, 2, 3),
    (7, 1, 3, 3),
    (8, 2, 3, 1),
    (9, 3, 3, 2),
    (10, 1, 4, 1),
    (11, 3, 4, 2),
    (12, 1, 5, 1),
    (13, 3, 5, 2),
    (14, 1, 6, 2),
    (15, 3, 6, 2);
    
    -- --------------------------------------------------------
    
    --
    -- Tabellenstruktur für Tabelle `prefix_process`
    --
    
    CREATE TABLE IF NOT EXISTS `prefix_process` (
      `id` int(2) NOT NULL AUTO_INCREMENT,
      `lecture_id` int(2) DEFAULT NULL,
      `student_id` int(2) DEFAULT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=11 ;
    
    --
    -- Daten für Tabelle `prefix_process`
    --
    
    INSERT INTO `prefix_process` (`id`, `lecture_id`, `student_id`) VALUES
    (1, 86, 11),
    (2, 86, 12),
    (3, 86, 13),
    (4, 87, 11),
    (5, 87, 14),
    (6, 87, 15),
    (7, 88, 13),
    (8, 88, 16),
    (9, 88, 17),
    (10, 88, 18);
    
    -- --------------------------------------------------------
    
    --
    -- Tabellenstruktur für Tabelle `prefix_questions`
    --
    
    CREATE TABLE IF NOT EXISTS `prefix_questions` (
      `id` int(2) NOT NULL AUTO_INCREMENT,
      `text` varchar(100) DEFAULT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=4 ;
    
    --
    -- Daten für Tabelle `prefix_questions`
    --
    
    INSERT INTO `prefix_questions` (`id`, `text`) VALUES
    (1, 'Question 1 text?'),
    (2, 'Question 2 text?'),
    (3, 'Question 3 text?');
    
    /*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
    /*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
    /*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
    I would really love to glance over your shoulder now.

    J.
  20. #11
  21. No Profile Picture
    Super Moderator
    Devshed Beginner (1000 - 1499 posts)

    Join Date
    Jun 2009
    Location
    Hartford, WI
    Posts
    1,465
    Rep Power
    70
    How does this work? (The SUM() & COUNT() are just a returned visual for the AVG(). You can delete them.)

    Code:
    SELECT q.id, q.text, SUM(lj.answer) AS group_sum, COUNT(lj.answer) AS divided_by, ROUND(AVG(lj.answer),2) AS final_answer
    FROM prefix_questions AS q
    LEFT JOIN (
      SELECT a.process_id AS pid, a.question_id AS qid, a.answer AS answer
      FROM prefix_answers AS a
      INNER JOIN prefix_process AS p ON a.process_id = p.id
      WHERE p.lecture_id = 86
    ) AS lj ON lj.qid = q.id
    GROUP BY q.id;
    I forgot to add in the <= 15 and 1-5, so I placed back in here and removed SUM() & COUNT():
    Code:
    SELECT q.id, q.text, ROUND(AVG(lj.answer),2)
    FROM prefix_questions AS q
    LEFT JOIN (
      SELECT a.process_id AS pid, a.question_id AS qid, a.answer AS answer
      FROM prefix_answers AS a
      INNER JOIN prefix_process AS p ON a.process_id = p.id
      WHERE p.lecture_id = 86
        AND a.answer BETWEEN 1 AND 5
    ) AS lj ON lj.qid = q.id
    WHERE q.id <= 15
    GROUP BY q.id;
    Last edited by Triple_Nothing; January 14th, 2016 at 06:33 PM.
    He who knows not that he knows not is a fool, ignore him. He who knows that he knows not is ignorant, teach him. He who knows not that he knows is asleep, awaken him. He who knows that he knows is a leader, follow him.
  22. #12
  23. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,712
    Rep Power
    4288
    Originally Posted by JLSeagull
    I would really love to glance over your shoulder now.
    das freut mich

    thanks for the data... first thing i did is run the following query --
    Code:
    SELECT pp.lecture_id
         , pq.id
         , pq.text
         , pp.student_id
         , pa.answer
      FROM prefix_questions AS pq
    INNER
      JOIN prefix_answers AS pa
        ON pa.question_id = pq.id
       AND pa.answer BETWEEN 1 AND 5
    INNER 
      JOIN prefix_process AS pp
        ON pp.id = pa.process_id
    ORDER 
        BY pp.lecture_id
         , pq.id
         , pp.student_id
    notice i've used INNER JOINs, not restricted the lecture_id, and arranged the columns into a meaningful sequence

    results --
    Code:
    86   1   Question 1 text?   11   2      
    86   1   Question 1 text?   12   4    
    86   1   Question 1 text?   13   3  3.00
    86   2   Question 2 text?   11   2    
    86   2   Question 2 text?   12   1    
    86   2   Question 2 text?   13   1  1.33
    86   3   Question 3 text?   11   3    
    86   3   Question 3 text?   12   3    
    86   3   Question 3 text?   13   2  2.67
    87   1   Question 1 text?   11   1    
    87   1   Question 1 text?   14   1    
    87   1   Question 1 text?   15   2  1.33
    87   3   Question 3 text?   11   2    
    87   3   Question 3 text?   14   2    
    87   3   Question 3 text?   15   2  2.00
    now, if i understand you correctly, you want the average answer per question per lecture, which i've calculated by hand and placed on the last row for each question for each lecture

    with me so far?

    in the next step, we will address the issue of the missing question 2 in lecture 87
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  24. #13
  25. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2014
    Posts
    14
    Rep Power
    0
    Originally Posted by r937
    ... if i understand you correctly, you want the average answer per question per lecture, which i've calculated by hand and placed on the last row for each question for each lecture

    with me so far?
    I still have to get my head around JOINS, but so far I was able to follow. And yes, that is what I need. To elucidate the whole thing for the reader: this is part of an online evaluation tool for lectures. In each lecture the participating students are asked the same questions about the lecture which results in ranks from 1 to 5. Sometimes there is also the option to select that this particular question is not applicable to the lecture which is to be evaluated. Hence the condition that the answer is to be between 1 and 5. Sometimes students decide not to answer a question or two, then the average is calculated from fewer answers. But if all students decide not to answer a particular question, then there won't be an average to calculate. This query is supposed to extract the results for each lecture and if that was all it had to do, I would be just fine. But it is a requirement to also compare these results to the averages over all lectures. And here the script has to fill the gap that is created from missing averages, because for the overall comparison, there will always be a value to compare with.

    Maybe this highlights a little why I am so bound on having that "missing" result returned. And so I eagerly await step 2 :-)

    Thanks for the great help so far!
    J.
  26. #14
  27. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2014
    Posts
    14
    Rep Power
    0
    Originally Posted by Triple_Nothing
    How does this work? ...

    Code:
    SELECT q.id, q.text, ROUND(AVG(lj.answer),2)
    FROM prefix_questions AS q
    LEFT JOIN (
      SELECT a.process_id AS pid, a.question_id AS qid, a.answer AS answer
      FROM prefix_answers AS a
      INNER JOIN prefix_process AS p ON a.process_id = p.id
      WHERE p.lecture_id = 86
        AND a.answer BETWEEN 1 AND 5
    ) AS lj ON lj.qid = q.id
    WHERE q.id <= 15
    GROUP BY q.id;
    This works as requested. Maybe it is still time for a New Years resolution and I shall try to "get" JOINS, they seem to make life that much easier.

    Thank you
    J.
  28. #15
  29. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,712
    Rep Power
    4288
    i realize i neglected to come back to this problem, sorry

    has it been solved to your satisfaction, J?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
Page 1 of 2 12 Last
  • Jump to page:

IMN logo majestic logo threadwatch logo seochat tools logo