#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    44
    Rep Power
    5

    a big mysql table issue


    Hi,

    I have a table with a structure like as follows:

    Code:
    CREATE TABLE IF NOT EXISTS `india_sessdata` (
      `RecNo` int(10),
      `Level` tinyint(4),	
      `Username`,
      `Day_Status`,
      `TimeIn`,
      `Attempts`,
      `AW_ R1`,  `AW_ R2`,  `AW_PC`,  `AW_DM`,  `AW_TT`,  `AW_SD`,  
      `AV_SP`,  `AV_PC`,  `AV_DM`,  `AV_TT`,  `AV_SD`,  `AV_SP`,  `AV_ST`,
      `AO_S1`,  `AO_PC`,  `AO_DM`,  `AO_TT`,  `AO_SD`,  `AO_SP`,  `AO_ST`,
    
      `BW_ R1`,  `BW_ R2`,  `BW_PC`,  `BW_DM`,  `BW_TT`,  `BW_SD`,
      `BV_SP`,  `BV_PC`,  `BV_DM`,  `BV_TT`,  `BV_SD`,  `BV_SP`,  `BV_ST`,
      `BO_S1`,  `BO_PC`,  `BO_DM`,  `BO_TT`,  `BO_SD`,  `BO_SP`,  `BO_ST`,
    
      `CW_ R1`,  `CW_ R2`,  `CW_PC`,  `CW_DM`,  `CW_TT`,  `CW_SD`,
      `CV_SP`,  `CV_PC`,  `CV_DM`,  `CV_TT`,  `CV_SD`,  `CV_SP`,  `CV_ST`,
      `CO_S1`,  `CO_PC`,  `CO_DM`,  `CO_TT`,  `CO_SD`,  `CO_SP`,  `CO_ST`,
    
      PRIMARY KEY (`RecNo`)	
    ) ENGINE=InnoDB  DEFAULT CHARSET=latin1 AUTO_INCREMENT=79 ;
    These are scores that are generated during a game that a child plays and are stored for each session. However in the first level 1, only the A category scores are saved. In level 2 A and B category scores are saved and in level 3 and above all A,B & C category scores are saved.

    I have a few questions to ask regarding this.

    1. Is it all right to have a single table save all the scores even though with a bit of redundency.
    2. If the table becomes very big (say a million rows) would it effect the speed of the database if there are a huge number of simultaneous users.
    3. How can this table be split to normalize it.
    4. What other precautions should be taken to ensure a that the database performs at good speed and efficiently.

    If some questions are vague kindly forgive.
    Thanks very much.
  2. #2
  3. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,663
    Rep Power
    4288
    1. yes it is (and i don't see where your redundancy is)

    2. possibly... depends on your SELECT queries, what they key on... presumably you will search by username, so that columns should have an index

    3. have a separate scores table for the category scores -- one column to hold the score type (e.g. 'BW-DM') and a second to hold the score

    4. indexing the right columns, writing efficient queries, and tuning the server

    Comments on this post

    • Will-O-The-Wisp agrees
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  4. #3
  5. Lord of the Dance
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Oct 2003
    Posts
    4,109
    Rep Power
    2010
    One method could be having a second entity-type table, where you would store result like this:
    Code:
    KEY	VALUE
    AW_R1	Value_for_AW_R1
    AW_R2	Value_for_AW_R2
    etc.
    Only issue to this would be if all fields for specific category are to be filled out.

    For this, an alternative table layout could be to have the fields without the front letter (leaving out the category):
    Code:
      `W_ R1`,  `W_ R2`,  `W_PC`,  `W_DM`,  `W_TT`,  `W_SD`,  
      `V_SP`,  `V_PC`,  `V_DM`,  `V_TT`,  `V_SD`,  `V_SP`,  `V_ST`,
      `O_S1`,  `O_PC`,  `O_DM`,  `O_TT`,  `O_SD`,  `O_SP`,  `O_ST`,
    and instead add a category type field, where you only accept the value of A, B and C.

    [EDIT]
    And of course was r937 just that faster.

    Comments on this post

    • Will-O-The-Wisp agrees
    Last edited by MrFujin; May 7th, 2015 at 02:26 AM. Reason: Typos
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    44
    Rep Power
    5
    Hi ! Thank you SQL Consultant and Mr.Fujin very much for the reply.

    @ SQL Consultant. As regards the redundancy, I am sorry I did not make it clear that when the child is in level one ( and the child can take say a month to clear level 1), The only scores he would be generating would be the A category scores. Once he moves to level 2, (say in another 1 month after level 1 ) he would be generating A and B category scores and finally say he reaches level 3 in the 3rd month he would be finally generating all category of scores. namely A,B and C. So that's where I guess the redundancy would occur.

    Since records are generated in this table on data inserts, the RecNo being an auto increment field, I have not set an indexes as I thought that would serve the same purpose. My queries do need to collect the data from the table for each user to display it graphically. So for example if there is a user John, my select query would select the scores data for John from the day he started till date and then display that graphically. Would indexing on username help reduce the query time ? I think that's what you mean. Please clarify this.

    As regards normalizing the data I would be glad if you could illustrate through a small example using my table. I learn fast using examples !

    @ Mr. Fujin: I think I understand the 2nd example better than the first, though the first seems to be simpler. I will ponder some more on both these before I ask any further questions on them. If you could elaborate the first example just a little more it would be simpler for me to understand.

    Thank you both very much for your help. I really appreciate it.
  8. #5
  9. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,663
    Rep Power
    4288
    Originally Posted by ajoo
    So that's where I guess the redundancy would occur.
    so the same child can have multiple rows in this table? uh oh...

    Would indexing on username help reduce the query time ? I think that's what you mean. Please clarify this.
    yes

    As regards normalizing the data I would be glad if you could illustrate through a small example using my table. I learn fast using examples !
    okay, sure

    heep your child table, but remove all the scores columns and replace them with this single table --
    Code:
    CREATE TABLE child_scores
    ( username
    , category
    , scoretype
    , score
    );
    INSERT INTO child_scores VALUES
     ( 'freddie' , 'A' , 'R1' , 4.03 )
    ,( 'freddie' , 'A' , 'R2' , 5.21 )
    ,( 'freddie' , 'A' , 'PC' , 1.66 )
    ,( 'freddie' , 'A' , 'DM' , 9.37 )
    , ...
    Last edited by r937; May 7th, 2015 at 01:09 PM.
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  10. #6
  11. Lord of the Dance
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Oct 2003
    Posts
    4,109
    Rep Power
    2010
    My first example is the third point in r937 post.

    In the first example, you could have a table like this:
    Code:
    CREATE TABLE IF NOT EXISTS `india_sessdata_catvalues` (
      `RecNo` ,
      `CategoryName`,
      `CategoryValue`,
    )
    This table will then have to be joined to your table based on the record number (recNo).

    The difference with this is that you will have each category and its value on own row.
  12. #7
  13. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,663
    Rep Power
    4288
    too late again, mister fu
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  14. #8
  15. Lord of the Dance
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Oct 2003
    Posts
    4,109
    Rep Power
    2010
    And you even stole my idea of splitting the category and scoretype.

    Comments on this post

    • r937 agrees : love you too <3
  16. #9
  17. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    44
    Rep Power
    5
    Hi ! Thanks r937 and Mr. Fujin once again for the reply. I think I will try the to split the table now as I seem to get the idea from the example by r937. If I still have any questions I will revert again !

    Grateful to you both for the help.

    Thanks loads !
  18. #10
  19. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    44
    Rep Power
    5
    HI ! I am back with another question after thinking about the solutions offered by r937 and Mr. Fujin.

    Considering that as suggested, I create another table where I store just the score_types and score along with each unique username to connect to the parent table, I realize that I require yet another field, namely Time_in, which will store the scores on a given date. Therefore in the child table I will need to put a new column called Time_in and fill it up. If the number of scores calculated each day are say about 80. Then each of those 80 scores will have the same Time_in. Now would not this lead to a lot of redundancy? 80 same date values to the scores each day.

    As I mentioned earlier that I need to display the scores graphically across days, so I would need the time_in in my select query to collect all scores on a given time_in. ( Time_in is same for all scores of a given level on a given day. Once the user has generated these scores on a given day (Time_in) he cannot login again before 24 hours. )

    Looking for the experts opinion and a solution to this issue.
    Thanks loads
  20. #11
  21. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,663
    Rep Power
    4288
    you seem to be concerned with redundancy

    consider a company with thousands of employees, and an address table for those employees

    now consider that most of those employees will have the same province or state code in their addresses

    this is ~not~ redundancy!!
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  22. #12
  23. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    44
    Rep Power
    5
    Hi r937,

    Thanks loads for the reply. The reason I am talking about redundancy is that tables are normally normalized to reduce that. Please correct me if I am wrong. I got the idea that we are creating these child tables to normalize the bigger table.

    I'll start from the beginning and rephrase my question. My table holds scores for all levels. Some of those are scores are not recorded in the initial levels. I thought of using a dummy value of zero for those scores that are not required in the initial levels. Then I was informed that such large tables can slow down the database operations and so I thought of reducing the table sizes and breaking them up. SO i read up about Tables and redundancy and provided the example above.

    My question then is that if my table has about 90 columns of scores, all of which are required, ( ignoring the earlier statement that some scores in some levels are not recorded), would this table, simply because of the number of columns involved and with a large number of users ( rows), get slowed down or bogged down the application. ( Assuming that the SQL queries are well optimized for speed)?

    I hope I am able to explain and ask my question wisely.
    Looking forward to your reply.
    Thanks very much and much obliged.
  24. #13
  25. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,663
    Rep Power
    4288
    Originally Posted by ajoo
    The reason I am talking about redundancy is that tables are normally normalized to reduce that.
    well, sort of

    here are some quotes from a very quick google search --

    "The objective is to isolate data so that additions, deletions, and modifications of an attribute can be made in just one table and then propagated through the rest of the database using the defined foreign keys."

    "The purpose of normalization is to maximize the efficiency of a database."

    "In creating a database, normalization is the process of organizing it into tables in such a way that the results of using the database are always unambiguous and as intended."

    "There are three main reasons to normalize a database. The first is to minimize duplicate data, the second is to minimize or avoid data modification issues, and the third is to simplify queries."

    redundancy, per se, as my example with the province/state code shows, is not necessarily a concern


    Originally Posted by ajoo
    My question then is that if my table has about 90 columns of scores, all of which are required, ( ignoring the earlier statement that some scores in some levels are not recorded), would this table, simply because of the number of columns involved and with a large number of users ( rows), get slowed down or bogged down the application. ( Assuming that the SQL queries are well optimized for speed)?
    no

    consider for a second what you'd have to do if you wanted to add some more score types

    in your 90-column model, you'd have to ALTER the table, and write special code to handle the new columns

    in the normalized table, you'd have to do... absolutely nothing

    Comments on this post

    • Will-O-The-Wisp agrees
    • ajoo agrees
    Last edited by r937; May 12th, 2015 at 07:42 AM.
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  26. #14
  27. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    44
    Rep Power
    5
    Great answer !! clears many of my doubts. Also a very unambiguous and precise example of the advantage of a normalised table other than removing redudancy.

    Thank you very much. r937

IMN logo majestic logo threadwatch logo seochat tools logo