#1
  1. A Change of Season
    Devshed Loyal (3000 - 3499 posts)

    Join Date
    Mar 2004
    Location
    Next Door
    Posts
    3,246
    Rep Power
    220

    Is this table designed properly to store millions of records?


    Hi;

    I got a site with 400 unique visitors per minute. We expect it to get "really big" (sorry Rudy) since we store every click.

    We need to store the stats this way, I wanna make sure this table can handle it.

    Is there potential for issues? Like slowing site down (that i could prevent with indexing) or anything else?

    Code:
    CREATE TABLE `analytics` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `start` varchar(255) NOT NULL,
      `end` varchar(255) NOT NULL,
      `duration_seconds` int(11) NOT NULL,
      `ip` varchar(255) NOT NULL,
      `url` varchar(255) NOT NULL,
      `server_name` varchar(255) NOT NULL,
      `full_url` varchar(255) NOT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=InnoDB AUTO_INCREMENT=1132431 DEFAULT CHARSET=utf8;
    Thanks
    Last edited by English Breakfast Tea; September 13th, 2017 at 06:37 AM.
  2. #2
  3. Lazy Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    16,436
    Rep Power
    9645
    - Don't know what start and end are but varchar(255) doesn't sound right
    - Page duration will not be negative and will certainly not be up to 68 years long
    - The longest IP address is 79 characters (a minor point considering the nature of VARCHAR)
    - Having url, server_name, and full_url sounds redundant
    - Is your auto_increment really starting at 1132431? In fact I don't think I would bother with an id column anyways - what does it gain you?
    - Indexes?

    Comments on this post

    • r937 agrees : masterful
    Last edited by requinix; September 13th, 2017 at 08:30 AM.
  4. #3
  5. A Change of Season
    Devshed Loyal (3000 - 3499 posts)

    Join Date
    Mar 2004
    Location
    Next Door
    Posts
    3,246
    Rep Power
    220
    Originally Posted by requinix
    - Don't know what start and end are but varchar(255) doesn't sound right
    - Page duration will not be negative and will certainly not be up to 68 years long
    - The longest IP address is 79 characters (a minor point considering the nature of VARCHAR)
    - Having url, server_name, and full_url sounds redundant
    - Is your auto_increment really starting at 1132431? In fact I don't think I would bother with an id column anyways - what does it gain you?
    - Indexes?
    Here is sample data

    How about:

    - start and end varchar 30
    - Duration is maximum 60 minutes, so int 2
    - IP varchar 79


    Also.... what are the real issues with varchar 255?

    Thanks
  6. #4
  7. Lazy Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    16,436
    Rep Power
    9645
    If start and end are timestamps then you should be using timestamps - either DATETIME or TIMESTAMP. Never strings. Certainly never strings with two-digit years.

    Don't split the duration into minutes and seconds. That's just creating more work for you. Keep it as seconds but cap it at, say, 65535s (SMALLINT) which is >18h - anything larger is probably an anomaly, like someone leaving the page open overnight. In fact anything more than one hour is probably anomalous; you could go lower but floating-point or DECIMAL but it'll be more than 2 bytes so I'd keep the bounds checks in code.

    There's no issue with VARCHAR(255) for an IP address. The "V" means varying, as in the value won't use 255 bytes but however many bytes are in the value itself plus one for the length of the string (so "12.34.56.78" would take 11+1 bytes). So the same IP address in a VARCHAR(79) vs. a VARCHAR(255) will be the same. Personally I wouldn't use 255 because to me that value means "I don't know how long it will be but it's no more than 255", while (eg) 100 means "this is a particular number I know is enough to hold the value". 79 is fine too but I like round numbers.
    Last edited by requinix; September 13th, 2017 at 10:21 AM.

IMN logo majestic logo threadwatch logo seochat tools logo