#1
  1. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2012
    Location
    Southern New Jersey, USA
    Posts
    205
    Rep Power
    102

    Partioning question for 5.6


    Hello everyone,

    Currently I have a (4) huge tables with lots of data within them. Now currently our production databases are 5.0.51a and I know that although Partioning did not come about until 5.1, there is a old school work arround but it means their is great risk of fragmention and rewrite our business code to see the different tables/partions.

    So testing 5.6, and enabling partiting support...I also know that in Mysql 5.6 you can only partion by Range, List, Hash, and Key. Now MySQL does not support Date partioning, but you could do Range Partioning and use a column that was; DATE, TIME, or DATETIME.

    So here is the describe on one of my tables:


    mysql> desc ClosedPeriodActivitySummary;
    +-------------+--------------------------------------+------+-----+-------------------+-------+
    | Field | Type | Null | Key | Default | Extra |
    +-------------+--------------------------------------+------+-----+-------------------+-------+
    | FundId | int(11) | NO | MUL | NULL | |
    | BegDate | date | NO | MUL | NULL | |
    | EndDate | date | NO | MUL | NULL | |
    | Category | enum('CASH','MARKET') | YES | MUL | NULL | |
    | ItemName | varchar(32) | YES | MUL | NULL | |
    | Amount | decimal(16,2) | YES | | NULL | |
    | Direction | enum('INFLOW','OUTFLOW','NET-VALUE') | YES | MUL | NULL | |
    | LastUpdated | timestamp | NO | | CURRENT_TIMESTAMP | |
    +-------------+--------------------------------------+------+-----+-------------------+-------+
    8 rows in set (0.01 sec)

    So if i wanted to build a Partioned table for this table, and wanted to use RANGE Partioning...and wanted to base the Partions off the column EndDATE which is data type 'YYYY-MM-DD', and I wanted say the parttions to be named for the last day of each month for 2012 and 2013 (24) total

    It would look like the following:

    create table ClosedPeriodActivitySummary (
    (column one),
    BegDate date null,
    EndDate null,
    (column four),
    (column five),
    (column six),
    (coulmn seven),
    (column eight)
    )

    PARTITION BY RANGE (EndDate) (
    PARTITION jan2012 VALUE LESS THAN (2012-01-31),
    PARTITION feb2012 VALUE LESS THAN (2012-02-28),
    PARTITION mar2012 VALUE LESS THAN (2012-02-31),

    (many more monthly partitions until)
    PARTITION dec2013 VALUE LESS THAN (2013-12-31)
    );

    Is that basically correct?

    Next under this setup what happens if someone does a query from a specific partition say feb2012 which would have any entry from date 2012-02-01 till 2012-02-28. Now say they only wanted from the specific date of 2012-02-15, would they just get entrys from just that specific date of the 15th or would they get all entrys from within the partition from the 1st through the 28th?
  2. #2
  3. Hockey face
    Devshed Supreme Being (6500+ posts)

    Join Date
    Nov 2001
    Location
    St. Catharines, Canada
    Posts
    8,144
    Rep Power
    1316
    when you say huge, what size is huge? are you partitioning when you don't have to?

    Do you have hundreds of millions of rows of data in that specific table for instance?
  4. #3
  5. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2012
    Location
    Southern New Jersey, USA
    Posts
    205
    Rep Power
    102
    Sorry it took so long to get back to you but my youngest son came down with the flue and I came down with a nasty cold(not the flue).

    Ok to answer the size qestion for a one month time on the four tables:

    mysql> select count(*) from ClosedPeriodActivitySummary where BegDate between '2013-01-01' and '2013-01-31';
    +----------+
    | count(*) |
    +----------+
    | 3120 |
    +----------+
    *** Avg_row_length 89


    mysql> select count(*) from ClosedPeriodBalancesSummary where BegDate between '2013-01-01' and '2013-01-31';
    +----------+
    | count(*) |
    +----------+
    | 2232 |
    +----------+
    *** Avg_row_length 139


    mysql> select count(*) from ClosedPeriodPositions where PeriodBeginDate between '2013-01-01' and '2013-01-31';
    +----------+
    | count(*) |
    +----------+
    | 11668 |
    +----------+
    *** Avg_row_length 912

    mysql> select count(*) from ClosedPeriodTransactions where PeriodBeginDate between '2013-01-01' and '2013-01-31';
    +----------+
    | count(*) |
    +----------+
    | 76079 |
    +----------+
    *** Avg_row_length 942


    So the stats for a average 30 days period is:
    3120 x 89 = 277689 bytes ~ 0.26482 MB
    2232 x 139 = 310248 bytes ~ 0.29588 MB
    116688 x 912 = 106419456 bytes ~ 101.4895 MB
    76079 x 942 = 71666418 bytes ~ 69.34642 MB

    Total in MB for One Month Average is 171.39662 MB

    then times say 2 years of data would be:
    171.39662 MB x 24 = 4113.51888 MB

    now that all adds up if it was just in FOUR tables, but it would be best to partion the four tables via RANGE and then reference a column that is a DATE and break the partitions down into months.

    Was my above syntax correct?
    Last edited by ByGoneYrs; February 15th, 2013 at 09:02 AM.
  6. #4
  7. No Profile Picture
    Problem Solver
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Jan 2001
    Location
    Stockholm, Sweden
    Posts
    4,495
    Rep Power
    534
    Originally Posted by ByGoneYrs
    mysql> select count(*) from ClosedPeriodPositions where PeriodBeginDate between '2013-01-01' and '2013-01-31';
    +----------+
    | count(*) |
    +----------+
    | 11668 |
    +----------+
    *** Avg_row_length 912

    ...

    Since the count above was 11668 it looks like you got this 10x too high:
    116688 x 912 = 106419456 bytes ~ 101.4895 MB
    Which means that total should be about 80MB per month instead.

    Then times say 2 years of data would be:
    80MB x 24 = 1920MB

    Which means that two years of database can fit in the RAM of my laptop and the biggest table will contain only 200,000 rows.
    Which in my opinion is too small to start using Partitioning.
    It only makes the solution more complex and really no benefit.


    But another suggestions if you haven't already done so use the innodb_file_per_table option. That way you can reclaim space with just a simple ALTER TABLE foo ENGINE=InnoDB; a very good best practice.

    And regarding your question:
    Next under this setup what happens if someone does a query from a specific partition say feb2012 which would have any entry from date 2012-02-01 till 2012-02-28. Now say they only wanted from the specific date of 2012-02-15, would they just get entrys from just that specific date of the 15th or would they get all entrys from within the partition from the 1st through the 28th?
    The query always returns the correct result that you queried for regardless of if the table is partitioned or not, the only difference is how it is executed in the background. In a partitioned table when you query on the partition key the database is able to determine by the partition key which partition it should examine and searches only that file for the matching records.

    But this is the special case, so unless you basically always search on the partition key or your file sizes are limiting then you don't really benefit from partitioning.

    The best of luck to you
    /Stefan
  8. #5
  9. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2012
    Location
    Southern New Jersey, USA
    Posts
    205
    Rep Power
    102
    Originally Posted by sr
    Which means that total should be about 80MB per month instead.

    Then times say 2 years of data would be:
    80MB x 24 = 1920MB

    Which means that two years of database can fit in the RAM of my laptop and the biggest table will contain only 200,000 rows.
    Which in my opinion is too small to start using Partitioning.
    It only makes the solution more complex and really no benefit.


    But another suggestions if you haven't already done so use the innodb_file_per_table option. That way you can reclaim space with just a simple ALTER TABLE foo ENGINE=InnoDB; a very good best practice.

    And regarding your question:

    The query always returns the correct result that you queried for regardless of if the table is partitioned or not, the only difference is how it is executed in the background. In a partitioned table when you query on the partition key the database is able to determine by the partition key which partition it should examine and searches only that file for the matching records.

    But this is the special case, so unless you basically always search on the partition key or your file sizes are limiting then you don't really benefit from partitioning.

    The best of luck to you
    That is the current data size, BUT the company intends in adding alot more different Bank's info into those four tables and the size will BLOOM if you understand what I am talking about. Currently it is only 5 banks, and there are like 100 more. Also realize as well, there are hundreds of other tables of all sizes in that same database as well too. These four tables will be used for reporting source base.
    Last edited by ByGoneYrs; February 15th, 2013 at 11:08 AM.
  10. #6
  11. No Profile Picture
    Problem Solver
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Jan 2001
    Location
    Stockholm, Sweden
    Posts
    4,495
    Rep Power
    534
    Originally Posted by ByGoneYrs
    That is the current data size, BUT the company intends in adding alot more different Bank's info into those four tables and the size will BLOOM if you understand what I am talking about.
    I wish I had had a penny for every time I've heard that.

    Originally Posted by ByGoneYrs
    Also realize as well, there are hundreds of other tables of all sizes in that same database as well too.
    This has no impact whatsoever on the decision if you should partition these four tables or not.
    It only tells you that your server could need more CPU/RAM/disk space than if it was only these four tables.

    The thing is if you know _guaranteed_ that it will "bloom" to about 10 times the size of what we have discussed previously within the next year (i.e get up to 50 banks within a year), then yes I would probably consider looking at it and see if it could benefit from partitioning.
    But I kind of speculate a bit about the past growth rate when I see that you are still running 5.0.51. :P

    Summary:
    1. I definitely think you should upgrade MySQL from the antique 5.0.51 to 5.6.10 if this is a good time to do it.
    2. While doing that watch out for the change in precedence of the comma join syntax which took place in 5.1 (in case you have any of these odd "SELECT ... FROM a,b LEFT JOIN c ON ..." in your application).
    3. But I would wait with the partitioning until the size has grown bigger, you can always do it later when you actually need it.

    Just my 2c
    Take care
    /Stefan
  12. #7
  13. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2012
    Location
    Southern New Jersey, USA
    Posts
    205
    Rep Power
    102
    Ok on a couple different issues...

    Currently we are using Community Ed 5.0.51a...but if we decide to stay with MySQL we would up grade to MySQL 5.6.10 or above. Now my DEV team wants to move to MariaDB 10.1 and obtain support for MariaDB from a company called SkySQL.

    Now the other side of the house(where I am sitting myself) wants to move to Oracle Standard 11.2.0.3 datbases for production and staging with a support contract from Oracle, and I further would like to use PostgreSQL 9.3 and buy the added code extensions and support for PostgreSQl from EnterpriseDB. Their add on add both Materialized views and Standby database to postgreSQL and support as well too. for Dev/TEST. This way we could use DEV/TEST with Postgresql, then try the code in Staging which would be Oracle, and then when all checks out roll into Production which is Oracle.

    Everything I have seen and heard from Oracle insiders, they are trying to make MySQL into Oracle Light, with eventually changing it enough to be like Oracle Standard and then just kill off MySQL. If we were to remain, I would rather switch to MariaDB then to stay on MySQL.

    We are currently changing our whole Infrastructure and one that get ironed out, then the database side of it will be resolved next.

IMN logo majestic logo threadwatch logo seochat tools logo