1. Join Date
    Aug 2001
    Rep Power

    DB design - flat files or in a table

    ok, first an explanation of the project:

    the client is a utilities company, and they will be providing the application with CSV files that correlate to usage data for a group of meters over a certain time period. the CSV file will have 2 dates, a start date and an end date .

    here's the tricky part.
    CSV file #1 can have a start date of Feb 1 and an end date of Feb 30
    CSV file #2 can have a start date of Feb 1 and an end date of Feb 15
    CSV file #3 can have a start date of Feb 1 and an end date of Feb 30 (same as CSV file #1), except certain data can be different for specific meters in that file.

    So simply importing the CSV files into a table by defining the starting and end dates won't work b/c of multiple data sets for the same bounding dates.

    Options i've considered:
    1. Creating another column in the table that is a csv_file_id. In other words, each file that's uploaded to the application gets assigned an id in another table, and this foreign key is also inputted into the meters table. So searches in the meters table will be done based on csv_file_id , not start / end dates.

    2. Create a table for each CSV file. While this will create more tables for MySQL to handle, in a 12-month period, there will be no more than 50 or so tables generated from the CSV files. This will dramatically cut down on the query time as more CSV files are entered, as each CSV file can have upwards of 1,500 lines.

    3. Only generate a temporary table for each CSV file at a time when it is called by the application to perform queries on it. This utilizes the same smaller-sized table as in #2, but the idea of generating a temporary table each time a query is called could become too taxing as the # of users of the application increases (the application will handle possibly 25 simultaneous users in the beginning, but that # may grow into the hundreds).

    I'm leaning towards #2, as 50 tables isn't that many for MySQL to handle, and the query time to pick out the required CSV file won't be necessary.

  2. #2
  3. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Toronto Canada
    Rep Power
    i like #1

    50 files of 1500 lines each is a trivially small table -- okay, medium-sized -- and with proper indexes, all the queries you can think of to run against it will perform fine

    dramatically fine


IMN logo majestic logo threadwatch logo seochat tools logo