September 11th, 2013, 05:55 AM
A big database +1million rows/day
i have to create a big database , and as i'm a beginner with mysql, i've created a modele but i'm not sure that it will be the best one for my queries.
so every day , i have to insert the content of 11 files in my DB, every file is associated to a node , and one platfom contains the 11 files.
all the files have the same structure , and every file have several rows with the same structure : Object: parameter1 = value1 , value2, value3,....;
one parameter have several values, we have an other file contains the pairs parameter/value .
so i created a static table ( descriptive_table) , that contains all the possible pairs parameter/value .
and for the 11 fichier, i created one table for 1 file, i could create one table for the 11 files , but i don't wanna have slow queries .
the 11 tables have a foreing key (id_ref) references ( descriptive_table)
my problem is that , i don't have a primary key for the 11 tables , each ligne is identified by all the columns, and i don't know how to choose the index, so i'am afraid i will have a slow queries .
CREATE TABLE IF NOT EXISTS `Description_table` (
`id_ref` int(30) NOT NULL AUTO_INCREMENT,
`object` varchar(30) NOT NULL,
`parameter` varchar(30) NOT NULL,
`userEngLab` varchar(30) NOT NULL,
`class` char(1) NOT NULL DEFAULT 'U',
`comment` tinytext NOT NULL,
PRIMARY KEY (`id_ref`),
KEY `object` (`object`,`class`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;
CREATE TABLE IF NOT EXISTS `aim` (
`platform` varchar(30) NOT NULL,
`value` varchar(100) NOT NULL,
`date` date NOT NULL,
`id_ref` int(30) NOT NULL,
KEY `date` (`date`),
KEY `platform` (`platform`),
KEY `id_ref` (`id_ref`),
KEY `value` (`value`),
KEY `value_2` (`value`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
ALTER TABLE `aim`
ADD CONSTRAINT `aim_ibfk_1` FOREIGN KEY (`id_ref`) REFERENCES `Description_table` (`id_ref`) ON DELETE CASCADE ON UPDATE CASCADE;
i hope you can help me.
thank you so much.
September 12th, 2013, 03:34 AM
September 16th, 2013, 08:01 AM
1. Do the data in the 11 files belong together?
As it sounds like the 11 files contains data that belongs together.
And are you going to perform any analysis on all this data?
If the answer is yes then I would suggest _not_ to separate them in different tables.
Placing data in different tables with the notion of getting performance gains can be very misleading.
If you are still going to select from all tables and analyze etc then the performance gain is basically nothing because the database server still has the same total amount of rows and same amount of MB to shuffle around to find a specific record.
Only thing you added is getting a more complex application.
id int not null primary key auto_increment
on the other table(s) as well even if you don't use it.
The reason is that InnoDB stores the data as an index oriented table based on the primary key.
And then it also uses the primary key to find the records when you use other indexes so other indexes can become very large because the "pointer" to the data in the table can become very large.
And inserts becomes very slow when the InnoDB engine can't write at the end of the tables (auto_increment) and instead needs to find an exact position somewhere in the index and shuffle around data to write the newly inserted record.
So basically always have a primary key as int even if you don't really need it.