November 13th, 2012, 11:05 AM
I'm working on a geolocation system and I'm having a little bit of difficulty getting the performance I want. I'm running PostgreSQL 9.2.1 and I have 3 tables.
'location' has a few columns but the only two I reference in my query are 'locid' and 'country'. locid is the primary key, bigint, and country is a varchar(30) although all countries are expressed in their ISO 3166-1 alpha-2 2 character codes.
'ipblocks' has three columns. 'locid', 'startipnum' and 'endipnum'. 'locid' is a bigint, 'startipnum' and 'endipnum' are inet.
'src_addr' has a few columns as well. The ones I use are 'srcaddr' which is of type inet.
When I run my query for small countries it doesn't take a whole lot of time. When I get into the bigger countries - the ones with more ip blocks and location codes - the query takes longer. Italy, for example, takes 2 minutes to run. I'm looking to get all countries down to ~1 second or less.
My first idea to do this is to add the locid to the srcaddr table which would allow me to get rid of the sub queries to link the 'src_addr.srcaddr' to the 'ipblocks' which then gives me a link to location. A colleague says this would be a last resort since it would break referential integrity, and I don't want to offend him by asking him which referential integrity would be broken by adding a foreign key to linked data so I though maybe I would ask you guys.
Please note that I am aware there is more referential integrity issues here than me wanting to add a foreign key to location table from the src_addr table. IE lack of a primary key in ipblocks. I'm picking up where my predecesor left off.
November 14th, 2012, 09:43 AM
Okay, I know I mentioned that I use PostgreSQL 9.2.1 but this is a normalization question. Not a Postgres question. Doesn't it more qualify under generic database management rather than PostgreSQL?