November 27th, 2002, 08:37 AM
Benchmarking database frameworks
I've been asked to benchmark a number of data access frameworks: JDBC, SQLJ, and ADO.NET, amongst others.
Does anyone have any idea what would make an interesting schema to test them against? Better yet, what kinds of issues would make such tests interesting, yet impartial?
What kind of technical details should I keep an eye out for: DDL/DML issues, etc.?
What about URLs? Does anyone know of some literature I should take a look at to answer these questions? I'm not asking for a schema; I'd rather find out what's important in situations like this (I haven't done something like this before, unfortunately).
December 3rd, 2002, 06:22 AM
I have never had to do what you are trying to do, so this is just an opinion. Don't confuse this with advice based on experience.
I would test against my own schema. In other words, if I had a specific database that was going to be used after the testing, I would use that database. Why use some generic test database and hope that your test results carry over to the real database? Just use the real one and you won't have any nasty surprises.
There are so many parts to a high-performance database (disk speed, RAM, CPU speed, # of CPUs, indexes, network latency, connection pooling, client-side code vs. server-side code, etc.) that the method of connecting makes up a small portion of the overall picture. There are large databases that use ADO. There are large databases that use JDBC. You might want to give more consideration to what languages your programmers are proficient at to decide your method of connectivity.
Of course, if you're just testing this as a school assignment for purely academic purposes, then none of that matters. Sun has created a Pet Store database / program to show off the features of Java (this was not intended to be a benchmark). Microsoft has come out with a .NET version, the .NET Pet Shop. Microsoft also has a couple of sample databases and applications (Northwind Traders, etc.).
When testing, shut down the server between tests so you can be sure that any cached data has been cleared. Also, things tend to fail in an exponential rather than a linear fashion. Don't make the mistake of thinking that because you get a certain performance level with 25 connections and a gig of data, you can extrapolate what will happen at 250 connections with a terabyte of data.
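To make that last point concrete, here is a rough sketch in Java of measuring each load level directly instead of extrapolating from one. Everything in it is an assumption made for illustration: the class name LoadSweep, the client counts, and the placeholder task (a 1 ms sleep), which you would swap for one real query through whichever API you are testing. It does not restart the server or clear caches between levels; that step would still have to be done outside the program.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class LoadSweep {

    /**
     * Runs task perClient times on each of `clients` concurrent threads and
     * returns every per-call latency in nanoseconds, sorted ascending.
     */
    public static long[] measure(Runnable task, int clients, int perClient) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        try {
            List<Callable<long[]>> jobs = new ArrayList<>();
            for (int c = 0; c < clients; c++) {
                jobs.add(() -> {
                    long[] samples = new long[perClient];
                    for (int i = 0; i < perClient; i++) {
                        long start = System.nanoTime();
                        task.run();
                        samples[i] = System.nanoTime() - start;
                    }
                    return samples;
                });
            }
            long[] all = new long[clients * perClient];
            int pos = 0;
            // invokeAll blocks until every simulated client has finished.
            for (Future<long[]> f : pool.invokeAll(jobs)) {
                long[] s = f.get();
                System.arraycopy(s, 0, all, pos, s.length);
                pos += s.length;
            }
            Arrays.sort(all);
            return all;
        } finally {
            pool.shutdown();
        }
    }

    /** Nearest-rank percentile of an ascending-sorted array, with p in (0, 100]. */
    public static long percentile(long[] sorted, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    public static void main(String[] args) throws Exception {
        // Placeholder workload (assumption): replace with one real query.
        Runnable task = () -> {
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        // Measure each load level separately rather than scaling up one result.
        for (int clients : new int[] {1, 5, 25}) {
            long[] lat = measure(task, clients, 20);
            System.out.printf("clients=%d median=%dus p95=%dus%n",
                    clients, percentile(lat, 50) / 1000, percentile(lat, 95) / 1000);
        }
    }
}
```

Looking at the median and the 95th percentile side by side at each level is one way to spot the point where response times stop growing gradually and start to blow up. A single average would tend to hide that.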