February 24th, 2011, 12:27 AM
Limitations of using XML as a database
I am developing an application that uses an XML file as its database. I want to know the limitations of an XML file when it is used as a database.
February 24th, 2011, 08:03 AM
XML is a terrible format to use as a database. It is really only suitable for read-only information; once you start writing to an XML file you will run into massive performance and consistency problems.
The performance problems are the biggest concern.
First, XML is an extremely verbose format, so every piece of data you store carries a lot of overhead. This is not a huge concern for a small configuration file, but for an actual database it is a major problem. Say you have a field named "is_active" in a database table with 1000 rows. With XML you have to repeat that column name not once but twice for every row, in the opening and closing tags. That's 1000*9*2 = 18,000 bytes of overhead for the name alone (23,000 bytes once you count the angle brackets and the slash), just for a single column in a table with only 1000 rows.
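The overhead arithmetic can be checked with a quick sketch. This assumes each value is stored as its own element (for example <is_active>1</is_active>) in a single-byte encoding, and the variable names are only illustrative. Counting just the column name gives the 18,000 figure; counting the full tags, including the angle brackets and the closing slash, gives a little more.

```python
# Assumed layout: one element per value, e.g. <is_active>1</is_active>
column = "is_active"
rows = 1000

open_tag = f"<{column}>"    # name plus two angle brackets
close_tag = f"</{column}>"  # name plus two angle brackets and a slash

# Overhead if only the repeated column name is counted (9 bytes, twice per row)
name_only_overhead = len(column) * 2 * rows

# Overhead if the complete opening and closing tags are counted
full_tag_overhead = (len(open_tag) + len(close_tag)) * rows

# Actual data stored, assuming a one-character value ("0" or "1") per row
payload = 1 * rows

print(name_only_overhead, full_tag_overhead, payload)
```

Under these assumptions the markup is more than twenty times the size of the data it wraps.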
Second, assuming you're storing your XML file on the filesystem, only one user can write to the database at a time. This is a massive problem for multi-user systems where more than one user may want to write to the database at once (very common, especially in web applications). If one user is writing to the database all other users will be forced to wait for them to finish before anyone else can write to it.
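To see why writers have to take turns, here is a minimal sketch of what can happen when they don't. It assumes a tiny hypothetical file, db.xml, and two users who each load it, change a different record, and save. The helper name activate is made up for the example. Because every save rewrites the whole file, the second save silently discards the first user's change.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical two-record "database" written to a temporary directory
path = os.path.join(tempfile.mkdtemp(), "db.xml")
with open(path, "w", encoding="utf-8") as f:
    f.write(
        "<users>"
        "<user><id>1</id><is_active>0</is_active></user>"
        "<user><id>2</id><is_active>0</is_active></user>"
        "</users>"
    )

def activate(tree, user_id):
    """Set is_active to 1 for the given user in an already-loaded tree."""
    for user in tree.getroot().iter("user"):
        if user.findtext("id") == str(user_id):
            user.find("is_active").text = "1"

# Both users read the same snapshot before either one writes
tree_a = ET.parse(path)
tree_b = ET.parse(path)

activate(tree_a, 1)  # user A activates record 1
activate(tree_b, 2)  # user B activates record 2

tree_a.write(path)   # A rewrites the entire file
tree_b.write(path)   # B rewrites it again from a stale snapshot

# Read back what actually ended up on disk
final = {
    u.findtext("id"): u.findtext("is_active")
    for u in ET.parse(path).getroot().iter("user")
}
print(final)
```

The only way to avoid this with a plain file is to lock it for the whole read-modify-write cycle, which is exactly the "everyone else waits" behavior described above.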
Third, XML is a very expensive format to parse and read. Again, this is not a problem for small configuration files, but for a large database where you need to search for information or retrieve specific records it is a massive problem. XML is also a very linear format: you can't easily or neatly perform a binary search on it, so lookups degrade to scanning the file from the start. In practice you'll need to read the entire database into memory to query it effectively, which is a very bad idea.
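As a rough illustration of the read cost, here is a sketch of a single-record lookup using Python's standard xml.etree.ElementTree module. The document layout and the helper names build_xml and find_user are assumptions made up for the example. Note that the whole document is parsed into memory first, and the record is then found by walking the elements one by one.

```python
import xml.etree.ElementTree as ET

def build_xml(n):
    """Build a hypothetical XML 'table' of n user records as a string."""
    root = ET.Element("users")
    for i in range(n):
        user = ET.SubElement(root, "user")
        ET.SubElement(user, "id").text = str(i)
        ET.SubElement(user, "is_active").text = "1" if i % 2 == 0 else "0"
    return ET.tostring(root, encoding="unicode")

def find_user(xml_text, user_id):
    """Look up one record: parse everything, then scan linearly."""
    root = ET.fromstring(xml_text)      # the entire document is parsed into memory
    for user in root.iter("user"):      # no index, so each record is checked in turn
        if user.findtext("id") == str(user_id):
            return user
    return None

doc = build_xml(1000)
hit = find_user(doc, 999)   # worst case: the last record in the file
miss = find_user(doc, 5000) # a missing id still costs a full parse and scan
```

Every lookup pays for the full parse and, on average, a scan of half the records, so the cost grows with the size of the file and not with the size of the result.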
There are a lot of other reasons too, but those are the major three: XML has terrible overhead characteristics, terrible multi-user write performance, and terrible read performance (all characteristics which are critically important for a database).