November 6th, 2003, 02:10 PM
why not XML
I know there are some darn smart cats around here, so I wanted to run a thought experiment by you guys and see what kind of feedback I can get. After viewing some of the backend code for PostgreSQL, and studying RDBMSs in general, I am beginning to wonder about the organization of data in the backend. I am mostly curious what the ramifications of choosing XML as a data representation would be. How would this affect indexing structures, evaluation of relational operators, and so forth? Is this a case where representing data at this level of abstraction causes too much of a performance hit to make it possible? I believe there are query languages available for XML, like XQL, and there are many, many APIs for dealing with XML data, so what makes current approaches more viable than something along these lines? I have looked into OODBMSs somewhat in the past, but I find the ideas confusing, too much of a blend of concepts. For instance, OODBMSs based on Java seem so incredibly bloated and unreasonable that I find myself saying: just build yourself an XML-style data store, and use the Java APIs for XML handling (JAXB, JAXP, etc.) to reach the same end goal without all the weird abstractions.
November 6th, 2003, 02:57 PM
Argghh... don't get me started!
XML as a representation should have nothing to do with the backend. The real question is: how exactly do you want to logically query your data? Now, if you want to argue for XML as a native storage mechanism, that is essentially just a regression to the older-style hierarchical database systems which were thrown away in the 80s. Hierarchies sound nice and intuitive, until you actually try to create large, complex, changeable systems, and then you realize they are a nightmare.
One of the major points of the relational database management system was to completely separate physical storage/retrieval implementation from the logical operations developers and users want to perform with the data.
As to the efficiency of XML as a storage mechanism, personally I can't think of anything worse. Read further down in this Joel on Software article.
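To make the efficiency point concrete, here is a minimal Python sketch. It is not a benchmark and not how PostgreSQL actually lays out rows. It assumes a made-up table of (id, balance) rows, and the names `fetch_fixed` and `fetch_xml` are hypothetical. With fixed-width records, row n sits at a byte offset you can compute; in an XML document, the parser has to read every earlier element before it reaches row n.

```python
import io
import struct
import xml.etree.ElementTree as ET

# Hypothetical data: 1000 rows of (id, balance).
rows = [(i, i * 10) for i in range(1000)]

# Fixed-width layout: two 4-byte little-endian integers per row.
RECORD = struct.Struct("<ii")
binary_store = io.BytesIO(b"".join(RECORD.pack(*r) for r in rows))

# The same rows serialized as an XML document.
xml_bytes = (
    "<rows>"
    + "".join(f"<row><id>{i}</id><balance>{b}</balance></row>" for i, b in rows)
    + "</rows>"
).encode("utf-8")


def fetch_fixed(n):
    """Jump straight to row n: its offset is n * record size."""
    binary_store.seek(n * RECORD.size)
    return RECORD.unpack(binary_store.read(RECORD.size))


def fetch_xml(n):
    """Row n has no computable offset, so parse rows in order until we reach it."""
    seen = 0
    # iterparse yields an "end" event each time an element is fully parsed.
    for _event, elem in ET.iterparse(io.BytesIO(xml_bytes)):
        if elem.tag == "row":
            if seen == n:
                return int(elem.findtext("id")), int(elem.findtext("balance"))
            seen += 1
    raise IndexError(n)
```

Both functions return the same row, but `fetch_xml` walks through every row ahead of the one requested, while `fetch_fixed` performs a single seek and read.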
Now, maybe I misunderstand what you are getting at. Were you saying that at the core data storage level, you think PostgreSQL might benefit from storage in an XML format, or that you as the developer, not caring about the machine storage, would just prefer to receive your data in an XML format?
It's good to see someone interested in getting a grasp on the internals of database management, etc... Here is the best tip I can give. If you really want to challenge your thinking in this area, read this book by C.J. Date, one of the founding thinkers in the database industry. Then, spend some time at his website, www.thethirdmanifesto.com.
Also, great series of articles by Fabian Pascal:
Not that I am saying we should do away with XML... I think the best use for XML is just as it was originally intended: as a document mark-up system (XUL, anyone?). But XML as DBMS is just finding a way to re-create problems that have already been solved.
November 6th, 2003, 03:43 PM
Hahahah.... sorry I'm not laughing at metaBarf, but I just knew that this thread would evoke that response from rycamor.
PostgreSQL, it's what's for dinner...
November 6th, 2003, 07:43 PM
Thanks much, I appreciate having a perspective from someone more familiar with the ideas.
This is more along the lines of what I was thinking, but I was also wondering how you would get from SQL, where people have taken the time to work out algorithms for a wide variety of evaluation strategies for combinations of relational algebra operators and the different ways they can be implemented (JOIN syntax is an example, I guess?), to being able to quickly traverse a "representation" of the data as XML. What made me think of it is the ease with which other languages are able to deal with XML documents, mostly because I have been looking at trying to use XML in more ways myself, and it seems very intuitive to me for a lot of projects (currently we are doing it with data handling). I had a very hard time trying to convince the people at my work to take the time to use XML for their project, and they went with LaTeX instead. They wanted to move all of their documents from Word to LaTeX, and they struggled with it much more than I did; in the same amount of time I got to the same point they did: being able to take a document, break it down, then turn it into a format that allows for symbolic associations (the main benefit would be editing only portions of the system and having the changes applied throughout).
From the very beginning I was thinking about how I was going to break down the docs to the point where I could store them in Postgres, and then rebuild them with a set of keys unique to each "document". Now that I got distracted from that for so long and learned a bunch about using XML and all the nice APIs for it, I began to wonder more about how it could be applied in other areas, similar to how I heard they are working on making a desktop environment based on XML.
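The break-down-and-rebuild idea can be sketched relationally. This is only an illustration under assumptions: SQLite stands in for Postgres so the example is self-contained, and the table names (`fragment`, `doc_fragment`), the document keys, and the `rebuild` function are all hypothetical. Each document is an ordered list of references to shared fragments, so editing one fragment changes every document that uses it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fragment (
        id   INTEGER PRIMARY KEY,
        body TEXT NOT NULL
    );
    -- One row per (document, position); fragments may be shared across documents.
    CREATE TABLE doc_fragment (
        doc_key     TEXT    NOT NULL,
        position    INTEGER NOT NULL,
        fragment_id INTEGER NOT NULL REFERENCES fragment(id),
        PRIMARY KEY (doc_key, position)
    );
""")
conn.executemany(
    "INSERT INTO fragment (id, body) VALUES (?, ?)",
    [
        (1, "Safety notice: wear goggles."),
        (2, "Procedure A steps."),
        (3, "Procedure B steps."),
    ],
)
conn.executemany(
    "INSERT INTO doc_fragment (doc_key, position, fragment_id) VALUES (?, ?, ?)",
    [
        ("manual-a", 1, 1), ("manual-a", 2, 2),
        ("manual-b", 1, 1), ("manual-b", 2, 3),
    ],
)


def rebuild(doc_key):
    """Reassemble a document from its fragments, in position order."""
    rows = conn.execute(
        """
        SELECT f.body
        FROM doc_fragment AS d
        JOIN fragment AS f ON f.id = d.fragment_id
        WHERE d.doc_key = ?
        ORDER BY d.position
        """,
        (doc_key,),
    )
    return "\n".join(body for (body,) in rows)


# Edit the shared fragment once; both manuals pick up the change.
conn.execute(
    "UPDATE fragment SET body = ? WHERE id = 1",
    ("Safety notice: wear goggles and gloves.",),
)
```

After the update, `rebuild("manual-a")` and `rebuild("manual-b")` both begin with the revised safety notice, which is the "edit a portion, have the change applied throughout" benefit described above.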
I very much appreciate the references, and I hope I will soon be able to afford some of those texts. My last semester of school is coming up, so those books will probably cost a pretty penny, but I am going to try to get them used so I can get more outside resources.
November 6th, 2003, 07:48 PM
Well, the real question is simply how to query a relational database to receive and store hierarchical data. Actually, the relational model can handle hierarchical data just fine, but it takes a different way of thinking. I don't have time to discuss it now, but search the database forums here and you will find some examples.
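One common way to hold a hierarchy in relational tables is an adjacency list, where each row points at its parent, queried with a recursive common table expression. The sketch below is an assumption about what such an example might look like, not necessarily the approach the forum examples use. SQLite stands in for PostgreSQL so the code is self-contained, and the `node` table, its sample rows, and the `subtree` function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE node (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES node(id),  -- NULL marks the root
        name      TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO node (id, parent_id, name) VALUES (?, ?, ?)",
    [
        (1, None, "manual"),
        (2, 1, "chapter 1"),
        (3, 1, "chapter 2"),
        (4, 2, "section 1.1"),
        (5, 2, "section 1.2"),
        (6, 3, "section 2.1"),
    ],
)


def subtree(root_id):
    """Return (depth, name) for root_id and every descendant beneath it."""
    return conn.execute(
        """
        WITH RECURSIVE tree(id, name, depth) AS (
            -- Anchor: the starting node.
            SELECT id, name, 0 FROM node WHERE id = ?
            UNION ALL
            -- Recursive step: children of rows already in the result.
            SELECT n.id, n.name, t.depth + 1
            FROM node AS n
            JOIN tree AS t ON n.parent_id = t.id
        )
        SELECT depth, name FROM tree ORDER BY depth, id
        """,
        (root_id,),
    ).fetchall()
```

Calling `subtree(2)` returns chapter 1 and its two sections. The hierarchy is just data in ordinary rows, so the same table can still be joined, indexed, and restructured like any other relation.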
November 8th, 2003, 06:31 PM