Every time we hear the exclamation “we’re going to use XML,” a small shiver runs through our collective technical spines. What exactly did the speaker mean by this statement? Did they mean they planned on using a well-known XML-based vocabulary or language to store their data? Did they mean they were going to define their own XML-based language to store or exchange data? Or were they going to apply XML syntax rules ad hoc to undocumented collections of tags with no specification? Unfortunately, the third case is very likely the one they meant.
If you don’t follow XMLese, what we’re saying here is that the “XML fan” most likely plans on inventing a set of tags like <qty> and so on and storing their data in them. For example,
Now, there seems to be nothing wrong with this. The well-formedness rules of XML will force us to nest our tags properly, case them properly, use quotes appropriately, remember all the close tags, and do the various other little markup things that browsers tend to correct for us. That’s all well and good; well-formedness rules are great for enforcing syntax, but what about semantics?
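To make the gap between syntax and semantics concrete, here is a minimal sketch using Python’s standard xml.etree.ElementTree parser. Only <qty> comes from the discussion above; the <order>, <item>, and <name> tags, the sample documents, and both helper functions are our own hypothetical inventions for illustration. The point is simply that a parser will happily accept a document whose content makes no sense to the application.

```python
import xml.etree.ElementTree as ET

# Hypothetical ad hoc vocabulary: only <qty> appears in the text above;
# <order>, <item>, and <name> are invented for this illustration.
WELL_FORMED = "<order><item><name>Widget</name><qty>banana</qty></item></order>"
NOT_WELL_FORMED = "<order><item><qty>3</item></qty></order>"  # improperly nested


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def quantities_are_integers(text: str) -> bool:
    """Check an application-level rule that the XML parser knows nothing about."""
    root = ET.fromstring(text)
    return all((q.text or "").strip().isdigit() for q in root.iter("qty"))


if __name__ == "__main__":
    print(is_well_formed(NOT_WELL_FORMED))       # the parser rejects bad nesting
    print(is_well_formed(WELL_FORMED))           # the parser accepts this document
    print(quantities_are_integers(WELL_FORMED))  # yet "banana" is not a quantity
```

The parser catches the bad nesting in the second document, but nothing stops a quantity of “banana” in the first. That rule lives only in our hand-written check, which is exactly the kind of undocumented knowledge an ad hoc tag set depends on.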
What is the specification of this language? What tags are allowed? What should they contain? What do they mean? With such a simple language, this isn’t too difficult to deal with, but what happens when life gets a bit more complicated? Suddenly we are adding in
Don’t get us wrong: XML is a tremendous technology, but it seems doomed to repeat all the failings of its predecessor, SGML, and then some, given its wider use. Yet we do think it will ultimately succeed, and in data interchange and storage projects you really must consider XML data formats; they are the way to go. XML can be the “concrete” foundation of your site or application, but just as with that useful substance, you can create quite a monstrous mess that is hard to deal with if you misuse it.