I've been very pleased by the response to my article “Thinking XML: The
XML decade”. I
like to write two types of articles: straightforward expositions of
running code, and pieces providing more general perspective on some
aspect of technology. I've felt some pressure in the publishing world
lately to dispense with some of the latter in favor of puff pieces for
the hottest technology fad; right now, having "SOA", "AJAX" or "podcast"
in the title is the formula for selling a book or article. I've been
lucky throughout my career to build relationships with editors who trust
my judgment, even if my writing is so often out of the mainstream. As
such, whenever an article with no obvious mass appeal strikes a chord
and provokes intelligent response, I welcome it as more ammunition for
following the road less trampled in my future writing.
Karl Dubost of the W3C especially
appreciated my effort
to bring perspective to the inevitable tensions that emerged as XML took
hold in various communities.
Uche Ogbuji has written an excellent article The XML Decade. The
author is going through XML development history as well as tensions
between technological choices in XML communities. One of the fundamental
choices of creating XML was to remove the data jail created by some
applications or programming languages.
Mike Champion, with whom I've often jousted (we have very different
ideas of what's practical) was very kind in "People are reflecting on
XML after 10
years".
A more balanced assessment of the special [IBM Systems Journal issue]
is from Uche Ogbuji. There is quite a nice summary of the very
different points of view about what XML is good for and how it can be
used. He reminds us that today's blog debates about
simplicity/complexity, tight/loose coupling, static/dynamic typing, etc.
reflect debates that go back to the very beginning. I particularly like
his pushback on one article's assertion that XML leverages the value of
"information hiding" in OO design.
It was a really big leap for me from OO (and structured
programming/abstract data type) orthodoxy to embrace of XML's open data
philosophy (more on that in “Objects. Encapsulation.
XML?”). It did help that
I'd watched application interface wars from numerous angles in my
career: RPC granularity, mixins and implementation versus interface
inheritance, SQL/C interface coupling, etc. It started to become apparent
to me that something was wrong when we were translating natural business
domain problems into such patently artificial forms. RDBMS purists have
been making a similar point for ages, but in my opinion, they just want
to replace the artifice of N-tier applications with their own brand of
artifice. XML is far from some magic mix for natural expression of the
business domain, but I believe that XML artifacts tend to be
fundamentally more transparent than other approaches to computing. In
my experience, I've found that it's easier to maintain even a
poorly-designed XML-driven system than a well-designed system where the
programming is preeminent.
Mike's entry goes on to analyze the usefulness, in perspective, of the
10 guiding principles for XML. When he speaks of "a more balanced view"
he's contrasting the SlashDot thread on the
article, which
is mostly filled with the sort of half-educated nonsense that drove me
from that site a few years ago (these days I find the most respectable
discussion on reddit.com). Poor Liam Quinn expended
an awful lot of patience on a gang of inveterate flame-throwers. Besides his calm
explanations the best bits in that thread were on HL7 and ASN.1.
Supposedly the new version 3 [HL7] standard (which uses the "modeling
approach") will be much more firm with the implementors, which will
hopefully mean that every now and then one implementation will actually
be compatible with another implementation. I've looked over their
"models" and they've modelled a lot of the business use-case stuff for
patient data, but not a lot of the actual data itself. Hopefully when
it's done, it'll come out a bit better baked than previous versions.
That does not sound encouraging. I know Chimezie, my brother and Copia
co-conspirator, is doing some really exciting work on patient data
records. More RDF than XML, but I know he has a good head for the
structured/unstructured data continuum, so I hope his work propagates
more widely than just the Cleveland Clinic Foundation.
J. Andrew Rogers made a point that Simon St.Laurent and I (among
others) have often made: in many cases people press XML into service
where something more suitable, such as ASN.1, would serve better.
The "slow processing" is caused by more than taking a lot of space.
XML is basically a document markup but is frequently and regularly used as
a wire protocol, which has very different design requirements if you
want a good standard. And in fact we already have a good standard for
this kind of thing called "ASN.1", which was actually engineered to be
extremely efficient as a wire protocol standard. (There is also an ITU
standard for encoding XML as ASN.1 called XER, which solves many of the
performance problems.)
Of course, I think he goes a bit too far.
The only real advantage XML has is that it is (sort of) human
readable. Raw TLV formatted documents are a bit opaque, but they can be
trivially converted into an XML-like format with no loss (and back)
without giving software parsers headaches. There is buckets of irony
that the deficiencies of XML are being fixed by essentially converting
it to ASN.1 style formats so that machines can parse them with maximum
efficiency. Yet another case of computer science history repeating
itself. XML is not useful for much more than a presentation layer, and
the fact that it is often treated as far more is ridiculous.
I'd actually argue that XML is suited for a (semi-structured) model
layer, not a presentation layer. For one thing, wire efficiency often
counts in presentation as well. But his essential point is correct that
XML is an awful substitute for ASN.1 as a wire protocol. By the same
token, the Web services stack is an awful substitute for CORBA/OMA and
even Microsoft's answers to same. It seems the industry is slowly
beginning to realize this. I love all the many articles I see with
titles such as "We know your SOA investment is stuck firmly in the
toilet, but honest, believe us, there is an effective way to use this
stuff. Really."
Anyway later on in that sub-thread:
The company I work for has had a lot of success with XML, and are
planning to move the internal data structure for our application from
maps to XML. There is one simple reason for our success with it: XSLT. A
customer asks for output in a specific format? Write a template. Want to
display the data on a web page? Write a template that converts to HTML.
Want to print to PDF? Write a template that converts to XSL, and use one
of many available XSL->PDF processors. Want to use PDF forms to input
data? Write a template to convert XFDF to our format. Want to import
data from a competitor and steal their customer? You get the picture.
Bingo! The secret to XML's value is transformation. Pass it on.
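The workflow the commenter describes, one canonical source vocabulary with a separate template per output format, is easy to sketch. XSLT is the tool the commenter credits; as a rough stand-in using only Python's standard library, here is the same pattern applied to a hypothetical customer-record vocabulary (the element names and sample data are invented for illustration), with one "template" function per target format:

```python
# Sketch of the "one source vocabulary, one template per output" pattern
# from the quote above. The <customers> vocabulary and sample data are
# hypothetical; the commenter used XSLT stylesheets, but the same idea
# is shown here with standard-library ElementTree.
import xml.etree.ElementTree as ET

SOURCE = """\
<customers>
  <customer id="c1"><name>Acme Corp</name><city>Denver</city></customer>
  <customer id="c2"><name>Globex</name><city>Boulder</city></customer>
</customers>"""

def to_html(xml_text: str) -> str:
    """One 'template': render the customer records as an HTML table."""
    root = ET.fromstring(xml_text)
    rows = [
        "<tr><td>{}</td><td>{}</td></tr>".format(
            c.findtext("name"), c.findtext("city"))
        for c in root.iter("customer")
    ]
    return "<table>\n" + "\n".join(rows) + "\n</table>"

def to_csv(xml_text: str) -> str:
    """A second 'template': the same source, a different output format."""
    root = ET.fromstring(xml_text)
    return "\n".join(
        "{},{}".format(c.findtext("name"), c.findtext("city"))
        for c in root.iter("customer"))

print(to_csv(SOURCE))
```

A new output requirement means writing one new function against the same source vocabulary, never touching the upstream data model, which is exactly the leverage the commenter attributes to XSLT.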
In "XML at 10: Big or
Little?" Mark
writes:
What the article ultimately ends up being about is the "Big" idea of
XML vs. the oftentimes "Little" implementation of it. The Big idea is
that XML can be used in a bottom-up fashion to model the grammar of a
particular problem domain in an application- and context-independent
manner. The little implementation is when XML is essentially used as a
more verbose protocol for data interchange between existing
applications. I would guess that is 90+ percent of what it is currently
used for.
He's right, and I think this overuse is one of the reasons XML so often
elicits reflex
hostility. Then
again, anything that doesn't elicit such hostility in some quarters is,
of course, entirely irrelevant. I think it's a good thing that cannot
be said of XML.
[Uche Ogbuji]