XML 2006 Synopsis: Are we there yet?

Well, XML 2006 came and went with a rather busy bang. My presentation on using XSLT to generate Xforms (from XUL/XHTML) was well attended and I hoped it helped increase awareness on the importance and value of XForms, (perhaps) the only comprehensive vehicle by which XML can be brought to the web in the way proponents of XML have had in mind for some time. As Simon puts it:

XML pretty (much) completely missed its original target market. SGML culture and web developer culture seemed like a poor fit on many levels, and I can't say I even remember a concerted effort to explain what XML might mean to web developers, or to ask them whether this new vision of the Web had much relationship to what they were doing or what they had planned. SGML/XML culture and web culture never really meshed.

Most of the questions I received had to do with our particular choice of FormsPlayer (an Internet Explorer plugin) instead of other alternatives such as Orbeon, Mozilla, Chiba, etc. This was a bit unfortunate and an indication of a much larger problem in this particular area of innovation we lovingly coin 'web 2.0'. I will get back to this later.

I was glad to hear John Boyer tell me he was pleasantly surprised to see mention of the Rich Web Application Backplane W3C Note. Mark Birbeck and Micah Dubinko (fellow XForms gurus and visionaries in their own rights) didn't let this pass over their radar, either.

I believe the vision outlined in that note is much more lucid than a lot of the hype-centered notions of 'web 2.0' which seem more focused on painting a picture of scattered buzzwords ('mash-ups', AJAX etc..) than commonalities between concrete architectures.

Though this architectural style accommodates solutions based on scripting (AJAX) as well as more declarative approaches, I believe the primary value is in freeing web developers from the 80% of scripting that is a result of not having an alternative (READ: browser vendor monopolies) than being the appropriate solution for the job. I've jousted with Kurt Kagle before on this topic and Mark Birkeck has written extensively on this as well.

In writing the presentation, I sort of stumbled upon some interesting observations about XUL and XForms:

  • XUL relies on a static, inarticulate means of binding components to their behavior
  • XForms relies on XPath for doing the same
  • XUL relies completely on javascript to define the behavior of it's widgets / components
  • A more complete mapping from XUL to XForms (than the one I composed for my talk) could be valuable to those more familiar with XUL as a bridge to XForms.

At the very least, it was a great way to familiarize myself with XUL.

In all, I left Boston feeling like I had experienced a very subtle anti-climax as far as innovation was concerned.
If I were to plot a graph of innovative progression over time, it would seem to me that the XML space has plateaued as of late and political in-fighting and spec proliferation has overtaken truly innovative ideas. I asked Harry Halpin about this and his take on it was that perhaps "XML has won". I think there is some truth to this, though I don't think XML has necessarily made the advances that were hoped in the web space (as Simon St. Laurent put it earlier).

There were a few exceptions however

XML Pipelines

I really enjoyed Norm Walsh's presentation on XProc and it was an example of scratching a very real itch: consensus on a vocabulary for XML processing workflows. Though, ironically, it probably wouldn't take much to implement in 4Suite as support for most (if not all) of the pipeline operations are already there.

I did ask Norm if XProc would support setting up XPath variables for operations that relied on them and was pleased to hear that they had that in mind. I also asked about support for non-standard XML operations such as XUpdate and was also pleased to hear that they had that covered as well. It was worth noting that XUpdate by itself could make the viewport operation rather redudant.

The Semantic Web Contingent

There was noticeable representation by semantic web enthusiasts (myself, Harry Halpin, Bob Ducharm, Norm Walsh, Elias Torres, Eric Prud'hommeux, Ralph Hodgson, etc..) and their presentations had somewhat subdued tones (perhaps) so as not to incite ravenous bickering from narrow-minded enthusiasts. There was still some of that however as I was asked by someone why RDF couldn't be persisted natively as XML, queried via XQuery, and inferred over via extension functions! Um... right... There is some irony in that as I have yet to find a legitimate reason myself to even use XQuery in the first place.

The common scenario is when you need to query across a collection of XML documents, but I've personally preferred to index XML documents with RDF content (extracted from a subset of the documents), match the documents via RDF, isolate a document and evaluate an XPath against it essentially bypassing the collection extension to XPath with a 'semantic' index. Ofcourse, this only makes sense where there is a viable mapping from XML to RDF, but where there is one I've preferred this approach. But to each his/her own..

Content Management API's

I was pleasantly surprised to learn from Joel Amousou that there is a standard (a datastore and language-agnostic? standard) for CMS APIs. called JSR-170. The 4Suite repository is the only Content Mangement System / API with a well though-out architecture for integrating XML & RDF persistence and processing in a way that emphasizes their strengths with regard to content management. Perhaps there is some merit in investigating the possibility of porting (or wrapping) the 4Suite repository API as JSR-170? Joel seems to think so.


Micheal Kay had a nice synopsis of the value of generating XSLT from XSLT – a novel mechanism I've been using for some time and it was interesting to note that one of his prior client projects involved a pipeline that started with an XForm, post-processed by XSLT and aggregated with results from an Xquery (also generated from XSLT).

Code generation is a valuable pattern that has plenty unrecognized value in the XML space and I was glad to see Micheal Kay highlight this. He had some choice words on when to use XSLT and when to use XQuery that I thought was on point: Use XSLT for re-purposing, formatting and use Xquery for querying your database.


Finally, I spent quite some time with Harry Halpin (chair of the GRDDL Working Group) helping him installing / using the 4Suite / RDFLib client I recently wrote for use with the GRDDL test suite. You can take what I say with a grain of salt (as I am a member and loud, vocal supporter), but I think that GRDDL will end up having the most influential impact in the semantic web vision (which I believe is much less important than the technological components it relies on to fulfill the vision) and XML adoption on the web than any other, primarily because it allows content publishers to leverage the full spectrum of both XML and RDF technologies.
Within my presentation, I mention an architectural style I call 'modality segregation' that captures the value proposition of XSLT for drawing sharp, distinguishable boundaries (where there were once none) between:

  • content
  • presentation
  • meaning (semantics)
  • application behavior

I believe it's a powerful idiom for managing, publishing, and consuming data & knowledge (especially over the web).

Harry demonstrated how easy it is to extract review data, vocabulary mappings, and social networks (the primary topic of his talk) from XHTML that would ordinarily be dormant with regards to everything other than presentation.
We ran into a few snafus with 4Suite when we tried to run Norm Walsh's hCard2RDF.xslt against Dan Connolleys web site and Harrys home page. We also ran into problems with the client (which is mostly compliant with the Working Draft).

I also had the chance to set Harry up with my blazingly fast RETE-based N3 reasoner, which we used to test GRDDL-based identity consolidation by piping multiple GRDDL results (from XHTML with embedded XFN) into the reasoner, performing an OWL DL closure, and identifying duplicate identities via Inverse Functional Properties (smushing)

As a result of our 5+ hour hackathon, I ended up writing 3 utilities that I hope to release once I find a proper place for them:

  • FOAFVisualizer - A command-line tool for merging and rendering FOAF networks in a 'controlled' and parameterized manner
  • RDFPiedPipe - A command-line tool for converting between the syntaxes that RDFLib supports: N3, Ntriples, RDF/XML
  • Kaleidos - A library used by FOAFVisualizer to control every aspect of how an RDF graph (or any other network structure) is exported to a graphviz diagram via BGL-Python bindings.

In the final analysis, I feel as if we have reached a climax in innovation only to face a bigger challenge from politics than anything else:

  • RDFa versus eRDF
  • SPARQL without entailment versus SPARQL with OWL entailment
  • XHTML versus HTML5
  • Web Forms versus XForms
  • Web 2.0 versus Web 3.0
  • AJAX versus XForms
  • XQuery versus XSLT
  • XQuery over RDF/XML versus SPARQL over abstract RDF
  • XML 1.0 specifications versus the new 2.0 specifications

The list goes on. I expressed my concerns about the danger of technological camp warefare to Liam Quin (XML Activity Lead) and he concurred. We should spend less time arguing over whether or not my spec is more l33t than yours and more time asking the more pragmatic questions about what solutions works best for the problem(s) at hand.

[Uche Ogbuji]

via Copia
2 responses
Wow.  Tons to comment on.  I do want to start with one matter, though.  You got a segfault in 4Suite because you were using a very experimental branch.  HEAD is not even alpha quality right now.  4Suite branch policy is that stable is the 1.0maint branch.  make sure you're on that if you want stable 4Suite.  We're turbocharging XPath performance on HEAD, and there's a good deal to fix, still.
ah goodness.

check out simon st laurent on o'reilly- he came away v positive on xquery.

datadirect just started using xquery.com (which i thought was owned by jason hunter of mark logic but...).

i asked why,given xquery was dead. Mr robie wondered "where that rumour came from" - not a rumour, mr robie, an analysis based on developer discussions i have.

so it was nice to see you remaining skeptical.

to michael kay's point, the majority devs are sub optimal at building database queries.