Validation on the rack

Mark Baker's "Validation considered harmful" touched off a fun series of responses.

[C]onsider the scenario of two parties on the Web which want to exchange a certain kind of document. Party A has an expensive support contract with BigDocCo that ensures that they’re always running the latest-and-greatest document processing software. But party B doesn’t, and so typically lags a few months behind. During one of those lags, a new version of the schema is released which relaxes an earlier stanza in the schema which constrained a certain field to the values “1”, “2”, or “3”; “4” is now a valid value. So, party B, with its new software, happily fires off a document to A as it often does, but this document includes the value “4” in that field. What happens? Of course A rejects it; it’s an invalid document, and an alert is raised with the human [administrator], dramatically increasing the cost of document exchange. All because evolvability wasn’t baked in, because a schema was used in its default mode of operation; to restrict rather than permit.

Upon reading this I had 2 immediate reactions:

  1. Yep. Walter Perry was going on about all this sort of thing a long time ago, and the industry would be in a much saner place, without, for example, crazy ideas such as WS-Kaleidoscope and tight binding of documents to data records (read WXS and XQuery). For an example of how Perry absolutely skewered class-conscious XML using a scenario somewhat similar to Mark's, read this incisive post. To me the perils of bondage-and-discipline validation are as those of B&D datatyping. It's all one more example of the poor design that results when you follow twopenny Structured Programming too far and let early binding rule the cosmos.

  2. Yep. This is one of the reasons why once you use Schematron and actually deploy it in real-life scenarios where schema evolution is inevitable, you never feel sanguine about using a grammar-based schema language (not even RELAX NG) again.

Dare's response took me aback a bit.

The fact that you enforce that the XML documents you receive must follow a certain structure or must conform to certain constraints does not mean that your system cannot be flexible in the face of new versions. First of all, every system does some form of validation because it cannot process arbitrary documents. For example an RSS reader cannot do anything reasonable with an XBRL or ODF document, no matter how liberal it is in what it accepts. Now that we have accepted that there are certain levels validation that are no-brainers the next question is to ask what happens if there are no constraints on the values of elements and attributes in an input document. Let's say we have a purchase order format which in v1 has a element which can have a value of "U.S. dollars" or "Canadian dollars" then in v2 we now support any valid currency. What happens if a v2 document is sent to a v1 client? Is it a good idea for such a client to muddle along even though it can't handle the specified currency format?

Dare is not incorrect, but I was surprised at his reading of Mark. When I considered it carefully, though, I realized that Mark did leave himself open to that interpretation by not being explicit enough. As he clarified in a comment to Dare:

The problem with virtually all uses of validation that I've seen is that this document would be rejected long before it even got to the bit of software which cared about currency. I'm arguing against the use of validation as a "gatekeeper", not against the practice of checking values to see whether you can process them or not ... I thought it goes without saying that you need to do that! 8-O

I actually think this is a misunderstanding that other readers might easily have, so I think it's good that Dare called him on it, and teased out the needed clarification. I missed it because I know Mark too well to ever imagine he'd ever go so far off in the weeds.

Of course the father of Schematron would have a response to reckon with in such debate, but I was surprised to find Rick Jelliffe so demure about Schematron. His formula:

schema used to validating incoming data only validates traceable business requirements

Is flash-bam-alakazam spot on, but somewhat understated. Most forms of XML validation do us a disservice by making us nit-pick every detail of what we can live with, rather than letting us make brief declarations of what we cannot live without. Yes, Schematron's phases provide a powerful mechanism for elegant modularization of expression of rules and requirements, but long before you go that deep Schematron sets you free by making validation an open rather than a closed operation. The gains in expressiveness thereby provided are near astonishing, and this is despite the fact that Schematron is a less terse schema language than DTD, WXS or RELAX NG.
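To make the open-versus-closed distinction concrete, here is a minimal sketch in Python of rule-based checking in the spirit of Schematron. It is not Schematron itself: the element names (order, total, currency), the rule list, and the check function are all assumptions made up for illustration. The point is only that the rules name what the receiver cannot live without, and everything else in the document passes through untouched.

```python
import xml.etree.ElementTree as ET

# Each rule: (context path, test on the matched element, message if the test fails).
# The element names and the rules themselves are hypothetical.
RULES = [
    ("./total", lambda e: float(e.text) > 0, "total must be a positive number"),
    ("./currency", lambda e: bool((e.text or "").strip()), "currency must not be empty"),
]

def check(doc_text, rules=RULES):
    """Return messages for failed rules; anything the rules don't mention is allowed."""
    root = ET.fromstring(doc_text)
    failures = []
    for path, test, message in rules:
        matches = root.findall(path)
        if not matches:
            # In this sketch a missing required element counts as a failure.
            failures.append(f"missing {path}")
            continue
        for elem in matches:
            try:
                ok = test(elem)
            except (TypeError, ValueError):
                # Empty or non-numeric content fails the rule instead of crashing.
                ok = False
            if not ok:
                failures.append(message)
    return failures
```

A document that carries an element or a currency value the receiver has never seen still yields no failures here, so the decision about whether that currency can actually be handled is left to the code that processes it.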

Of course XML put us on the road to unnecessary bondage and discipline on day one when it made it so easy, and even a matter of recommendation, to top each document off with a lordly DTD. Despite the fact that I think Microformats are a rickety foundation for almost anything useful at Web scale, I am hopeful they will act as powerful incentive for moving the industry away from knee-jerk validation.

[Uche Ogbuji]

via Copia

XHTML to Excel as well as OOo

In my article "Use XSLT to prepare XML for import into OpenOffice Calc" I discussed how to use XSLT to format XML for import into OpenOffice Calc.

The popular open source office suite OpenOffice.org is XML-savvy at its core. It uses XML in its file formats and offers several XML-processing plug-ins, so you might expect it to have nice tools built in for importing XML data. Unfortunately, things are not so simple, and a bit of work is required to manipulate general XML into delimited text format in order to import the data into its spreadsheet component, Calc. This article offers a quick XSLT tool for this purpose and demonstrates the Calc import of records-oriented XML. In addition to learning a practical trick for working with Calc, you might also learn a few handy XSLT techniques for using dynamic criteria to transform XML.
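For readers who want to see the shape of the transformation without XSLT, here is a rough Python equivalent using only the standard library. It is not the stylesheet from the article; the function name records_to_csv and its parameters are assumptions for illustration. It flattens records-oriented XML into delimited text that a spreadsheet can import.

```python
import csv
import io
import xml.etree.ElementTree as ET

def records_to_csv(xml_text, record_tag, fields):
    """Flatten records-oriented XML into CSV text, one row per record element."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(fields)  # header row
    for record in root.iter(record_tag):
        # A missing child element becomes an empty cell.
        writer.writerow([(record.findtext(f) or "").strip() for f in fields])
    return out.getvalue()
```

Passing the record element name and field list as arguments plays the same role as the dynamic criteria mentioned above: the same code serves different record layouts.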

I do wonder about formulae, though. Via Parand Tony Darugar I found "Excel Reports with Apache Cocoon and POI", which (in the section starting with the header "Rendering Machinery") shows that you can just as easily use such a technique for MS Excel. Good to know. I've recently had reason to figure out a system for aggregating reports to and from spreadsheets and XML, and I'd possibly have to deal with simple formulae. I guess I'll cross that bridge if I ever for sure get to it, and the full OpenDocument saved-file format is always an option, but if anyone does happen to have any quick pointers, I'd be grateful.

[Uche Ogbuji]


Facet Segregation, AspectXML, Web Publishing, and XML 2006

The XML 2006 conference proceedings were due today.

"The Essence of Declarative, XML-based Web Applications: XForms and XSLT"

The term Facet Segregation is used in this paper to identify a methodology for separating and abstracting individual aspects of XHTML and its derivative, related dialects into single-purpose vocabularies and using XSLT to compile more expressive languages from them. It can be considered an application of AspectXML for the composition of web dialects. It extends the well-established method(s) of separating content from presentation (used by DocBook XSLT stylesheets, for instance) with both Gleaning Resource Descriptions from Dialects of Languages (GRDDL) and the value proposition presented in this paper. It is within this larger picture that the value proposition of separating presentation markup from more specific application logic is proposed.

In the paper I demonstrate how XUL and a simple vocabulary for high-level user interface components (at a level more abstract than the widgets and controls commonly associated with user interface languages) can be used to compose a user interface for editing Atom entries. From this abstracted user interface document, an XForm is generated using an XSLT stylesheet that matches XUL, XHTML, XForms and the elements in the abstract vocabulary. I chose not to drill down too deeply into the example and spent more time describing how the value of using XSLT to generate XForms in this way fits in the larger picture (i.e., web architecture, rich web application backplane, semantic web architecture, and document composition).

I also attempt to describe a general framework of using XSLT to facilitate the extraction of various representations of the same document, each of which facilitates a particular facet of web content:

  • Presentation
  • UI Behavior
  • Semantics or 'Faithful Rendition'

Faithful rendition is a term used in the GRDDL specification that has grown on me and represents the author's intent of the 'meaning' of a document as determined by the application of a GRDDL transform:

By specifying a GRDDL transformation, the author of a document states that the transformation will provide a faithful rendition of the source document, or some portion of the source document, that preserves its meaning in RDF.

The general idea (from 500,000 feet) here is that the means is the same but to different ends and enables an unprecedented level of richness for web content - when used in combination. I use the term Facet Segregation to describe this general approach. I ended up preferring this to 'Modality Segregation'. There is some correlation with Aspect Programming and AspectXML. The diagram below is my attempt to capture this:

The (DocBook) in the diagram identifies the most common use case for the oldest of the three 'mechanisms' (separation of content from presentation - typically facilitated by DocBook XSLT stylesheets).

[Uche Ogbuji]


From Fourthought to Kadomo

I founded Fourthought in June, 1998 with three other friends from college. Eight and a half years doesn't sound that long when I say it, but the near-decade fills my rear view mirror so completely that I can scarcely remember having done anything before it. That's probably a good thing as it means I don't much remember the years of perfunctory consulting at places such as IBM Global Services and Sabre Decision Technologies prior to making the leap to relative independence. It was in part the typical entrepreneurial yen of the immigrant and in part the urge to chart my own high-tech career course that drove me to take the risk and endure the ups and downs of running a consultancy.

And I did say Fourthought is in the rear-view mirror. Last week I accepted a position at The Kadomo Group, a very young solutions company focused in the semantic Web space. Kadomo was founded by Eric Miller, former Semantic Web Activity Lead at the W3C. Eric and I have always looked for ways we could work together considering our shared interest in how strategic elements of the semantic Web vision can be brought to bear in practice. He and the other bright and energetic folks coming together under the Kadomo banner were a major part of my decision to join. It was also made clear to me that I would have a sizeable role in shaping all aspects of the company. I would be able, and in fact encouraged, to continue my leadership in open source projects and community specification development. Last but not least the culture of the company is set up to suit my lifestyle very well, which was always one tremendous benefit of Fourthought.

Without a doubt we have the seeds at Kadomo to grow something much greater than Fourthought was ever likely to be. The company has not neglected resources for high-caliber business development, operations or marketing. Committing to these resources was something we always had a hard time doing at Fourthought, and this meant that even though we had brilliant personnel, strong client references and a market profile disproportionate to the resources we devoted to marketing, we were never able to grow to even a fraction of our potential. I've learned many of these lessons the hard way, and it seems clear to me that Kadomo is born to greater ambition. One good sign is that I'll just be Chief Technical Architect, allowed to focus primarily on the company's technology strategy. I will not be stranded juggling primary sales and operations as well as lead consultant responsibilities. Another good sign is that product development is woven into the company's foundation, so I can look forward to greater leverage of small-company resources.

Considering my primary responsibility for technology strategy it may seem strange to some that I'd join a semantic Web company, knowing that I have expressed such skepticism of the direction core semantic Web technology has taken lately. I soured on the heaping helping of gobbledygook that was laden on RDF in the post-2000 round of specs, and I soured on SPARQL as a query language when it became clear that it was to be as ugly and inelegant as XQuery. There have been some bright spots of lightweight goodness such as GRDDL and SKOS but overall, I've found myself more and more focused on XML schema and transform technology. My departure point for the past few years has been that a well-annotated syntactic Web can meet all the goals I personally have for the semantic Web. I've always been pretty modest in what I want from semantics on the Web. To put it bluntly what interests me most is reducing the cost of screen-scraping. Of course, as I prove every day in my day job, even such an unfashionable goal leads to the sorts of valuable techniques that people prefer to buzz about using terms such as "enterprise mashups". Not that I begrudge folks their buzzwords, mind you.

I still think some simplified version or profile of RDF can be very useful, and I'll be doing what I can to promote a pragmatic approach to semantic Web at Kadomo, building on the mountains of XML that vendors have winked and nodded into IT and the Web, much of it a hopeless congeries. There is a ton of problem in this space, and I believe, accordingly, a ton of opportunity. I think mixing in my somewhat diffractive view of semantic Web will make for interesting discussion at Kadomo, and a lot of that will be reflected here on Copia, which, after all, I share with Chimezie, one of the most accomplished users of semantic Web technology to solve real-world problems.

One ongoing opportunity I don't plan to leave behind is my strong working relationship with the Web Platform Engineering group at Sun. With recent, hard-earned success in hand, and much yet to accomplish, we're navigating the paper trail to allow for a smooth transition from my services as a Fourthought representative to those as a Kadomo representative.

I hope some of you will consider contacting Kadomo to learn more about our services and solutions. We're just getting off the ground but we have a surprising amount of structure in place for bringing focus to our service offerings, and we have some exciting products in development of which you'll soon be hearing more. If you've found my writings useful or examples of my work agreeable, do keep me in mind as I plough into my new role.

Updated to reflect the final settling into Zepheira. Most other bits are still relevant.

[Uche Ogbuji]


What's good and bad in agile methodology?

Frank Kelly has some good Thoughts on Agile Methods - XP and the like. He's a skeptic of dynamic languages, and of course I'm an avid user of and advocate for these, so I was almost put off by his first couple of paragraphs, but I think in the end he nails the essential points.

Whether you create a design or not - the second you write a line of code you are realizing a design - it may be in your head but it's still a design. If you are on a team of more than about 7-8 developers then to really "scale" communication a written and agreed upon design is a very helpful (I would say 'necessary') task on the path to success.

but, as he admits:

As I've said before Agile has taught me that at many times "less is more" - so I tend to write smaller design documents with lots more pictures and try to always keep them under 20-30 pages or so. From there on, 1-on-1 and team meetings can help get the details out. Also you can farm out individual component designs to individual developers - each creating a small 5-10 page document. Beyond that you do get the "glaze over" effect in people's eyes.

This is exactly right, and it has been my attitude to agile methodology. You can't ignore design, but you can make it less of a white elephant. As I wrote in "What is this ‘agility’?":

It’s not easy to come to criticism of BDUF. After all, it brings to the young profession of software engineering the rigor and discipline that have established other engineering disciplines so respectably. No one would commission a skyscraper or build a jet plane without mountains of specifications, models, surveys and procedural rules. It would seem that similar care is the only solution for the bad reputation software has earned for poor quality.

Despite this, there has been a steady movement lately toward “agile” techniques, which contradict BDUF. Agile boosters claim that BDUF is impractical for all but the largest and most mission-critical systems, and causes a lot of problems because inevitable change in requirements and environment are very difficult to accommodate in the process. The track is laid out during analysis and design, and any variation therefrom is a catastrophic derailment. Agile methods focus on flexibility and accommodation of change, including greater involvement of the end user throughout the process.

One area where I tended to disagree with Frank is in his discussion of the "waterfalls" approach to the software development life cycle (SDLC):

Here's my issue with rejecting waterfall - it's like rejecting Gravity - that's all well and good if you live in a parallel universe where the laws of physics don't apply :-)

He goes on to imply that you are rejecting integration testing and design when you reject waterfalls. I strongly disagree. Let's say there are strong and weak agile methodology supporters, and that Frank and I are examples of the weak sort based on our attitude towards design (with our belief that some design is always necessary). I think the part of waterfalls that most weak agile supporters reject is the part that gives it its name, i.e. irreversible flow between stages of the SDLC. The problem with waterfalls is that it is contrary to iteration, and I think iterative development is important. I think Frank does as well, given his acceptance of more, smaller releases, so I think our difference is less substantive and more a different understanding of what it is in waterfalls that agile methodology rejects.

[Uche Ogbuji]


Time for Mac?

I've decided to get a new laptop by the end of the year. My current one, a Dell Inspiron 8600, is a fountain of constant annoyance--I used to swear by Dell for laptops; that's so over. So I was considering either Lenovo or the Acer Ferrari series. My developer colleagues at Sun swear by the latter for power-user features and UNIX friendliness (some of them run Ubuntu, some Solaris). But more and more I've been wondering: is it time to consider a MacBook Pro for my laptop? My primary machine? My right arm, just about?

We already have two Macs in the house, the high-end iMac G4 Lori got for her birthday 3 years ago and the high-end iMac 24-inch she got for her birthday in October. For the largely multimedia stuff she does, they are excellent, but I've never warmed to OS X, and I've spent a fair amount of time on her computer. I miss little things such as multiple desktops and the rapid back-and-forth between GUI and command line. On OS X, as on Windows, going to the command line feels like going to a different land. And yes, I've heard there are multiple desktop add-ons for OS X, and I agree that Exposé alleviates some of the need for multiple desktops, and I know that technically you can do everything OS X related on the command line, you just have to get used to some different conventions and layout. Despite all that, I've just never warmed to OS X.

Some of that might just be the fact that I don't use it as regularly. Probably if I did switch to OS X I would get used to power-user features and warm up pretty quickly. I'd have to learn to not resist all the magic that OS X places between you and the UNIX OS, appreciating that the magic is what provides the "just works" factor. I've long believed that excepting a few rough spots such as video projectors, Linux computers (with modern desktops such as GNOME or KDE) are much more likely to "just work" in any given scenario than Windows computers. In my observation OS X has both well beaten. I say this even though I've found that Ubuntu comes with a huge "just works" boost.

In the end my most important criterion is my colleagues. I know several people with similar work patterns to me who have moved from Linux to OS X. A few have become fed up and switched back. In a couple of cases the problem was performance, but that was back in the mobile G4 era. I hear a lot of that is better now with Intel Core Duo. I do think that more of these folks have enjoyed the switch than have regretted it.

My leaning is more and more towards making the move. In the end it comes down to always challenging my comfort and shaking up my routine. The general stimulation of the platform switch might boost my energy and productivity, unless it's a disaster and proves a sap instead.

I've done some research on the Linux -> Mac developer switch experience, and I plan to do a good deal more today so that I can come to a rapid decision and claim the expense this year. I'd love to hear from any others who are or were in a similar situation. What are your thoughts?

[Uche Ogbuji]


Atom Feed Semantics

Not a lot of people outside the core Semantic Web community actually want to create RDF, but extracting it from what's already there can be useful for a wide variety of projects. (RSS and Atom are first and relatively easy steps in that direction.)

Terminal dump

chimezie@Zion:~/devel/grddl-hg$ python GRDDL.py --debug --output-format=n3 --zone=https:--ns=aowl=http://bblfish.net/work/atom-owl/2006-06-06/# --ns=iana=http://www.iana.org/assignments/relation/ --ns=some-blog=http://example.org/2003/12/13/  https://sommer.dev.java.net/atom/2006-06-06/transform/atom-grddl.xml
binding foaf to http://xmlns.com/foaf/0.1/
binding owl to http://www.w3.org/2002/07/owl#
binding iana to http://www.iana.org/assignments/relation/
binding rdfs to http://www.w3.org/2000/01/rdf-schema#
binding wot to http://xmlns.com/wot/0.1/
binding dc to http://purl.org/dc/elements/1.1/
binding aowl to http://bblfish.net/work/atom-owl/2006-06-06/#
binding rdf to http://www.w3.org/1999/02/22-rdf-syntax-ns#
binding some-blog to http://example.org/2003/12/13/
Attempting a comprehensive glean of  https://sommer.dev.java.net/atom/2006-06-06/transform/atom-grddl.xml
@@fetching:  https://sommer.dev.java.net/atom/2006-06-06/transform/atom-grddl.xml
@@ignoring types: ('application/rdf+xml', 'application/xml', 'text/xml', 'application/xhtml+xml', 'text/html')
applying transformation https://sommer.dev.java.net/atom/2006-06-06/transform/atom2turtle_xslt-1.0.xsl
@@fetching:  https://sommer.dev.java.net/atom/2006-06-06/transform/atom2turtle_xslt-1.0.xsl
@@ignoring types: ('application/xml',)
Parsed 22 triples as Notation 3
Attempting a comprehensive glean of  http://www.w3.org/2005/Atom

Via atom2turtle_xslt-1.0.xsl and Atom OWL, the GRDDL result document:

@prefix aowl: <http://bblfish.net/work/atom-owl/2006-06-06/#>.
@prefix iana: <http://www.iana.org/assignments/relation/>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix some-blog: <http://example.org/2003/12/13/>.
[ a aowl:Feed;
     aowl:author [ a aowl:Person;
             aowl:name "John Doe"];
     aowl:entry [ a aowl:Entry;
             aowl:id "urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a"^^<http://www.w3.org/2001/XMLSchema#anyURI>;
             aowl:link [ a aowl:Link;
                     aowl:rel iana:alternate;
                     aowl:to [ aowl:src some-blog:atom03]];
             aowl:title "Atom-Powered Robots Run Amok";
             aowl:updated "2003-12-13T18:30:02Z"^^<http://www.w3.org/2001/XMLSchema#dateTime>];
     aowl:id "urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6"^^<http://www.w3.org/2001/XMLSchema#anyURI>;
     aowl:link [ a aowl:Link;
             aowl:rel iana:alternate;
             aowl:to [ aowl:src <http://example.org/>]];
     aowl:title "Example Feed";
     aowl:updated "2003-12-13T18:30:02Z"^^<http://www.w3.org/2001/XMLSchema#dateTime>].
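To give a feel for what the transform does, here is a much-simplified Python sketch of the same kind of glean, using only the standard library instead of XSLT. It is not the atom2turtle stylesheet and does not follow the Atom OWL model closely: the glean function, the flat subject names (feed, entry0) and the link:rel property naming are assumptions made for illustration. The sample document is trimmed from the example feed in the Atom specification.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <link href="http://example.org/"/>
  <updated>2003-12-13T18:30:02Z</updated>
  <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
  <entry>
    <title>Atom-Powered Robots Run Amok</title>
    <link href="http://example.org/2003/12/13/atom03"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
  </entry>
</feed>"""

def glean(atom_text):
    """Return (subject, property, value) tuples for a few Atom elements."""
    feed = ET.fromstring(atom_text)
    triples = []

    def describe(node, subject):
        # Only direct children are read, so entry data never leaks into the feed.
        for name in ("id", "title", "updated"):
            value = node.findtext(ATOM + name)
            if value is not None:
                triples.append((subject, "aowl:" + name, value))
        for link in node.findall(ATOM + "link"):
            # Atom treats a link without a rel attribute as rel="alternate".
            rel = link.get("rel", "alternate")
            triples.append((subject, "link:" + rel, link.get("href")))

    describe(feed, "feed")
    for i, entry in enumerate(feed.findall(ATOM + "entry")):
        subject = f"entry{i}"
        triples.append(("feed", "aowl:entry", subject))
        describe(entry, subject)
    return triples
```

Calling glean(SAMPLE) yields tuples for the feed and its single entry. A real GRDDL client would instead discover the transformation from the source document and apply it, as the terminal dump above shows.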

Planet Atom's feed

@prefix : <http://bblfish.net/work/atom-owl/2006-06-06/#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix iana: <http://www.iana.org/assignments/relation/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
[] a :Feed ;
   :id "http://planetatom.net/"^^xsd:anyURI ;
   :title "Planet Atom" ;
   :updated "2006-12-10T06:57:54.166890Z"^^xsd:dateTime ;
   :generator [ a :Generator ;
                :uri <> ;
                :generatorVersion "" ;
                :name """atomixlib""" ] ;
   :entry [ a :Entry ;
            :title "The Darfur Wall" ;
            :author [ a :Person ; :name "James Tauber" ] ;
            :link [ a :Link ;
                    :rel iana:alternate ;
                    :to [ :src <http://jtauber.com/blog/2006/12/10/the_darfur_wall> ] ] ;
            :updated "2006-12-10T00:13:34Z"^^xsd:dateTime ;
            :published "2006-12-10T00:13:34Z"^^xsd:dateTime ;
            :id "http://jtauber.com/blog/2006/12/10/the_darfur_wall"^^xsd:anyURI ] .

[Uche Ogbuji]


XML 2006 Synopsis: Are we there yet?

Well, XML 2006 came and went with a rather busy bang. My presentation on using XSLT to generate XForms (from XUL/XHTML) was well attended and I hope it helped increase awareness of the importance and value of XForms, (perhaps) the only comprehensive vehicle by which XML can be brought to the web in the way proponents of XML have had in mind for some time. As Simon puts it:

XML pretty (much) completely missed its original target market. SGML culture and web developer culture seemed like a poor fit on many levels, and I can't say I even remember a concerted effort to explain what XML might mean to web developers, or to ask them whether this new vision of the Web had much relationship to what they were doing or what they had planned. SGML/XML culture and web culture never really meshed.

Most of the questions I received had to do with our particular choice of FormsPlayer (an Internet Explorer plugin) instead of other alternatives such as Orbeon, Mozilla, Chiba, etc. This was a bit unfortunate and an indication of a much larger problem in this particular area of innovation we lovingly call 'web 2.0'. I will get back to this later.

I was glad to hear John Boyer tell me he was pleasantly surprised to see mention of the Rich Web Application Backplane W3C Note. Mark Birbeck and Micah Dubinko (fellow XForms gurus and visionaries in their own rights) didn't let this pass over their radar, either.

I believe the vision outlined in that note is much more lucid than a lot of the hype-centered notions of 'web 2.0' which seem more focused on painting a picture of scattered buzzwords ('mash-ups', AJAX, etc.) than commonalities between concrete architectures.

Though this architectural style accommodates solutions based on scripting (AJAX) as well as more declarative approaches, I believe the primary value is in freeing web developers from the 80% of scripting that is a result of not having an alternative (READ: browser vendor monopolies) rather than being the appropriate solution for the job. I've jousted with Kurt Cagle before on this topic and Mark Birbeck has written extensively on this as well.

In writing the presentation, I sort of stumbled upon some interesting observations about XUL and XForms:

  • XUL relies on a static, inarticulate means of binding components to their behavior
  • XForms relies on XPath for doing the same
  • XUL relies completely on JavaScript to define the behavior of its widgets / components
  • A more complete mapping from XUL to XForms (than the one I composed for my talk) could be valuable to those more familiar with XUL as a bridge to XForms.

At the very least, it was a great way to familiarize myself with XUL.

In all, I left Boston feeling like I had experienced a very subtle anti-climax as far as innovation was concerned. If I were to plot a graph of innovative progression over time, it would seem to me that the XML space has plateaued as of late and political in-fighting and spec proliferation have overtaken truly innovative ideas. I asked Harry Halpin about this and his take on it was that perhaps "XML has won". I think there is some truth to this, though I don't think XML has necessarily made the advances that were hoped for in the web space (as Simon St. Laurent put it earlier).

There were a few exceptions, however.

XML Pipelines

I really enjoyed Norm Walsh's presentation on XProc and it was an example of scratching a very real itch: consensus on a vocabulary for XML processing workflows. Though, ironically, it probably wouldn't take much to implement in 4Suite as support for most (if not all) of the pipeline operations is already there.

I did ask Norm if XProc would support setting up XPath variables for operations that relied on them and was pleased to hear that they had that in mind. I also asked about support for non-standard XML operations such as XUpdate and was also pleased to hear that they had that covered as well. It is worth noting that XUpdate by itself could make the viewport operation rather redundant.

The Semantic Web Contingent

There was noticeable representation by semantic web enthusiasts (myself, Harry Halpin, Bob DuCharme, Norm Walsh, Elias Torres, Eric Prud'hommeaux, Ralph Hodgson, etc.) and their presentations had somewhat subdued tones (perhaps) so as not to incite ravenous bickering from narrow-minded enthusiasts. There was still some of that, however, as I was asked by someone why RDF couldn't be persisted natively as XML, queried via XQuery, and inferred over via extension functions! Um... right... There is some irony in that as I have yet to find a legitimate reason myself to even use XQuery in the first place.

The common scenario is when you need to query across a collection of XML documents, but I've personally preferred to index XML documents with RDF content (extracted from a subset of the documents), match the documents via RDF, isolate a document and evaluate an XPath against it, essentially bypassing the collection extension to XPath with a 'semantic' index. Of course, this only makes sense where there is a viable mapping from XML to RDF, but where there is one I've preferred this approach. But to each his/her own.
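The index-then-XPath approach described above can be sketched roughly as follows. This is a toy illustration, not 4Suite or RDFLib code: the in-memory DOCS collection, the build_index and query functions, and the single "author" property are all assumptions, and the standard library's ElementTree supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical collection: document name -> XML text.
DOCS = {
    "doc1.xml": "<report><author>Ada</author><item price='5'/><item price='7'/></report>",
    "doc2.xml": "<report><author>Bo</author><item price='3'/></report>",
}

def build_index(docs):
    """Extract (document, property, value) statements from a subset of each document."""
    index = set()
    for name, text in docs.items():
        root = ET.fromstring(text)
        index.add((name, "author", root.findtext("author")))
    return index

def query(docs, index, prop, value, path):
    """Match documents via the index, then evaluate the path against each match only."""
    hits = sorted(doc for doc, p, v in index if p == prop and v == value)
    return {doc: ET.fromstring(docs[doc]).findall(path) for doc in hits}
```

The index answers the "which documents?" question, so the path expression only ever runs against one isolated document at a time instead of across the whole collection.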

Content Management APIs

I was pleasantly surprised to learn from Joel Amousou that there is a standard (a datastore and language-agnostic? standard) for CMS APIs called JSR-170. The 4Suite repository is the only Content Management System / API with a well thought-out architecture for integrating XML & RDF persistence and processing in a way that emphasizes their strengths with regard to content management. Perhaps there is some merit in investigating the possibility of porting (or wrapping) the 4Suite repository API as JSR-170? Joel seems to think so.

Meta-stylesheets

Michael Kay had a nice synopsis of the value of generating XSLT from XSLT, a novel mechanism I've been using for some time, and it was interesting to note that one of his prior client projects involved a pipeline that started with an XForm, post-processed by XSLT and aggregated with results from an XQuery (also generated from XSLT).

Code generation is a valuable pattern that has plenty of unrecognized value in the XML space, and I was glad to see Michael Kay highlight this. He had some choice words on when to use XSLT and when to use XQuery that I thought were on point: use XSLT for re-purposing and formatting, and use XQuery for querying your database.

GRDDL

Finally, I spent quite some time with Harry Halpin (chair of the GRDDL Working Group) helping him install and use the 4Suite / RDFLib client I recently wrote for use with the GRDDL test suite. You can take what I say with a grain of salt (as I am a member and loud, vocal supporter), but I think that GRDDL will end up having more impact on the semantic web vision (which I believe is much less important than the technological components it relies on to fulfill the vision) and on XML adoption on the web than anything else, primarily because it allows content publishers to leverage the full spectrum of both XML and RDF technologies.
In my presentation, I mentioned an architectural style I call 'modality segregation' that captures the value proposition of XSLT for drawing sharp, distinguishable boundaries (where there were once none) between:

  • content
  • presentation
  • meaning (semantics)
  • application behavior

I believe it's a powerful idiom for managing, publishing, and consuming data & knowledge (especially over the web).

Harry demonstrated how easy it is to extract review data, vocabulary mappings, and social networks (the primary topic of his talk) from XHTML that would ordinarily be dormant with regard to everything other than presentation.
We ran into a few snafus with 4Suite when we tried to run Norm Walsh's hCard2RDF.xslt against Dan Connolly's web site and Harry's home page. We also ran into problems with the client (which is mostly compliant with the Working Draft).

I also had the chance to set Harry up with my blazingly fast RETE-based N3 reasoner, which we used to test GRDDL-based identity consolidation by piping multiple GRDDL results (from XHTML with embedded XFN) into the reasoner, performing an OWL DL closure, and identifying duplicate identities via Inverse Functional Properties (smushing).
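To make the smushing step concrete, here is a minimal Python sketch of identity consolidation over inverse functional properties. It deliberately swaps in a plain union-find over tuples for the RETE-based reasoner and OWL DL closure described above, so it illustrates only the idea, not the actual pipeline. The triples, the blank-node labels, and the choice of `foaf:mbox` and `foaf:homepage` as the inverse functional properties are assumptions made for the example.

```python
from collections import defaultdict

# Assumed inverse functional properties: two subjects that share a value
# for one of these are taken to denote the same individual.
IFPS = {"foaf:mbox", "foaf:homepage"}


def smush(triples, ifps=IFPS):
    """Group subjects that share an inverse-functional-property value."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller root so results are deterministic.
            low, high = sorted((root_a, root_b))
            parent[high] = low

    first_owner = {}  # (property, value) -> first subject seen with that pair
    for subject, predicate, obj in triples:
        find(subject)  # register every subject, even ones with no IFP values
        if predicate in ifps:
            key = (predicate, obj)
            if key in first_owner:
                union(first_owner[key], subject)
            else:
                first_owner[key] = subject

    groups = defaultdict(set)
    for node in parent:
        groups[find(node)].add(node)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical merged output of several GRDDL transformations.
TRIPLES = [
    ("_:a", "foaf:name", "Harry"),
    ("_:a", "foaf:mbox", "mailto:h@example.org"),
    ("_:b", "foaf:mbox", "mailto:h@example.org"),
    ("_:b", "foaf:homepage", "http://example.org/h"),
    ("_:c", "foaf:homepage", "http://example.org/h"),
    ("_:d", "foaf:name", "Someone else"),
]

print(smush(TRIPLES))  # → [['_:a', '_:b', '_:c'], ['_:d']]
```

Note that `_:a` and `_:c` share no property value directly; they end up in the same group only because `_:b` links them, which is the transitive behavior a reasoner would also derive.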

As a result of our 5+ hour hackathon, I ended up writing 3 utilities that I hope to release once I find a proper place for them:

  • FOAFVisualizer - A command-line tool for merging and rendering FOAF networks in a 'controlled' and parameterized manner
  • RDFPiedPipe - A command-line tool for converting between the syntaxes that RDFLib supports: N3, Ntriples, RDF/XML
  • Kaleidos - A library used by FOAFVisualizer to control every aspect of how an RDF graph (or any other network structure) is exported to a graphviz diagram via BGL-Python bindings.

In the final analysis, I feel as if we have reached a climax in innovation only to face a bigger challenge from politics than anything else:

  • RDFa versus eRDF
  • SPARQL without entailment versus SPARQL with OWL entailment
  • XHTML versus HTML5
  • Web Forms versus XForms
  • Web 2.0 versus Web 3.0
  • AJAX versus XForms
  • XQuery versus XSLT
  • XQuery over RDF/XML versus SPARQL over abstract RDF
  • XML 1.0 specifications versus the new 2.0 specifications

The list goes on. I expressed my concerns about the danger of technological camp warfare to Liam Quin (XML Activity Lead) and he concurred. We should spend less time arguing over whether or not my spec is more l33t than yours and more time asking the more pragmatic questions about which solutions work best for the problem(s) at hand.

[Uche Ogbuji]

via Copia

Today's XML “wot he said”

Is it just me, but isn't XQuery just a tardy ugly solution looking for a problem? And thinking it's going to excite people who write Mashups is hopeful, but possible if we see it supported in the browser, I guess. But the syntax. Ugh!
Paul Downey

I wasn't able to attend XML 2006 because I knew the e-commerce launch I've been working on at Sun was too close. This week I've been in marathon sessions planning the next phase of the Sun project. I'm in the company of clever, engaging people who've traveled from various U.S. and European spots, so I get a little conference flavor in a much more focused setting. But I'm still getting a bit of the XML 2006 fix through attendee Weblogs.

The first thing that came to mind was "WTF! That's a hella lotta XQuery at the conference". I certainly don't miss that part. And now they have programming extensions of some sort? XQueryP? My stars! What a yucky idea. The enterprisey set seems to like XQuery, and I grant there's more substance to XQuery than some enterprisey fads such as WS-Jigsaw. So FLWR and friends are here to stay, whether I like it or not. No worries. I'll just arrange my affairs so that I have to deal with the muck as little as possible. But that doesn't mean I can't enjoy the occasional barb from a fellow nay-sayer. Wot Paul said.

[Uche Ogbuji]

via Copia