Today's XML wot he said

...simple - just don't use script in XSLT unless you really really really have to. Especially on the server side - XSLT script and ASP.NET should never meet. Use XSLT extension objects instead. As simple as it is.

Oleg Tkachenko, "XSLT scripting (msxsl:script) in .NET - pure evil"

Amen, f'real. When XSLT 1.1 first emerged the first thing that jumped out from the spec and punched me in the face was the embedded script facility. I made a fuss about it:

In general, I think the re-introduction of xsl:script is execrable. XSLT 1.0 had perhaps the most elegant extension model possible, and xsl:script ruins this by destroying the opacity of extensions to XSLT processors. Language bindings may make sense in the realm of CORBA or DOM, where the actual expression of the program is done in the bound language, but XSLT is XSLT, and introducing the need for language bindings only reduces general interoperability while giving a small boost to interoperability between small axes of implementations.

I even worked with some like-minded folk to put together a petition. I have no idea whether that was instrumental in any way, but soon enough XSLT 1.1 was dead and replaced with XSLT 2.0, which was built on XPath 2.0 and thus had other big problems, but at least no xsl:script.

xsl:script does live on in some implementations, and notably MSXML, as you can see from Oleg's post. You can also see some of the problems. XSLT and many more general-purpose languages make for an uncomfortable fit, and it can be hard for platform developers and users to make things work smoothly and reliably. More important than memory leaks, script-in-xsl is a huge leak of XSLT's neat abstraction, and I think this makes XSLT much less effective. For one thing users are tempted to take XSLT to places where it does not fit. XSLT is not a general-purpose language. At the same time users tend not to learn good XSLT design and techniques because scripting becomes an escape hatch. So a script user in XSLT generally cripples the language at the same time he is over-using it. An unfortunate combination indeed.

Oleg advocates XSLT extensions rather than scripting, which is correct, but I do want to mention that once you get used to writing extensions, it can be easy to slip into habits as bad as scripting. I've never been tempted to implement a Python scripting extension in 4XSLT, which would be easy, but that didn't stop me from going through a phase of overusing extensions. I think I've fully recovered, and the usage pattern I definitely recommend is to write the general-purpose code in a general-purpose language (Python, C#, whatever) and then call XSLT for the special and narrow purpose of transforming XML, usually for the last mile of presentation. It seems obvious, and yet it's a lesson that seems to require constant repetition.

[Uche Ogbuji]

via Copia

EXSLT/XML/JSON complications

Bruce D'Arcus commented on my entry "Creating JSON from XML using XSLT 1.0 + EXSLT", and following up on his reply put me on a bit of a journey. Enough so that the twists merit an entry of their own.

Bruce pointed out that libxslt2 does not support the str:replace function. This recently came up in the EXSLT mailing list, but I'd forgotten. I went through this thread. Using Jim's suggestion for listing libxslt2 supported extensions (we should implement something like that in 4XSLT) I discovered that it doesn't support regex:replace either. This is a serious pain, and I hope the libxslt guys can be persuaded to add implementations of these two very useful functions (and others I noticed missing).

That same thread led me to a workaround, though. EXSLT provides a bootstrap implementation of str:replace, as it does for many functions. Since libxslt2 does support the EXSLT functions module, it's pretty easy to alter the EXSLT bootstrap implementation to take advantage of this, and I did so, creating an updated replace.xsl for processors that support the Functions module and exsl:node-set. Therefore a version of the JSON converter that does work in libxslt2 (I checked) is:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
    xmlns:func="http://exslt.org/functions"
    xmlns:str="http://exslt.org/strings"
    xmlns:js="http://muttmansion.com"
    extension-element-prefixes="func">

  <xsl:import href="http://copia.ogbuji.net/files/code/replace.xsl"/>
  <xsl:output method="text"/>

  <func:function name="js:escape">
    <xsl:param name="text"/>
    <func:result select='str:replace($text, "&apos;", "\&apos;")'/>
  </func:function>

  <xsl:template match="/">
var g_books = [
<xsl:apply-templates/>
];
  </xsl:template>

  <xsl:template match="book">
<xsl:if test="position() > 1">,</xsl:if> {
id: <xsl:value-of select="@id" />,
name: '<xsl:value-of select="js:escape(title)"/>',
first: '<xsl:value-of select="js:escape(author/first)"/>',
last: '<xsl:value-of select="js:escape(author/last)"/>',
publisher: '<xsl:value-of select="js:escape(publisher)"/>'
}
  </xsl:template>

</xsl:transform>

One more thing I wanted to mention is that there was actually a bug in 4XSLT's str:replace implementation. I missed that fact because I had actually tested a variation of the posted code that uses regex:replace. Just before I posted the entry I decided that the Regex module was overkill since the String module version would do the trick just fine. I just neglected to test that final version. I have since fixed the bug in 4Suite CVS, and you can now use either str:replace or regex:replace just fine. Just for completeness, the following is a version of the code using the latter function:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
    xmlns:func="http://exslt.org/functions"
    xmlns:regex="http://exslt.org/regular-expressions"
    xmlns:js="http://muttmansion.com"
    extension-element-prefixes="func">

  <xsl:output method="text"/>

  <func:function name="js:escape">
    <xsl:param name="text"/>
    <func:result select='regex:replace($text, "&apos;", "g", "\&apos;")'/>
  </func:function>

  <xsl:template match="/">
var g_books = [
<xsl:apply-templates/>
];
  </xsl:template>

  <xsl:template match="book">
<xsl:if test="position() > 1">,</xsl:if> {
id: <xsl:value-of select="@id" />,
name: '<xsl:value-of select="js:escape(title)"/>',
first: '<xsl:value-of select="js:escape(author/first)"/>',
last: '<xsl:value-of select="js:escape(author/last)"/>',
publisher: '<xsl:value-of select="js:escape(publisher)"/>'
}
  </xsl:template>

</xsl:transform>

[Uche Ogbuji]

via Copia

Creating JSON from XML using XSLT 1.0 + EXSLT

The article “Generate JSON from XML to use with Ajax”, by Jack D Herrington, is a useful guide to managing data in XML on the server side, and yet using JSON for AJAX transport for better performance, and other reasons. The main problem with the article is that it uses XSLT 2.0. Like most cases I've seen where people are using XSLT 2.0, there is no reason why XSLT 1.0 plus EXSLT doesn't do the trick just fine. One practical reason to prefer the EXSLT approach is that you get the support of many more XSLT processors than Saxon.

Anyway, it took me all of 10 minutes to cook up an EXSLT version of the code in the article. The following is listing 3, but the same technique works for all the XSLT examples.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
    xmlns:func="http://exslt.org/functions"
    xmlns:str="http://exslt.org/strings"
    xmlns:js="http://muttmansion.com"
    extension-element-prefixes="func">

  <xsl:output method="text" />

  <func:function name="js:escape">
    <xsl:param name="text"/>
    <func:result select='str:replace($text, "&apos;", "\&apos;")'/>
  </func:function>

  <xsl:template match="/">
var g_books = [
<xsl:apply-templates/>
];
  </xsl:template>

  <xsl:template match="book">
<xsl:if test="position() > 1">,</xsl:if> {
id: <xsl:value-of select="@id" />,
name: '<xsl:value-of select="js:escape(title)"/>',
first: '<xsl:value-of select="js:escape(author/first)"/>',
last: '<xsl:value-of select="js:escape(author/last)"/>',
publisher: '<xsl:value-of select="js:escape(publisher)"/>'
}
  </xsl:template>

</xsl:transform>

I also converted the code to a cleaner, push style from what's in the article.

[Uche Ogbuji]

via Copia

The Versatility of XForms

I'll be giving a presentation at the upcoming XML 2006 Conference in Boston on Tuesday December 5th at 1:30pm: The Essence of Declarative, XML-based Web Applications: XForms and XSLT.

I've been doing some hardcore XSLT/XForms development over the last 2 years or so and have come to really admire the versatility of using XSLT to generate XForms user interfaces. Using XSLT to generate XHTML from compact XML documents is a well-known design pattern for separating content from presentation. Using XSLT to generate XHTML+XForms takes this to the nth degree by separating content from both presentation and behavior (the Model View Controller design pattern).

The icing on the cake is the XPath processing capabilities native to both XSLT and XForms. It makes for easily-managed and relatively compact applications with very little redundancy.

The presentation doesn't cover this, but the XForms framework also includes transport-level components / mechanisms that are equally revolutionary in how they tie web clients into the overall web architecture context very comprehensively (Rich Web Application Backplane has good coverage of patterns to this effect). I've always thought of XForms as a complete infrastructure for web application development and AJAX as more of an interim, scripting gimmick that enables capabilities that are a small portion of what XForms has to offer.

[Uche Ogbuji]

via Copia

Parsing RDF from XSLT Prospectively

4Suite repository Document Definitions can now support both XML and text-based serialization of RDF. Document Definitions essentially facilitate database replication of XML to RDF (within a content management system that persists both). The mechanism is similar to transactional data replication in database management systems, where modifications to a table trigger the replication. Previously, they were only expected to output RDF/XML, which has well-known issues.

Now, the repository persistence driver attempts to parse the resulting RDF syntax based on the XSLT output method. This allows for a heuristic to prospectively attempt to accommodate non-XML syntax (such as Notation 3 - the only substantial RDF text-based syntax) as well as RDF/XML (and even TriX).

The main advantage of these syntax alternatives is a faster, more efficient parse time in addition to more human-readable syntax (especially for data that was meant to be expressed in this way). Keying off the xsl:output method this way is analogous to keying off HTTP Content-Type header values for remote RDF graphs (where the parsing is also a bottleneck).
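
As a rough illustration of that kind of dispatch (this is not 4Suite's actual driver code), the sketch below reads the method attribute of xsl:output from a stylesheet and picks a parser label. The function name and the labels "n3" and "rdfxml" are invented for the example, and a stylesheet with no xsl:output is simply treated as XML here, which glosses over XSLT's real defaulting rules.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def guess_rdf_syntax(stylesheet_source):
    """Return a (hypothetical) parser label based on the xsl:output method."""
    root = ET.fromstring(stylesheet_source)
    # Only a top-level xsl:output is considered in this sketch.
    output = root.find("{%s}output" % XSL_NS)
    method = output.get("method", "xml") if output is not None else "xml"
    # Text output is taken as a hint that the result is Notation 3;
    # anything else is handed to the RDF/XML parser.
    return "n3" if method == "text" else "rdfxml"
```

A real driver would presumably also need a fallback for the case where the guess is wrong, since text output is only a hint that the result is N3.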

Of course, 4Suite's aging RDF library doesn't properly persist N3 formulae (which are logic syntactic sugar specific to Notation 3) from the parser it uses.

Imagine using a Document Definition to, say, replicate SWRL's XML syntax into Notation 3's implication syntax for a logic programming database:

if x1 hasParent x2, x2 hasSibling x3, and x3 hasSex male, then x1 hasUncle x3

SWRL Rule

<ruleml:imp> 
  <ruleml:_rlab ruleml:href="#example1"/>
  <ruleml:_body> 
    <swrlx:individualPropertyAtom  swrlx:property="hasParent"> 
      <ruleml:var>x1</ruleml:var>
      <ruleml:var>x2</ruleml:var>
    </swrlx:individualPropertyAtom> 
    <swrlx:individualPropertyAtom  swrlx:property="hasBrother"> 
      <ruleml:var>x2</ruleml:var>
      <ruleml:var>x3</ruleml:var>
    </swrlx:individualPropertyAtom> 
  </ruleml:_body> 
  <ruleml:_head> 
    <swrlx:individualPropertyAtom  swrlx:property="hasUncle"> 
      <ruleml:var>x1</ruleml:var>
      <ruleml:var>x3</ruleml:var>
    </swrlx:individualPropertyAtom> 
  </ruleml:_head> 
</ruleml:imp>

Notation 3 Rule

{ ?x1 :hasParent ?x2. ?x2 :hasSibling ?x3. ?x3 :hasSex :Male } => { ?x1 :hasUncle ?x3 }.

Now imagine using GRDDL to publish a common set of rules as SWRL, with a profile to transform them to Notation 3 for scutters that understand.

[Chimezie Ogbuji]

via Copia

LazyWeb Ho! Detecting whether a browser supports XML+XSLT

I'm wrapping up applyxslt, a WSGI middleware module to serve separate XML and XSLT to browsers that can handle it (using the stylesheet PI). For browsers that can't, it would intercept the response and perform the XSLT transform for the browser, sending on the result. BTW, for more on WSGI Middleware, see “Mix and match Web components with Python WSGI”.
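
To make the shape of such a middleware concrete, here is a minimal sketch. It is not the applyxslt code: the class name is made up, and the two callables it takes (`transform`, which turns XML bytes into result bytes, and `client_handles_xslt`, which judges a User-Agent string) are hypothetical hooks standing in for the real XSLT processing and browser detection.

```python
class ApplyXsltSketch:
    """WSGI middleware sketch: transform XML server-side for browsers that can't."""

    def __init__(self, app, transform, client_handles_xslt):
        self.app = app
        self.transform = transform                      # hypothetical: bytes -> bytes
        self.client_handles_xslt = client_handles_xslt  # hypothetical: str -> bool

    def __call__(self, environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Hold on to the wrapped app's status and headers for later.
            captured["status"] = status
            captured["headers"] = list(headers)
            return lambda data: None  # legacy write() callable, unused here

        result = self.app(environ, capture)
        try:
            body = b"".join(result)  # buffer the whole response
        finally:
            if hasattr(result, "close"):
                result.close()

        headers = captured["headers"]
        content_type = dict((k.lower(), v) for k, v in headers).get("content-type", "")
        is_xml = content_type.split(";")[0].strip() in ("application/xml", "text/xml")
        user_agent = environ.get("HTTP_USER_AGENT", "")

        if is_xml and not self.client_handles_xslt(user_agent):
            # Do the transform on the browser's behalf and relabel the result.
            body = self.transform(body)
            headers = [(k, v) for k, v in headers
                       if k.lower() not in ("content-type", "content-length")]
            headers.append(("Content-Type", "text/html"))
            headers.append(("Content-Length", str(len(body))))

        start_response(captured["status"], headers)
        return [body]
```

Buffering the whole response keeps the sketch short, but it gives up streaming, and the assumption that the transform always yields text/html would not hold for every stylesheet.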

My biggest uncertainty is the best way to determine whether a browser can handle XML+XSLT. I doubt anything in the Accept header would help, so I'm left having to list all User-Agent strings for browsers that I know can handle this (basically Firefox, Opera, and recent Mozilla, Safari and MSIE).

So far I'm deriving my User-Agent list from several sources, including

Really the Wikipedia list is all I needed, but I found and worked with the other ones first.

So based on that here is the list of User-Agent string patterns I am treating as evidence the browser does understand XML+XSLT (Python/Perl regex):

.*MSIE 5\.5.*
.*MSIE 6\.0.*
.*MSIE 7\.0.*
.*Gecko/2005.*
.*Gecko/2006.*
.*Opera/9.*
.*AppleWebKit/31.*
.*AppleWebKit/4.*

Note: this hoovers up a few browser versions I'm not entirely sure of: Minimo, AOL Explorer and OmniWeb. I'm fine with some such uncertainty, but if anyone has any suggestions for further refinement of this list, let me know. I'd like to keep it updated.
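
For what it's worth, here is one way the check could look in Python. This is only a sketch: the names are placeholders, the literal dots in the MSIE versions are escaped, and since `re.search` scans the whole string the leading and trailing `.*` are dropped.

```python
import re

# Patterns from the list above, joined into a single alternation.
XSLT_CAPABLE_PATTERNS = [
    r"MSIE 5\.5",
    r"MSIE 6\.0",
    r"MSIE 7\.0",
    r"Gecko/2005",
    r"Gecko/2006",
    r"Opera/9",
    r"AppleWebKit/31",
    r"AppleWebKit/4",
]
_XSLT_CAPABLE = re.compile("|".join(XSLT_CAPABLE_PATTERNS))


def handles_xml_xslt(user_agent):
    """True if the User-Agent string matches one of the known-capable patterns."""
    return _XSLT_CAPABLE.search(user_agent or "") is not None
```

A missing or empty User-Agent header is treated as "can't handle it", which errs on the side of doing the transform on the server.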

[Uche Ogbuji]

via Copia

What does GRDDL have to do with Intelligent Agents?

GRDDL. What is it? Why the long name? It does something very specific that requires a long name to describe it. Etymology of biological names includes examples of the same phenomenon in a different discipline. I started writing on this weblog mainly as a way to regularly exercise my literary expression, so (to that end) I'm going to try to explain GRDDL in as few words as I can while simultaneously embellishing.

It is a language (or dialect) translator. It Gleans (gathers or harvests) Resource Descriptions. Resource Descriptions can be thought to refer to the use of constructs in Knowledge Representation. These constructs are often used to make assertions about things in sentence form - from which additional knowledge can be inferred. However, it is also the 'Resource Description' in RDF (no coincidence there). RDF is the target dialect. GRDDL acts as an intelligent agent (more on this later) that performs translations from specific (XML) vocabularies, or Dialects of Languages, to abstract RDF syntax.

Various languages can be used but there is a natural emphasis on a language (XSLT) with a native ability to process XML.

GRDDL is an XML & RDF formalism in what I think is a hidden pearl of web architecture: a well-engineered environment for distributed processing by intelligent agents. It's primarily the well-engineered nature of web architecture that lends the necessary autonomy that intelligent agents require. Though hidden, there is much relevance with contemporaries, predecessors, and distant cousins:

It earns its keep mostly with small, well-designed XML formats. As a host language for XSLT it sets out to be (perhaps) a bridge across the great blue and red divide of XML & RDF. To quote a common parlance: watch this space.

[Chimezie Ogbuji]

via Copia

“XML in Firefox 1.5, Part 3: JavaScript meets XML in Firefox”

Subtitle: Learn how to manipulate XML in the Firefox browser using JavaScript features
Synopsis: In this third article of the XML in Firefox 1.5 series, you learn to manipulate XML with the JavaScript implementation in Mozilla Firefox. In the first two articles, XML in Firefox 1.5, Part 1: Overview of XML features and XML in Firefox 1.5, Part 2: Basic XML processing, you learned about the different XML-related facilities in Mozilla Firefox, and the basics of XML parsing, Cascading Style Sheets (CSS), and XSLT stylesheet invocation.

Continuing with the series, this article provides examples for loading an XML file into Firefox using script, for applying XSLT to XML, and for loading XML with references to scripts. In particular the latter trick is used to display XML files with rendered hyperlinks, which is unfortunately still a bit of a tricky corner of the XML/Web story. I elaborate more on this trick in my tutorial “Use Cascading Stylesheets to display XML, Part 2”.

See also:

[Uche Ogbuji]

via Copia