Copia

LazyWeb Ho! Detecting whether a browser supports XML+XSLT

I'm wrapping up applyxslt, a WSGI middleware module to serve separate XML and XSLT to browsers that can handle it (using the stylesheet PI). For browsers that can't, it would intercept the response and perform the XSLT transform for the browser, sending on the result. BTW, for more on WSGI Middleware, see “Mix and match Web components with Python WSGI”.
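The following is only a rough sketch of the shape such middleware might take, not the actual applyxslt code. The class name `ApplyXsltSketch`, the simplified `XSLT_CAPABLE` pattern, the `transform` callable (bytes in, bytes out) and the assumption that the transform produces `text/html` are all hypothetical, introduced for illustration.

```python
import re

# Hypothetical, simplified stand-in for a real User-Agent check.
XSLT_CAPABLE = re.compile(r'Gecko/200[56]|Opera/9')


class ApplyXsltSketch:
    """Pass XML through to capable browsers; transform it for the rest."""

    def __init__(self, app, transform):
        self.app = app
        # transform is an assumed callable taking XML bytes and returning
        # the transformed bytes (for example, a wrapper around an XSLT engine).
        self.transform = transform

    def __call__(self, environ, start_response):
        if XSLT_CAPABLE.search(environ.get('HTTP_USER_AGENT', '')):
            # The browser can apply the stylesheet PI itself.
            return self.app(environ, start_response)

        captured = {}
        written = []

        def capture(status, headers, exc_info=None):
            # Hold back the status and headers until the body is known.
            captured['status'] = status
            captured['headers'] = list(headers)
            return written.append  # legacy write() callable

        result = self.app(environ, capture)
        try:
            chunks = list(result)
        finally:
            if hasattr(result, 'close'):
                result.close()
        body = b''.join(written + chunks)

        content_type = ''
        headers = []
        for name, value in captured['headers']:
            if name.lower() == 'content-type':
                content_type = value
            elif name.lower() != 'content-length':
                headers.append((name, value))

        if 'xml' in content_type:
            body = self.transform(body)
            content_type = 'text/html'  # assumption about the transform output

        headers.append(('Content-Type', content_type))
        headers.append(('Content-Length', str(len(body))))
        start_response(captured['status'], headers)
        return [body]


def demo_app(environ, start_response):
    """Tiny WSGI application that always serves an XML document."""
    start_response('200 OK', [('Content-Type', 'application/xml')])
    return [b'<doc/>']
```

Wrapping is then a matter of `ApplyXsltSketch(demo_app, my_transform)`, where `my_transform` is whatever XSLT-applying function is available.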

My biggest uncertainty is the best way to determine whether a browser can handle XML+XSLT. I doubt anything in the Accept header would help, so I'm left having to list all User-Agent strings for browsers that I know can handle this (basically Firefox, Opera, and recent Mozilla, Safari and MSIE).

So far I'm deriving my User-Agent list from several sources, including

Wikipedia (the daddy of all User-Agent lists I've seen)
"Masquerading Your Browser", by Eric Giguere (the "Common User-Agent Values" section) and
"Understanding user-agent strings"

Really the Wikipedia list is all I needed, but I found and worked with the other ones first.

So based on that here is the list of User-Agent string patterns I am treating as evidence the browser does understand XML+XSLT (Python/Perl regex):

.*MSIE 5.5.*
.*MSIE 6.0.*
.*MSIE 7.0.*
.*Gecko/2005.*
.*Gecko/2006.*
.*Opera/9.*
.*AppleWebKit/31.*
.*AppleWebKit/4.*
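As a sketch of how this list might be applied in Python: the function name `supports_xml_xslt` is made up, and two details are my own tightening rather than part of the list above. The dots in version numbers are escaped so they match only a literal dot, and `search` is used so the leading and trailing `.*` can be dropped.

```python
import re

# The patterns from the list, with literal dots escaped (an assumption
# about the intent; the original list leaves them unescaped).
PATTERNS = [
    r'MSIE 5\.5',
    r'MSIE 6\.0',
    r'MSIE 7\.0',
    r'Gecko/2005',
    r'Gecko/2006',
    r'Opera/9',
    r'AppleWebKit/31',
    r'AppleWebKit/4',
]

XSLT_CAPABLE = re.compile('|'.join(PATTERNS))


def supports_xml_xslt(user_agent):
    """Return True if the User-Agent string matches any listed pattern."""
    return XSLT_CAPABLE.search(user_agent or '') is not None


firefox = ('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.4) '
           'Gecko/20060508 Firefox/1.5.0.4')
print(supports_xml_xslt(firefox))
print(supports_xml_xslt('Lynx/2.8.5rel.1 libwww-FM/2.14'))
```

A missing or empty User-Agent is treated as "cannot handle XML+XSLT", which errs on the side of doing the transform on the server.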

Note: this hoovers up a few browser versions I'm not entirely sure of: Minimo, AOL Explorer and OmniWeb. I'm fine with some such uncertainty, but if anyone has any suggestions for further refinement of this list, let me know. I'd like to keep it updated.

[Uche Ogbuji]

via Copia

What does GRDDL have to do with Intelligent Agents?

GRDDL. What is it? Why the long name? It does something very specific that requires a long name to describe it. Etymology of biological names includes examples of the same phenomenon in a different discipline. I started writing on this weblog mainly as a way to regularly exercise my literary expression, so (to that end) I'm going to try to explain GRDDL in as few words as I can while simultaneously embellishing.

It is a language (or dialect) translator. It Gleans (gathers or harvests) Resource Descriptions. Resource Descriptions can be thought to refer to the use of constructs in Knowledge Representation. These constructs are often used to make assertions about things in sentence form - from which additional knowledge can be inferred. However, it is also the 'Resource Description' in RDF (no coincidence there). RDF is the target dialect. GRDDL acts as an intelligent agent (more on this later) that performs translations from specific (XML) vocabularies, or Dialects of Languages to abstract RDF syntax.

Various languages can be used but there is a natural emphasis on a language (XSLT) with a native ability to process XML.

GRDDL is an XML & RDF formalism in what I think is a hidden pearl of web architecture: a well-engineered environment for distributed processing by intelligent agents. It's primarily the well-engineered nature of web architecture that lends the necessary autonomy that intelligent agents require. Though hidden, there is much relevance with contemporaries, predecessors, and distant cousins:

It earns its keep mostly with small, well-designed XML formats. As a host language for XSLT it sets out to be (perhaps) a bridge across the great blue and red divide of XML & RDF. To quote a common parlance: watch this space.

[Chimezie Ogbuji]


Discussion group for Atom protocol implementations in Python

I've had discussions about implementing the Atom protocol in Python with many colleagues, and I decided to create a proper forum for discussion, and so the Google group atom-protocol-python was born.

A group dedicated to discussion among developers and users of Python libraries and tools for processing the Atom protocol, either as client or server.

Honestly, the idea of an Atom store, and of an Atom client is so broad that I expect there to be several implementations in Python. This group is to be very open, and I'd love for even folks working on competing implementations to join up, so we can at least discuss interoperability.

And don't forget there is also an Atom IRC channel where we can discuss Atom syntax and Atom protocol. And while I'm plugging stuff, I shan't forget Planet Atom.

[Uche Ogbuji]


“Mix and match Web components with Python WSGI”

Subtitle: Learn about the Python standard for building Web applications with maximum flexibility
Synopsis: Learn to create and reuse components in your Web server using Python. The Python community created the Web Server Gateway Interface (WSGI), a standard for creating Python Web components that work across servers and frameworks. It provides a way to develop Web applications that take advantage of the many strengths of different Web tools. This article introduces WSGI and shows how to develop components that contribute to well-designed Web applications.

Despite the ripples in the Python community over Guido's endorsement of Django (more on this in a later posting), I'm not the least bit interested in any one Python Web framework any more. WSGI has set me free. WSGI is brilliant. It's certainly flawed, largely because of legacy requirements, but the fact that it's so good despite those flaws is amazing.

I wrote this article because I think too many introductions to WSGI, and especially middleware, are either too simple or too complicated. In line with my usual article writing philosophy of what could I have read when I started out to make me understand this topic more clearly, I've tried to provide sharp illustration of the WSGI model, and a few clear and practical examples. The articles I read that were too simple glossed over nuances that I think should really be grasped from the beginning (and are not that intimidating). In the too-complicated corner is primarily PEP 333 itself, which is fairly well written, but too rigorous for an intro. In addition, I think the example of WSGI middleware in the PEP is very poor. I'm quite proud of the example I crafted for this article, and I hope it helps encourage more people to create middleware.

I do want to put in a good word for Ian Bicking and Paste. He has put in tireless effort to evangelize WSGI (it was his patient discussion that won me over to WSGI). In his Paste toolkit, he's turned WSGI's theoretical strengths into readily-available code. On the first project I undertook using a Paste-based framework (Pylons), I was amazed at my productivity, even considering that I'm used to productive programming in Python. The experience certainly left me wondering why, BDFL or no BDFL, I would choose a huge mega-framework over a loosely-coupled system of rich components.

[Uche Ogbuji]


“Dynamic SVG features for browsers”

Subtitle: Build on SVG basics to create attractive, dynamic effects in your Web projects
Synopsis: Learn how to use dynamic features of Scalable Vector Graphics (SVG) to provide useful and attractive effects in your Web applications. SVG 1.1, an XML language for describing two-dimensional vector graphics, provides a practical and flexible graphics format in XML. Many SVG features provide for dynamic effects, including features for integration into a Web browser. Build on basic SVG techniques introduced in a previous tutorial.
Lead-in: SVG is a technology positioned for many uses in the Web space. You can use it to present simple graphics (as with JPEG) or complex applications (as with Macromedia Flash). An earlier tutorial from June 2006 introduced the basic features of the format. This tutorial continues to focus on SVG for Web development, as it demonstrates dynamic effects that open up new means of enhancing Web pages. The lessons are built around examples that you can view and experiment with in your favorite browser.
Developed by the W3C, SVG has the remarkable ambition of providing a practical and flexible graphics format in XML, despite the notorious verbosity of XML. It can be developed, processed, and deployed in many different environments -- from mobile systems such as phones and PDAs to print environments. SVG's feature set includes nested transformations, clipping paths, alpha masks, raster filter effects, template objects, and, of course, extensibility. SVG also supports animation, zooming and panning views, a wide variety of graphic primitives, grouping, scripting, hyperlinks, structured metadata, CSS, a specialized DOM superset, and easy embedding in other XML documents. Many of these features allow for dynamic effects in images. Overall, SVG is one of the most widely and warmly embraced XML applications.
Dynamic SVG is a hot topic, and several tutorials and articles are available that include fairly complicated examples of dynamic SVG techniques. This tutorial is different because it focuses on a breadth of very simple examples. You will be able to put together the many techniques you learn in this tutorial to create effects of whatever sophistication you like, but each example in this tutorial is simple, clear, and reasonably self-contained. The tutorial rarely deals with any SVG objects more complex than the circle shape, and it keeps embellishments in scripting and XML to a minimum. The combination of simple, step-by-step development and a focus on real-world browser environment makes this tutorial unique.

Pay attention to that last paragraph. There are many SVG script/animation tutorials out there, including several by IBM developerWorks, but I found most of them don't really suit my learning style, and I set out to write a tutorial that would have been ideal for me when I was first learning dynamic SVG techniques. The tutorial covers CSS animation, other scripting techniques and SMIL declarative animations. It builds on the earlier tutorial “Create vector graphics in the browser with SVG”.

Is USPTO abandoning XML in its electronic filing system?

I wrote an article a while back, "Thinking XML: Patent filings meet XML" in which I covered, among other things, the various patent agencies' efforts to support electronic filing. Many of these efforts are XML-based. Except now perhaps the USPTO's (EFS-Web) isn't. There were a lot of gnarly aspects of the EFS-Web process, and I had heard from some users who ended up abandoning the system. It looks as if the USPTO is trying to address these problems by chucking the whole approach and just having people upload PDFs (via XML.org Daily Newslink--yeah. I'm way behind). I wonder whether they also considered supporting ODF, at least as an alternative to PDF. It seems to me that what they needed was broader, not narrower format and tool support.

[Uche Ogbuji]


XML Universal names (namespaces): To fuse or not to fuse

I ran into Ken MacLeod on the Atom IRC channel today. Actually I think I've chatted with him before but I didn't know the nick I was responding to was Ken. Certainly a fortuitous discovery, but more importantly Ken drew my attention to an old Weblog posting I'd somehow missed. In it he makes two separate points that I think overrun in perhaps a confusing way. Firstly he advocates that XML APIs strictly treat a node's universal (i.e. namespace-qualified) name as a tightly bound unit. I (arbitrarily) call this fusing the universal name. An example is APIs such as ElementTree that use James Clark notation for namespaces.
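For readers who haven't met James Clark notation, here is a small illustration using the ElementTree module from the Python standard library. The sample document is made up; the Atom namespace name is the real one.

```python
import xml.etree.ElementTree as ET

ATOM = 'http://www.w3.org/2005/Atom'
doc = ET.fromstring(
    '<a:feed xmlns:a="%s"><a:title>Copia</a:title></a:feed>' % ATOM)

# The namespace name and local name arrive fused into a single string.
print(doc.tag)  # {http://www.w3.org/2005/Atom}feed

# Lookups use the same fused form; the prefix "a" plays no part.
title = doc.find('{%s}title' % ATOM)
print(title.text)  # Copia
```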

The second point is that some XML APIs have really ungainly syntax for handling namespaces, with SAX and DOM being the worst offenders (we both agree that the Java/IDL heritage of these APIs is the worst problem).

To take the first point first, I disagree that APIs based on fused names are superior. Yes an XML universal name should conceptually be a unit, but in practice it is not, and people often have a real need to work severally with either piece of the tuple. I used the analogy of complex numbers in the IRC discussion. The complex number 3 + 2j is a single number, but there is nothing wrong with an API's making it easy for a user to manipulate its real and imaginary parts. It's up to the developer not to somehow abuse this flexibility.
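To make the analogy concrete, a quick sketch: Python's complex numbers expose their parts without ceasing to be single values, and a fused name can be treated the same way. The helper `split_universal_name` is hypothetical, not part of any library discussed here.

```python
z = 3 + 2j
print(z.real, z.imag)  # 3.0 2.0


def split_universal_name(clark_name):
    """Split '{namespace}local' into (namespace, local).

    A name with no namespace part comes back as (None, name).
    """
    if clark_name.startswith('{'):
        namespace, _, local = clark_name[1:].partition('}')
        return namespace, local
    return None, clark_name


print(split_universal_name('{http://www.w3.org/2005/Atom}feed'))
print(split_universal_name('feed'))
```

The name is still one value; the API just makes it cheap to look at either half when the developer has a real need to.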

The second point is well taken, but I strongly believe that it is not the lack of fused names that makes an API poor. SAX (SAX 2, to be precise) is poor because of the redundancy and odd structural conventions in reporting information such as prefixes. DOM (DOM Level 2, to be precise) is poor because of the bewildering decision to maintain namespace declarations as separate attribute objects in addition to the redundant information offered as node object properties. Both owe much of their weakness to concerns for backwards compatibility with non-namespace-aware APIs.
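The DOM redundancy is easy to see with minidom from the Python standard library, used here simply as a convenient DOM Level 2 style implementation. The sample document is made up.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<a:feed xmlns:a="http://www.w3.org/2005/Atom"/>')
root = doc.documentElement

# Namespace information offered as node object properties...
print(root.namespaceURI, root.prefix, root.localName)

# ...and the same declaration kept as a separate attribute object,
# itself in the reserved xmlns namespace.
decl = root.getAttributeNode('xmlns:a')
print(decl.name, decl.value, decl.namespaceURI)
```

Code that rewrites prefixes or moves nodes around has to keep both views consistent by hand.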

For the most part APIs that were born and bred in the era of namespaces are much less tortured, regardless of how tightly or loosely they expose the parts of a universal name. If anything, I believe that it's better for the API to make it easy for the developer to separate namespace name, local name and prefix. Yes, even prefix, because the golden world in which prefixes are irrelevant does not exist. It was ruined by the advent of QNames in content, or "hidden namespaces". Yes, this happens to be one negative side-effect of the success of XSLT and XPath, which were great specs for the most part, but also represented the first real triumph of the hidden namespaces idea that has left such a mess in its wake.

Ken ended his Weblog post with a proposed notation for fused names, which is just like one I had mulled and discarded for Amara. He also showed me a derived convention for his Orchard software, which I didn't know was still in development (it looks very interesting). This convention looks just like the mapping accessors I somewhat grudgingly added to Amara in February. I'm still open to changing this until Amara 1.2, so I'll give the whole matter some thought (including the Orchard approach). Feedback is welcome. The fact that Ken and I independently came up with such a series of similar ideas makes me think we're on the right track.

I do want to make sure it's clear that giving the user a better API for namespaces is not bound to insistence on fused names.

[Uche Ogbuji]


Some thoughts on QNames in content (including proposal for a better, ahem, name)

I'll cut your ass in half and leave you with a semi-colon

—Mr. Man

QNames in content have been on my brain today. See a follow up posting for more on why.

First of all I think we should find a new name for this phenomenon, because I don't think QNames qua QNames are key to the problem. For one thing, you have the problem even if you only use a prefix in content, as XSLT does in, say the extension-element-prefixes, or XPath in html:*/html:span (the second step is a QName, but not the first). I think a better name for this problem is "hidden namespaces" because that's exactly the problem: the document depends on a construct that is hiding a namespace in a separate layer where generic processing cannot see it.
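A small sketch of the hiding, using ElementTree from the Python standard library. The document and its `select` attribute are invented for illustration, and `IN_SCOPE` is a prefix map the application has to supply for itself.

```python
import xml.etree.ElementTree as ET

XHTML = 'http://www.w3.org/1999/xhtml'
root = ET.fromstring(
    '<r xmlns:h="%s" select="h:span"><h:span/></r>' % XHTML)

# Element names are resolved by the parser, so the prefix is gone...
print(root[0].tag)  # {http://www.w3.org/1999/xhtml}span

# ...but the prefix inside the attribute value is just characters.
# Generic processing has no idea a namespace is hiding in there.
print(root.get('select'))  # h:span

# Making sense of it needs the in-scope bindings, which the tree no
# longer carries, so they must be passed in separately.
IN_SCOPE = {'h': XHTML}
found = root.find(root.get('select'), namespaces=IN_SCOPE)
print(found is root[0])  # True
```

If any tool along the way rewrote the `h` prefix on the elements, the attribute value would silently stop resolving.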

Whatever the name, I re-read today a couple of important documents regarding the issue. First of all there is the TAG finding on QNames, which is unfortunately not much more than an agglomeration of existing wisdom. Norm Walsh, the editor of that document, wrote of a more radical direction as part of his "XML 2.0" article. I like his ideas (though I'm partial to Jeffrey Yasskin's ampersand variation), and I hope conversation soon drives towards something along those lines. XML is almost ten years old, and I see nothing wrong with a bit of a shake-up.

Until then, I think we can deploy two safeguards to protect ourselves from the subtle problems of namespaces. I call them: "sanity within the document, and registries without". The two components are very different in character.

Firstly, we need to discard the idea of in-document scoping of namespaces. It seemed a great idea at the time, even to me, but in practice it's a mess, and Joe English was the first one to illuminate the mess in the light of a brilliant metaphor (Google cache of original since XML-DEV is down now). (See my article "Principles of XML design: Use XML namespaces with care" for more on this). If we can rely on sanity in XML documents we can at least simplify state processing a good deal. Ideally all the XML sources in an XML processing pipeline would emit sane XML.
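One way a tool might test for this, as a rough sketch. The working definition is my assumption, not a quotation of Joe English's: a document counts as sane here if no prefix is ever bound to more than one namespace name and no namespace name goes by more than one prefix. The function names are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def namespace_bindings(xml_text):
    """Collect every distinct (prefix, namespace name) declaration."""
    source = io.BytesIO(xml_text.encode('utf-8'))
    events = ET.iterparse(source, events=('start-ns',))
    return {binding for _event, binding in events}


def is_sane(xml_text):
    """Assumed reading of sanity: prefixes and namespace names pair 1:1."""
    bindings = namespace_bindings(xml_text)
    prefixes = [prefix for prefix, _name in bindings]
    names = [name for _prefix, name in bindings]
    return (len(set(prefixes)) == len(prefixes)
            and len(set(names)) == len(names))


# The same prefix rebound further down the tree: not sane.
print(is_sane('<a:x xmlns:a="urn:one"><a:y xmlns:a="urn:two"/></a:x>'))
# One prefix per namespace throughout: sane.
print(is_sane('<a:x xmlns:a="urn:one"><b:y xmlns:b="urn:two"/></a:x>'))
```

With that guarantee a processor can keep one flat prefix table for the whole document instead of a stack of scopes.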

Secondly, I think the time has come for namespace registries. It would definitely be nice to build on the unfortunately stalled RDDL, but whereas the goal of RDDL is to provide human readable information, what I think we really need in a namespace registry is a little nugget of machine-readable data. Drumroll please...

A list of preferred prefixes for a namespace (supporting lookup of namespace name to well-known prefix, and vice versa). I know this will be controversial. Prefixes are supposed to be insignificant. Users should have flexibility to use whatever prefix, blah blah blah. I'm sorry, but that's all theoretically nice, but we have practical problems to solve. The fact that the most powerful constructs in XPath depend for their semantics on the whimsy of prefix choices should bother you a bit. The fact that Canonical XML had to abandon the idea of normalizing prefixes should bother you even more. It's time to just say that xsl means "The XSLT namespace" (yeah, yeah: "what version?" etc.; hard problems would still remain) and that if you choose to use it for a different namespace, you're technically compliant to namespaces, but you're asking for a heaping helping of trouble, buddy.
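The machine-readable nugget could be as small as a two-way table. A sketch follows; the namespace names are real, but apart from xsl the prefix choices are only illustrative, the function names are made up, and a single preferred prefix per namespace is a simplification of "a list".

```python
# Namespace name -> preferred prefix (illustrative entries only).
PREFERRED_PREFIXES = {
    'http://www.w3.org/1999/XSL/Transform': 'xsl',
    'http://www.w3.org/1999/xhtml': 'html',
    'http://www.w3.org/2000/svg': 'svg',
    'http://www.w3.org/2005/Atom': 'atom',
}

# The reverse lookup, derived so the two directions cannot drift apart.
NAMESPACE_FOR_PREFIX = {
    prefix: name for name, prefix in PREFERRED_PREFIXES.items()}


def preferred_prefix(namespace_name):
    """Return the well-known prefix for a namespace name, or None."""
    return PREFERRED_PREFIXES.get(namespace_name)


def namespace_for(prefix):
    """Return the namespace name a well-known prefix stands for, or None."""
    return NAMESPACE_FOR_PREFIX.get(prefix)


print(preferred_prefix('http://www.w3.org/1999/XSL/Transform'))  # xsl
print(namespace_for('svg'))  # http://www.w3.org/2000/svg
```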

For now I'm just throwing out ideas to help organize my thoughts, and for discussion. It seems to me that if we could rely on authors and tools, supported by a registry, to produce sane documents that (wherever possible) used essentially reserved prefixes, including, of course, for hidden namespaces, we could simplify namespace-aware processing a great deal. I can think of some practical hurdles for the registry idea, but I can't think of any reason why it's not even worth a try.

[Uche Ogbuji]


“XML in Firefox 1.5, Part 3: JavaScript meets XML in Firefox”

Subtitle: Learn how to manipulate XML in the Firefox browser using JavaScript features
Synopsis: In this third article of the XML in Firefox 1.5 series, you learn to manipulate XML with the JavaScript implementation in Mozilla Firefox. In the first two articles, XML in Firefox 1.5, Part 1: Overview of XML features and XML in Firefox 1.5, Part 2: Basic XML processing, you learned about the different XML-related facilities in Mozilla Firefox, and the basics of XML parsing, Cascading Style Sheets (CSS), and XSLT stylesheet invocation.

Continuing with the series, this article provides examples for loading an XML file into Firefox using script, for applying XSLT to XML, and for loading XML with references to scripts. In particular the latter trick is used to display XML files with rendered hyperlinks, which is unfortunately still a bit of a tricky corner of the XML/Web story. I elaborate more on this trick in my tutorial “Use Cascading Stylesheets to display XML, Part 2”.

I Wish XForms was Recursively Declarative

I've become a big fan of declarative problem solving lately, which is one of the reasons I really enjoy composing web-based user interfaces with XSLT and XForms. However, I was thinking about how I would build an XForm to edit a very recursive structure, an EBNF instance as an XML document. I thought it would be nice to define a widget (an xf:group) for each of the more major components of a grammar and (in XSLT push fashion) recursively render a form for editing an instance of the grammar.

After all, XSLT was the main reason I really like the idea of Schematron for document validation. The XML infoset is a perfect match for capturing an EBNF, since it is purely syntactic and very recursive. So, it's a shame I couldn't take advantage of an XML-based user interface's processing mechanism (like XForms) to render an edit form in the same way an xsl:apply-templates with a mode would.

Imagine:

Grammar Instance

SELECT * WHERE { OPTIONAL { GRAPH ?provenance { ?person a foaf:Person } } }

Grammar Instance (as an XML Document)

<SelectQuery>
  <AllVariables/>
  <Where>
    <GroupGraphPattern>
      <GraphPattern>
        <OPTIONAL/>
        <GroupGraphPattern>
          <GraphPattern>
            <GRAPH graphName="?provenance">
              <GroupGraphPattern>
                <BasicGraphPattern>?person a foaf:Person</BasicGraphPattern>
              </GroupGraphPattern>
            </GRAPH>
          </GraphPattern>
        </GroupGraphPattern>
      </GraphPattern>
    </GroupGraphPattern>
  </Where>
</SelectQuery>

XForm snippet

<xf:group ref="SelectQuery/Where">
    <xf:group ref="GroupGraphPattern" mode="push">
      <fieldset>
        <legend>A SPARQL GroupGraphPattern</legend>
        <xf:apply-templates-equivalent mode="push"/>
      </fieldset>
    </xf:group>
</xf:group>

Which would render a radial set of fieldsets, one for each GroupGraphPattern in the recursive structure. Somewhat related: Quadtrees in Javascript and CSS
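For comparison, the push-style recursion being wished for looks roughly like this when written by hand in Python instead of XForms. It is only a sketch: the `render` function is hypothetical, and the instance is a well-formed copy of the grammar document above.

```python
import xml.etree.ElementTree as ET

INSTANCE = """
<SelectQuery>
  <AllVariables/>
  <Where>
    <GroupGraphPattern>
      <GraphPattern>
        <OPTIONAL/>
        <GroupGraphPattern>
          <GraphPattern>
            <GRAPH graphName="?provenance">
              <GroupGraphPattern>
                <BasicGraphPattern>?person a foaf:Person</BasicGraphPattern>
              </GroupGraphPattern>
            </GRAPH>
          </GraphPattern>
        </GroupGraphPattern>
      </GraphPattern>
    </GroupGraphPattern>
  </Where>
</SelectQuery>
"""


def render(element, depth=0):
    """Return output lines with one fieldset per GroupGraphPattern.

    Every other element is transparent: the walk just descends through
    it, much as apply-templates would with no matching template.
    """
    pad = '  ' * depth
    if element.tag == 'GroupGraphPattern':
        lines = [pad + '<fieldset>',
                 pad + '  <legend>A SPARQL GroupGraphPattern</legend>']
        for child in element:
            lines.extend(render(child, depth + 1))
        lines.append(pad + '</fieldset>')
        return lines
    lines = []
    for child in element:
        lines.extend(render(child, depth))
    return lines


output = render(ET.fromstring(INSTANCE))
print('\n'.join(output))
```

The three GroupGraphPattern elements come out as three fieldsets, each nested inside the previous one.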

I guess I can see how having to maintain the dependencies in this scenario would be something similar to having an XSLT processor bound to a 'live' XML instance - very expensive.

[Chimezie Ogbuji]


Copia

Ogbujis on an abundance of topics

Tag xml
