Creating JSON from XML using XSLT 1.0 + EXSLT

The article “Generate JSON from XML to use with Ajax”, by Jack D Herrington, is a useful guide to managing data in XML on the server side while still using JSON as the AJAX transport, for better performance and other reasons. The main problem with the article is that it uses XSLT 2.0. As in most cases I've seen where people use XSLT 2.0, there is no reason why XSLT 1.0 plus EXSLT wouldn't do the trick just fine. One practical reason to prefer the EXSLT approach is that you get the support of many more XSLT processors than just Saxon.

Anyway, it took me all of 10 minutes to cook up an EXSLT version of the code in the article. The following is my version of listing 3, but the same technique works for all the XSLT examples.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
    xmlns:func="http://exslt.org/functions"
    xmlns:str="http://exslt.org/strings"
    xmlns:js="http://muttmansion.com"
    extension-element-prefixes="func">

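  <xsl:output method="text" />

  <!-- Drop whitespace-only text nodes so position() counts only book elements -->
  <xsl:strip-space elements="*"/>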

  <func:function name="js:escape">
    <xsl:param name="text"/>
    <func:result select='str:replace($text, "&apos;", "\&apos;")'/>
  </func:function>

  <xsl:template match="/">
var g_books = [
<xsl:apply-templates/>
];
  </xsl:template>

  <xsl:template match="book">
<xsl:if test="position() > 1">,</xsl:if> {
id: <xsl:value-of select="@id" />,
name: '<xsl:value-of select="js:escape(title)"/>',
first: '<xsl:value-of select="js:escape(author/first)"/>',
last: '<xsl:value-of select="js:escape(author/last)"/>',
publisher: '<xsl:value-of select="js:escape(publisher)"/>'
}
  </xsl:template>

</xsl:transform>

I also converted the code to a cleaner, push style from what's in the article.
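
To check what a stylesheet like this should emit, it can help to have an independent reference. The following is a minimal Python 3 sketch, not taken from the article, that builds the same g_books text with the standard library. The sample data, the books root element, and the function names are made up for illustration; only the element and attribute names (book/@id, title, author/first, author/last, publisher) come from the listing.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input: the root element name and all data are invented.
SAMPLE = """<books>
  <book id="1">
    <title>O'Brien's Guide</title>
    <author><first>Ann</first><last>O'Brien</last></author>
    <publisher>Example Press</publisher>
  </book>
  <book id="2">
    <title>Plain Title</title>
    <author><first>Bob</first><last>Smith</last></author>
    <publisher>Sample House</publisher>
  </book>
</books>"""


def js_escape(text):
    """Mirror js:escape: put a backslash before each apostrophe."""
    return text.replace("'", "\\'")


def book_to_js(book):
    """Render one book element as a JavaScript object literal."""
    fields = [
        ("name", book.findtext("title", "")),
        ("first", book.findtext("author/first", "")),
        ("last", book.findtext("author/last", "")),
        ("publisher", book.findtext("publisher", "")),
    ]
    parts = ["id: %s" % book.get("id")]
    parts += ["%s: '%s'" % (key, js_escape(value)) for key, value in fields]
    return "{ " + ", ".join(parts) + " }"


def books_to_js(xml_text):
    """Build the 'var g_books = [...]' text from an XML string."""
    root = ET.fromstring(xml_text)
    body = ",\n".join(book_to_js(book) for book in root.iter("book"))
    return "var g_books = [\n%s\n];" % body


if __name__ == "__main__":
    print(books_to_js(SAMPLE))
```

Like the stylesheet, this sketch escapes only apostrophes and writes a JavaScript literal with unquoted keys, so it is a comparison aid rather than a general-purpose JSON serializer.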

[Uche Ogbuji]

via Copia

"Thinking XML: Review of RFC 3470: Guidelines for the use of XML"

Thinking XML author Uche Ogbuji continues with the theme of XML best practices. In the previous installment "Good advice for creating XML," you looked at XML design recommendations from experts. In this article, you'll find recommendations from the Internet Engineering Task Force (IETF), an organization whose technical papers drive most Internet protocols. The IETF's XML recommendations are gathered together in RFC 3470: "Guidelines for the Use of Extensible Markup Language (XML) within IETF Protocols."

[Uche Ogbuji]

via Copia

"Tip: Remove sensitive content from your XML samples with XSLT"

Do you need to share samples of your XML code, but can't disclose the data? For example, you might need to post a sample of your XML code with a question to get some advice with a problem. In this tip, Uche Ogbuji shows how to use XSLT to remove sensitive content and retain the basic XML structure.

I limited this article to erasing rather than obfuscating sensitive content, since erasing can be done with XSLT 1.0 alone. With EXSLT (or even XSLT 2.0) you can do some degree of obfuscation, which might let you preserve aspects of the character data that are important to the problem under discussion. Honestly, though, I prefer to solve this problem with even more flexible tools. As a bonus, the following is a bit of 4Suite/SAX code that uses a SAX filter to obfuscate character data by adding a random shift to the ordinal of each character in the Unicode alphanumeric class. This way, if exotic characters were part of the problem you're demonstrating, they'd be left alone. It's easy to use the code as a template, and usually all you have to change is the obfuscate function or the obfuscate_filter class in order to fine-tune the workings.

import re
import random
from xml.sax import make_parser, saxutils
from Ft.Xml import CreateInputSource, Sax

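RANDOM_AMP = 15  # maximum shift, in code points, applied in either direction
# Raw string so the backslash reaches the regex engine untouched;
# re.UNICODE makes \w match word characters in any script
ALPHANUM_PAT = re.compile(r'\w', re.UNICODE)

def obfuscate(old):
    # Replace each matched character with one a random distance away
    def mutate(c):
        return unichr(ord(c.group())+random.randint(-RANDOM_AMP,RANDOM_AMP))
    return ALPHANUM_PAT.sub(mutate, old)

class obfuscate_filter(saxutils.XMLFilterBase):
    # Only character data is altered; all other events pass through unchanged
    def characters(self, content):
        saxutils.XMLFilterBase.characters(self, obfuscate(content))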

if __name__ == "__main__":
    XML = "http://cvs.4suite.org/viewcvs/*checkout*/Amara/demo/labels1.xml"
    parser = make_parser(['Ft.Xml.Sax'])
    filtered_parser = obfuscate_filter(parser)
    handler = Sax.SaxPrinter()
    filtered_parser.setContentHandler(handler)
    filtered_parser.parse(CreateInputSource(XML))

This code uses recent fixes and capabilities I checked into 4Suite CVS last week. I think all the needed details to understand the code are in the SAX section of the updated 4Suite docs, which John Clark has already posted.
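
The listing above is Python 2 code (note unichr) and depends on 4Suite. For readers without that setup, the same filter idea can be sketched with only the Python 3 standard library. In this sketch the stock SAX parser and saxutils.XMLGenerator stand in for the 4Suite parser and Sax.SaxPrinter; the names ObfuscateFilter and obfuscate_xml and the sample document are assumptions for illustration, not part of 4Suite.

```python
import io
import random
import re
from xml.sax import make_parser, saxutils

RANDOM_AMP = 15  # maximum shift, in code points, in either direction
ALPHANUM_PAT = re.compile(r"\w")  # str patterns are Unicode-aware in Python 3


def obfuscate(old):
    """Shift each word character by a random number of code points."""
    def mutate(match):
        shift = random.randint(-RANDOM_AMP, RANDOM_AMP)
        return chr(ord(match.group()) + shift)
    return ALPHANUM_PAT.sub(mutate, old)


class ObfuscateFilter(saxutils.XMLFilterBase):
    """SAX filter that scrambles character data and passes the rest through."""

    def characters(self, content):
        super().characters(obfuscate(content))


def obfuscate_xml(xml_text):
    """Return xml_text re-serialized with its character data obfuscated."""
    out = io.StringIO()
    filtered_parser = ObfuscateFilter(make_parser())
    filtered_parser.setContentHandler(saxutils.XMLGenerator(out, encoding="utf-8"))
    filtered_parser.parse(io.BytesIO(xml_text.encode("utf-8")))
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical sample document, made up for this sketch
    sample = "<labels><label id='a1'><name>Jane Doe</name></label></labels>"
    print(obfuscate_xml(sample))
```

Because only characters() is overridden, element names and attribute values come through untouched, and punctuation and whitespace inside the text are preserved.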

[Uche Ogbuji]

via Copia

Versa: Pattern Matching (article)

My Versa article (Versa: Path-Based RDF Query Language) is up, but I've recently been tinkering with Emeka and haven't been able to post about it. I wanted Emeka functional so people could familiarize themselves with the language by example instead of by deciphering the specification. Simply saying ".help" in a channel where he is located (#swig, #swhack, #4suite, #foaf) should be sufficient. If his commands interfere with an existing bot's, please let me know.

The article is based (in part) on an earlier paper I wrote on Versa. I reworked it to focus more on the usage patterns it has in common with other existing query languages (SPARQL primarily), to make the point that RDF querying is truly not in its infancy any more. I also wanted to use it as a springboard to suggest some possible enhancements to an already (IMHO) expressive syntax (mostly borrowed from N3).

My hope is to spark some conversation across the opposing ends as well as get people familiar with the language for the betterment of RDF and RDF querying.

See an exchange between Dave Beckett and myself on the #swig scratchpad.

[Uche Ogbuji]

via Copia