Lifting XSLT into application domain with extension functions?

The conversation continues about the boundary between traditional presentation languages such as XSLT and XML toolkits in traditional application languages such as Python. See earlier installments "Sane template-like output for Amara" and "Why allow template-like output for Amara?". M. David Peterson, the mind behind XSLT blog, responded to the latter post, and his response deserves a post of its own:

This is an excellent post that brings out some important points. I do wonder however if the problem can be solved utilizing a base of underlying functions that can then be implemented via 2 mechanisms:

  • The XSLT (potentially 2.0 given 1.0 support already exists) engine which will properly invoke the necessary sequence of functions to process and transform the input XML.
  • Amara, which will invoke a similar combination of functions, however in a way more in line with the Python architecture and programmer's mentality.

Being a novice Python programmer, at best!, makes it difficult to suggest this solution with a whole lot of confidence... as such, I won't and simply leave it as something that, if possible, might be worth consideration as it will leave the door open for the reverse situation, e.g. someone like myself who sees the value Amara and Python offer but given my background would prefer to work with XML via XSLT (2.0 preferably of course :D) except in cases where it's obvious the platform offers a much simpler and more elegant solution. In cases like this (and it sounds as if this particular situation qualifies) I am definitely more interested in the fastest way in and out of a process than I am in firing up the transformation object to perform something that is ridiculously easy using the platform API.

This is an interesting point, especially because it is something that we already tried to do in 4Suite. When we first started writing the 4Suite repository, we built XSLT scripting into its very DNA. You can do just about anything to the repository through a set of specialized XSLT extensions. You can write entire Web sites (including CRUD functions) in XSLT, and indeed, there are quite a few examples of such sites, including:

These are all 100% XSLT sites, implemented using 4Suite's repository as the HTTP server, request processing engine (via XSLT script), database back end, etc. You don't even need to know any Python to write such sites (Wendell Piez proved this when he created The Sonneteer). You just need to know XSLT and the 4Suite repository API as XSLT functions. These APIs mirror the Python APIs (and to some extent the command line and HTTP based API). Mike Olson insisted that this be a core architectural goal, and we got most of the way there. It served the originally narrow needs quite well.

We all saw the limitations of this approach early, but to some extent succumbed to the it-all-looks-like-a-nail fallacy. It was so easy for me, in particular, to churn out 4Suite repo based sites that I used the approach even where it wasn't really the best one. I'm definitely through with all that, though. I've been looking to do a better job of picking the best tool for each little task. This is why I've been so interested in CherryPy lately. I'd rather use a specialized tool for HTTP protocol handling than the just-good-enough version in 4Suite that is specialized for handing off requests to XSLT scripts. Lately, I've found myself advising people against building end-to-end sites in 4Suite. This is not to strand those who currently take advantage of this approach, but to minimize the legacy load as I try to steer the project towards a framework with better separation of concerns.

When I look at how such a framework would feel, I think of my own emerging choices in putting together such sites:

  • 4Suite repository for storage and metadata processing of XML documents
  • Amara for general purpose XML processing in Python code
  • XSLT for presentation logic
  • CherryPy for protocol serving

My own thoughts for a 4Suite 2.0 are pretty much grounded in this thinking. I want to separate the core libraries (basically XML and RDF) from the repository, so that people can use these separately. I also want to de-emphasize the protocol server of the repository, except as a low-level REST API. That having been said, I'm far from the only 4Suite developer and all I can do is throw in my own vote on the matter.

But back to the point raised, the idea of parallel functions in XSLT and another language, say Python, is reasonable, as long as you don't fall into the trap of trying to wrench XSLT too far off its moorings. I think that XSLT is a presentation language, and that it should always be a presentation language. With all the big additions in XSLT 2.0, the new version is still really just a presentation language. Based on my own experience I'd warn people against trying to strain too hard against this basic fact.
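To make the "parallel functions" idea concrete, here is a minimal sketch in Python, using only the standard library. It assumes a single shared base function with two thin front ends: one called directly from Python code, and one shaped the way an XSLT processor might call an extension function (a context argument followed by XPath arguments). The namespace URI, the `word-count` function name, and the registry are all hypothetical; they are not the 4Suite or Amara API.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, used only for this illustration.
EXT_NS = "http://example.com/xslt-ext"


def count_words(text):
    """Shared base function: the logic lives here and nowhere else."""
    return len(text.split())


def element_word_count(elem):
    """Python-facing API: accepts an ElementTree element directly."""
    # Join with a space so text from adjacent elements does not run together.
    return count_words(" ".join(elem.itertext()))


def xslt_word_count(context, nodes):
    """XSLT-facing wrapper: takes a node-set and returns an XPath number."""
    return float(sum(element_word_count(node) for node in nodes))


# Hypothetical registry keyed by (namespace URI, local name), standing in
# for whatever registration mechanism a real XSLT processor provides.
EXTENSION_FUNCTIONS = {(EXT_NS, "word-count"): xslt_word_count}


def call_extension(namespace, name, context, *args):
    """Dispatch a call the way an engine might for ext:word-count(...)."""
    return EXTENSION_FUNCTIONS[(namespace, name)](context, *args)


doc = ET.fromstring("<doc><p>one two</p><p>three</p></doc>")
python_result = element_word_count(doc)
xslt_result = call_extension(EXT_NS, "word-count", None, doc.findall("p"))
```

Both front ends give the same answer because neither contains any logic of its own. The XSLT side stays a thin adapter, which is one way to offer parallel functions without pulling XSLT off its moorings.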

And remember that if you make XSLT a full model logic language, you have not solved the problem of portability. After all, if you depend on extensions, you're pretty much by definition undermining portability. And if XSLT becomes just another competitor to Java, ECMAScript, Python, etc., then you haven't gained any portability benefit, since some people will prefer to use different languages from XSLT in their model logic, and XSLT code is obviously not portable to these other languages. Sure you have platform portability, but you already have this with these other languages as well.

[Uche Ogbuji]

via Copia
7 responses
Wow!  I didn't realize my comments would really gain much notice, much less their own post.  I appreciate this nonetheless.

One of the things I struggle the most with is at what point does a language like XSLT, which by definition and design is a Domain Specific Language, cross the line to become a more general purpose language like Python.  The irony in this is that XSLT from its very inception had its focus on becoming a Functional Programming Language like Lisp.  And of course the more I learn about Python the more I realize that it too has its roots in Lisp.  But it's obvious that Python is much better suited to handle everything an application might need, if not as finely tuned for the XML transformation side of things as something designed specifically for this task: enter XSLT.

To me there are some obvious pieces of XSLT 2.0 that are absolutely necessary and do not cross the DSL/General Purpose line (or in the case of multiple output, do not cross the file system read/write line, instead only allowing you to write with no built-in ability to read the directory -- a line that is blurred of course by extension functions that could give you this ability if you explicitly build the external function that can provide this information to your XSLT application).

Anyway, in my opinion the following is absolutely mandatory (and in many cases provided by EXSLT):

- Everything becomes a sequence, no more RTFs

- Multiple Output

- Date processing

- Simplified Grouping

- Extended string processing functions

-- Regular Expressions being the key to this

- Conditional logic embedded into XPath to eliminate or at least reduce the current verbosity of choose[when|otherwise] blocks that get WAY overused by those who do not have a full understanding of using the XML to guide your transformations via templates.

- User-defined functions

What is absolutely still needed:

- Dynamic evaluation of XPath

What could be done away with:

- I'll leave this one to interpretation :)

It seems to me it would be within the spirit of all things DSL to develop a list in which we could state "Ok, to stay within the confines of a DSL you need to avoid the following...".  But where would you even start with something like that?  You would have to make it specific to each language that claims to be a DSL, as it would be difficult at best to create a generalized list -- although definitely not impossible.

I appreciate your follow-up post Uche!  This is all very much worth quite a bit of thought, and no doubt you have spent some considerable time doing just that.  I wonder if this is a topic that might be best suited for a Wiki-based discussion in which we could go through each and every piece of the XPath 2.0 and XSLT 2.0 specifications and over the next few months try to get a good feel for just what it is we have with these two specifications, what we could do without, and then what we most definitely need.

I think a discussion on the XSLT Wiki would be fine for this, but my own contribution might be a bit limited until I get to know XSLT 2.0 better.  I can certainly weigh in on the stuff that is also applicable to XSLT 1.0 + EXSLT.
Hi guys,

Another good article Uche. It clears my mind of what you have in yours regarding the future of 4Suite.

I totally agree that each tool/language has its own benefit in every occasion. I would not use Python to do the complete chain of events of my application but I would not use XSLT either for doing everything. A bit of both where it's relevant.

I also agree that people trying to fit all aspects of a programming language in XSLT are mistaken. Although it's interesting at the theoretical level, I just feel it's a waste of time as long as you can use another language which does the job already.

I follow David's comment regarding what is compulsory in XSLT to make it more powerful without making it unusable!

That being said, as I'm not an XSLT guru nor do I know much about XSLT2, I have the feeling lots of people in the community do not agree on which direction exactly XSLT as a language should evolve.

I suppose it's quite normal and it will take time. But I also fear it will split the XSLT community into those who felt XSLT1 was sufficient and stayed simple enough, and those who will only use XSLT2.

Time will tell.

- Sylvain
I think that probably one of the most important things to transpire from such an effort (no need to rush it either... we've still got a few months before we get within sight of a recommendation) would be to discover just what it is that makes XSLT 2.0 that much different than EXSLT, so your expertise in this area will definitely be a HUGE benefit.

My thought is that we compose (or ask community members to compose some real-world examples for us) a variety of problems and tasks that all fall within the scope of the technologies in question and then create:

- An XSLT 1.0/XPath 1.0 only solution (at least as far as possible, filling in the holes with common platform solutions)

- the previous with EXSLT 1.0

- an XSLT 2.0/XPath 2.0 solution

- an Amara/Python/XSLT 1.0 + EXSLT solution

- and since it happens to be my personal strength a solution using the .NET platform with XSLT 1.0 + EXSLT.NET

In the end the worst that can happen is we find some really cool solutions using a variety of tools, and if we do it right we should discover just what it is that XSLT 2.0/XPath 2.0 is capable of that any other reasonable combination of tools is not, or possibly the reverse: we may discover that when working with Python and Amara coupled with the existing XSLT 1.0/EXSLT capabilities, the XSLT 2.0 benefits are blurred at best.  And to be honest, I wouldn't be surprised if this were the case.

That may sound strange from someone who:

a) Has the equivalent of a first grader's understanding of Python.

b) Is such a staunch advocate of XSLT 2.0

The reason for such an odd acceptance that XSLT 2.0 may have met its match in Amara and Python comes from my belief that it's very possible this is the case, and that as we move more towards a decentralized world in which each client will, in and of itself, become a miniature messaging processing center, transforming this format to that format, this language to that language, etc... it will be the client that will begin to gain the most benefit from having an XSLT 2.0/XPath 2.0 and beyond processor. That processor would more than likely be coupled with a client version of SQL Server, MySQL, etc., all of which use a data-type-less(?) implementation of XQuery to extract the data that has been synchronized via hundreds or even thousands of server nodes across the internet. The client would then sort, merge, and transform that data into something viewable/consumable by a client application of some sort, take any potential user input, and transform it into Yet Another Serialized Format of This or That or Whatever Else to be transported to Yet Another Destination, and so on and so forth.

With the strength of and as such (I am assuming as I really don't know this for sure) Amara on the server side of things then maybe the real challenge isn't determining which is better, but which is better suited for the platform it happens to call home: client or server.  Could there be a client-side Amara-lite processor with XSLT 2.0/XPath 2.0 support that then communicates with 4Suite and Amara on the server side of things, which in turn communicates with thousands of other nodes keeping them in a constant, synchronized state, acting as the key link between commerce and consumer?  The possibilities seem endless to me for a scenario similar to this...  pretty cool stuff if it works out that way :)

Ok, maybe that was a little too much information for a comment better suited for a blog post or better yet within a wiki.  In fact I know just the Wiki[1]. 

But first, to sum up the above: I don't believe that the future of XSLT is on the server so much as on the client, in which XML is the only format that it deals with anymore and which therefore simply needs a specialized tool made available to client-side applications that is fine-tuned for handling the constant state of messaging -- conscious or unconscious to our own minds -- that will be constantly taking place, handling all the details from soccer practice reminders sent out to ordering the groceries from to whatever else we happen to find to our fancy when that day comes.

Or maybe not... ;)

Cheers Uche :)  Looking forward to having some fun and gaining some knowledge all at the same time... besides, I REALLY need to learn Python (then Amara... need to take baby steps to start with :) and this should definitely help!


[NOTE: I installed a MediaWiki instance on months and months ago and forgot it was there...  That makes it nice and clean and contamination-free to play around with whatever becomes of the above...]
Hi David,

You said :


(...) as we move more towards a decentralized world in which each client will, in and of itself, become a miniature messaging processing center, transforming this format to that format, this language to that language (...)


I totally agree with you. I really think that strongly structured design methods such as those described in Merise or UML are methods of the past. Not that they are bad or useless but I think they don't fit the world as it is now. They are too static, too heavy, too far from how people work.

What people want now is to be able to dump whatever heterogeneous data they may work with and to be able to search it, retrieve it, quickly edit it, transform it, send it as a simple attachment.

What people use all the time today are a browser and an email client.

Those applications are the ones that should carry an application today, not the OS itself. That's why the X technology is so powerful. It's the only one to be comprehensive enough: XML for describing/carrying content, XPath for navigating through it, XQuery for querying it, XSLT for transforming it, RelaxNG for validating it, etc. It allows developers to finally see the client side as a side where they could write their application in a safe and standard way. Even Javascript has benefited from it!

As I stated in a previous comment, I love the X technologies above because in my mind I don't have the feeling of programming in a limited programming language, no! In my mind it's like programming a bit of Python, .NET or whatever on content which happens to be XML (which can come from an XML database, a transformation from XSLT, an XPath query result, etc.)

It really changes the way you code IMO and the way you think about software.

People so often feel XML is a burden and overrated. Fine by me, I won't try to convince them... but man I have fun with it!

- Sylvain
David, Sylvain,

Yes.  As someone said recently (can't find the exact quote), the data is the platform.  The data and how you pipeline it from data sources to data clients is more important than the technology used for this pipelining, and if you do your part rightly, you can swap technologies in and out as you see fit.
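As a rough illustration of swapping technologies in and out, here is a minimal Python sketch using only the standard library. It assumes a pipeline is nothing more than an ordered list of callables, each taking the previous stage's output. The stage names and the sample feed are made up for this example; a real setup might put Amara in the selection stage and XSLT in the rendering stage.

```python
import html
import xml.etree.ElementTree as ET

# Made-up sample data source for the illustration.
SOURCE = (
    "<feed>"
    "<entry><title>XSLT &amp; Python</title></entry>"
    "<entry><title>Amara</title></entry>"
    "</feed>"
)


def parse_source(xml_text):
    """Data source stage: turn serialized XML into a tree."""
    return ET.fromstring(xml_text)


def select_titles(root):
    """Processing stage: pull out the text of every title element."""
    return [elem.text or "" for elem in root.iter("title")]


def render_plain(titles):
    """Presentation stage, variant 1: plain text, one title per line."""
    return "\n".join(titles)


def render_html(titles):
    """Presentation stage, variant 2: an escaped HTML list."""
    items = "".join("<li>%s</li>" % html.escape(t) for t in titles)
    return "<ul>%s</ul>" % items


def run_pipeline(data, stages):
    """Feed the data through each stage in order."""
    for stage in stages:
        data = stage(data)
    return data


plain = run_pipeline(SOURCE, [parse_source, select_titles, render_plain])
markup = run_pipeline(SOURCE, [parse_source, select_titles, render_html])
```

Only the last stage differs between the two runs. The data and the shape of the pipeline stay fixed while the technology at any one step changes.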

Let's keep this line of inquiry open as XSLT heads toward 2.0, CherryPy to 2.1, Amara to 2.0 (yes sooner than you might think :-) ), and so on.