XHTML to Excel as well as OOo

In my article "Use XSLT to prepare XML for import into OpenOffice Calc" I discussed how to use XSLT to format XML for import into OpenOffice Calc.

The popular open source office suite OpenOffice.org is XML-savvy at its core. It uses XML in its file formats and offers several XML-processing plug-ins, so you might expect it to have nice tools built in for importing XML data. Unfortunately, things are not so simple, and a bit of work is required to manipulate general XML into delimited text format in order to import the data into its spreadsheet component, Calc. This article offers a quick XSLT tool for this purpose and demonstrates the Calc import of records-oriented XML. In addition to learning a practical trick for working with Calc, you might also learn a few handy XSLT techniques for using dynamic criteria to transform XML.
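
To give a rough sense of the kind of transformation involved (a minimal sketch, not the stylesheet from the article, and with invented record and field element names), a generic records-to-CSV conversion can be driven from Python with lxml:

```python
# Sketch only: flatten records-oriented XML into comma-separated text ready
# for spreadsheet import. Element names in SAMPLE are invented; a real tool
# would also need to quote or escape commas inside field values.
from lxml import etree

STYLESHEET = etree.XML(b"""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- Treat each child of the document element as a row,
       and each of its children as a cell. -->
  <xsl:template match="/*">
    <xsl:for-each select="*">
      <xsl:for-each select="*">
        <xsl:value-of select="."/>
        <xsl:if test="position() != last()">,</xsl:if>
      </xsl:for-each>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
""")

SAMPLE = etree.XML(b"""\
<expenses>
  <expense><date>2007-01-05</date><vendor>Acme</vendor><amount>42.50</amount></expense>
  <expense><date>2007-01-09</date><vendor>Initech</vendor><amount>13.00</amount></expense>
</expenses>
""")

to_csv = etree.XSLT(STYLESHEET)
print(str(to_csv(SAMPLE)))  # one comma-delimited line per record element
```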

I do wonder about formulae, though. Via Parand Tony Darugar I found "Excel Reports with Apache Cocoon and POI", which (in the section starting with the header "Rendering Machinery") shows that you can just as easily use such a technique for MS Excel. Good to know. I've recently had reason to figure out a system for aggregating reports to and from spreadsheets and XML, and I may well have to deal with simple formulae. I guess I'll cross that bridge if and when I get to it, and the full OpenDocument saved-file format is always an option, but if anyone happens to have any quick pointers, I'd be grateful.

[Uche Ogbuji]

via Copia

“Thinking XML: The XML decade”

I've been very pleased at the response to my article “Thinking XML: The XML decade”. I like to write two types of articles: straightforward expositions of running code, and pieces providing more general perspective on some aspect of technology. I've felt some pressure in the publishing world lately to dispense with some of the latter in favor of puff pieces on the hottest technology fad; right now, having "SOA", "AJAX" or "podcast" in the title is the formula for selling a book or article. I've been lucky throughout my career to build relationships with editors who trust my judgment, even if my writing is so often out of the mainstream. So whenever an article with no obvious mass appeal touches a chord and provokes intelligent response, I welcome it as more ammunition for following the road less trampled in my future writing.

Karl Dubost of the W3C especially appreciated my effort to bring perspective to the inevitable tensions that emerged as XML took hold in various communities.

Uche Ogbuji has written an excellent article The XML Decade. The author is going through XML development history as well as tensions between technological choices in XML communities. One of the fundamental choices of creating XML was to remove the data jail created by some application or programming languages.

Mike Champion, with whom I've often jousted (we have very different ideas of what's practical), was very kind in "People are reflecting on XML after 10 years".

A more balanced assessment of the special [IBM Systems Journal issue] is from Uche Ogbuji. There is quite a nice summary of the very different points of view about what XML is good for and how it can be used. He reminds us that today's blog debates about simplicity/complexity, tight/loose coupling, static/dynamic typing, etc. reflect debates that go back to the very beginning. I particularly like his pushback on one article's assertion that XML leverages the value of "information hiding" in OO design.

It was a really big leap for me from OO (and structured programming/abstract data type) orthodoxy to an embrace of XML's open data philosophy (more on that in “Objects. Encapsulation. XML?”). It did help that I'd watched application interface wars from numerous angles in my career: RPC granularity, mixins and implementation versus interface inheritance, SQL/C interface coupling, etc. It started to become apparent to me that something was wrong when we were translating natural business domain problems into such patently artificial forms. RDBMS purists have been making a similar point for ages, but in my opinion they just want to replace N-tier application artifice with their own brand of artifice. XML is far from some magic mix for natural expression of the business domain, but I believe that XML artifacts tend to be fundamentally more transparent than other approaches to computing. In my experience, it's easier to maintain even a poorly designed XML-driven system than a well-designed system where the programming is preeminent.

Mike's entry goes on to analyze the usefulness, in perspective, of the 10 guiding principles for XML. When he speaks of "a more balanced view" he's contrasting the Slashdot thread on the article, which is mostly filled with the sort of half-educated nonsense that drove me from that site a few years ago (these days I find the most respectable discussion on reddit.com). Poor Liam Quinn spent an awful lot of patience on a gang of inveterate flame-throwers. Besides his calm explanations, the best bits in that thread were on HL7 and ASN.1.

Supposedly the new version 3 [HL7] standard (which uses the "modeling approach") will be much more firm with the implementors, which will hopefully mean that every now and then one implementation will actually be compatible with another implementation. I've looked over their "models" and they've modelled a lot of the business use-case stuff for patient data, but not a lot of the actual data itself. Hopefully when it's done, it'll come out a bit better baked than previous versions.

That does not sound encouraging. I know Chimezie, my brother and Copia co-conspirator, is doing some really exciting work on patient data records. More RDF than XML, but I know he has a good head for the structured/unstructured data continuum, so I hope his work propagates more widely than just the Cleveland Clinic Foundation.

J. Andrew Rogers made the point that Simon St.Laurent and I (among others) have often made: in many cases people press XML into service where something more suitable, such as ASN.1, would serve better.

The "slow processing" is caused by more than taking a lot of space. XML is basically a document markup but is frequently and regular used as a wire protocol, which has very different design requirements if you want a good standard. And in fact we already have a good standard for this kind of thing called "ASN.1", which was actually engineered to be extremely efficient as a wire protocol standard. (There is also an ITU standard for encoding XML as ASN.1 called XER, which solves many of the performance problems.)

Of course, I think he goes a bit too far.

The only real advantage XML has is that it is (sort of) human readable. Raw TLV formatted documents are a bit opaque, but they can be trivially converted into an XML-like format with no loss (and back) without giving software parsers headaches. There are buckets of irony in the fact that the deficiencies of XML are being fixed by essentially converting it to ASN.1 style formats so that machines can parse them with maximum efficiency. Yet another case of computer science history repeating itself. XML is not useful for much more than a presentation layer, and the fact that it is often treated as far more is ridiculous.

I'd actually argue that XML is suited for a (semi-structured) model layer, not a presentation layer. For one thing, wire efficiency often counts in presentation as well. But his essential point is correct that XML is an awful substitute for ASN.1 as a wire protocol. By the same token, the Web services stack is an awful substitute for CORBA/OMA and even Microsoft's answers to same. It seems the industry is slowly beginning to realize this. I love all the many articles I see with titles such as "We know your SOA investment is stuck firmly in the toilet, but honest, believe us, there is an effective way to use this stuff. Really."

Anyway later on in that sub-thread:

The company I work for has had a lot of success with XML, and is planning to move the internal data structure for our application from maps to XML. There is one simple reason for our success with it: XSLT. A customer asks for output in a specific format? Write a template. Want to display the data on a web page? Write a template that converts to HTML. Want to print to PDF? Write a template that converts to XSL-FO, and use one of many available XSL-FO->PDF processors. Want to use PDF forms to input data? Write a template to convert XFDF to our format. Want to import data from a competitor and steal their customer? You get the picture.

Bingo! The secret to XML's value is transformation. Pass it on.

In "XML at 10: Big or Little?" Mark writes:

What the article ultimately ends up being about is the "Big" idea of XML vs. the oftentimes "Little" implementation of it. The Big idea is that XML can be used in a bottom-up fashion to model the grammar of a particular problem domain in an application- and context-independent manner. The little implementation is when XML is essentially used as a more verbose protocol for data interchange between existing applications. I would guess that is 90+ percent of what it is currently used for.

He's right, and I think this overuse is one of the reasons XML so often elicits reflex hostility. Then again, anything that doesn't elicit such hostility in some quarters is, of course, entirely irrelevant. I think it's a good thing that this cannot be said of XML.

[Uche Ogbuji]

via Copia

“Thinking XML: The XML decade”

“Thinking XML: The XML decade”

Subtitle: Thoughts on IBM Systems Journal's retrospective of XML at ten years (or so)
Synopsis: IBM Systems Journal recently published an issue dedicated to XML's 10th anniversary. It is primarily a collection of interesting papers for XML application techniques, but some of its articles offer general discussion of the technical, economic and even cultural effects of XML. There is a lot in these papers to draw from in thinking about why XML has been successful, and what it would take for XML to continue its success. This article expands on some of these topics that are especially relevant to readers of this column.

In this article I touch on points ranging from what the XML community can learn from the COBOL boom of the '90s to why it's dangerous to use XML as a basis for traditional application modeling systems. It's a bit of a gestalt approach to analyzing some of the key issues facing XML technology at this milestone, and what it might take to ensure XML is still relevant and valuable after another decade. And by that I mean valuable in itself, not just as a legacy format with valuable data.

[Uche Ogbuji]

via Copia

“The professional architect, Part 1: How developers become architects”

Just published:

“The professional architect, Part 1: How developers become architects”

Synopsis: Many architects come from the ranks of good developers, but not every good developer wants to be an architect, nor are all of them suited for the role. Whether you're a developer contemplating a career shift or a manager looking for suitable candidates for an architectural responsibility, it's important to have a well-informed perspective on this transition. This article discusses the journey from implementation specialization to architecture.

The article is just a collection of my observations of developers and architects, as well as my own experiences in both roles. Here's the introductory section:

When you look for a good musical conductor, you start by looking for a good musician. But not every good musician makes a good conductor. The situation is similar in the professional development of architects. More and more IT organizations are understanding the importance of sound software architecture, and the architect profession is rapidly emerging as a separate discipline within IT. This presents some fresh challenges to management as they look to recruit architects from a fairly small labor pool.[...] The fastest way over these hurdles is to understand that most good architects are also good developers, so probably the first place to look for architect talent is among the ranks of regular developers.[...]

This article outlines what it takes for a developer to become an architect. I'll present the perspective of a developer who might be considering such a move, as well as that of a manager assessing developers for such a transition. I'll also provide a series of factors to consider when making these decisions.

I'll be writing a few more articles on the architect's profession for IBM developerWorks. A few years ago I wrote “A custom-fit career in app development”, an article discussing how developers can build a career without being enslaved to the mainstream of technology trends.

[Uche Ogbuji]

via Copia

“Real Web 2.0: Bookmarks? Tagging? Delicious!”

“Real Web 2.0: Bookmarks? Tagging? Delicious!”

Subtitle: Learn how real-world developers and users gain value from a classic Web 2.0 site
Synopsis: In this article, you'll learn how to work with del.icio.us, one of the classic Web 2.0 sites, using Web XML feeds and JSON, in Python and ECMAScript. When you think of Web 2.0 technology, you might think of the latest Ajax tricks, but that is just a small part of the picture. More fundamental concerns are open data, simple APIs, and features that encourage users to form social networks. These are also what make Web 2.0 a compelling problem for Web architects. This column will look more than skin deep at important real-world Web 2.0 sites and demonstrate how Web architects can incorporate the best from the Web into their own Web sites.

This is the first installment of a new column, Real Web 2.0. Of course "Web 2.0" is a hype term, and as has been argued to sheer tedium, it doesn't offer anything but the most incremental advances. But in keeping with my tendency of mildness towards buzzwords, I think that anything that helps focus Web developers on the collaborative features of Web sites is a good thing. And that's what this column is about. It's not about the Miss AJAX pageant, but rather about open data for users and developers. From the article:

The substance of an effective Web 2.0 site, and the points of interest for Web architects (as opposed to, say, Web designers), lie in how readily real developers and users can take advantage of open data features. From widgets that users can use to customize their bits of territory on a social site to mashups that developers can use to create offspring from Web 2.0 parents, there are ways to understand what leads to success for such sites, and how you can emulate such success in your own work. This column, Real Web 2.0, will cut through the hype to focus on the most valuable features of actual sites from the perspective of the Web architect. In this first installment, I'll begin with one of the ancestors of the genre, del.icio.us.

And I still don't want that monkey-ass Web 1.0. Anyway, as usual, there's lots of code here: Python, Amara, ECMAScript, JSON, and more. That will be the recipe (mixing up the ingredients a bit each time) as I journey through the poster-child sites for open data.
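
A quick taste of the open-data angle, as a sketch only: the feed URL pattern and the terse field names below are from my recollection of the classic del.icio.us JSON feeds, so treat them (and the username) as assumptions rather than gospel.

```python
# Sketch: fetch a user's public del.icio.us bookmarks as JSON and list them.
# The feed URL and the 'd'/'u'/'t' field names are assumptions based on the
# classic del.icio.us JSON feeds; 'exampleuser' is hypothetical.
import json
from urllib.request import urlopen

FEED = 'http://del.icio.us/feeds/json/exampleuser?raw'  # ?raw asks for bare JSON

with urlopen(FEED) as resp:
    posts = json.load(resp)

for post in posts:
    print(post.get('d', '(untitled)'))              # description
    print('   ', post.get('u'))                     # bookmark URL
    print('   tags:', ' '.join(post.get('t', [])))
```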

[Uche Ogbuji]

via Copia

“Mix and match Web components with Python WSGI”

“Mix and match Web components with Python WSGI”

Subtitle: Learn about the Python standard for building Web applications with maximum flexibility
Synopsis: Learn to create and reuse components in your Web server using Python. The Python community created the Web Server Gateway Interface (WSGI), a standard for creating Python Web components that work across servers and frameworks. It provides a way to develop Web applications that take advantage of the many strengths of different Web tools. This article introduces WSGI and shows how to develop components that contribute to well-designed Web applications.

Despite the ripples in the Python community over Guido's endorsement of Django (more on this in a later posting), I'm not the least bit interested in any one Python Web framework any more. WSGI has set me free. WSGI is brilliant. It's certainly flawed, largely because of legacy requirements, but the fact that it's so good despite those flaws is amazing.

I wrote this article because I think too many introductions to WSGI, and especially to middleware, are either too simple or too complicated. In line with my usual article-writing philosophy (what could I have read when I started out that would have made this topic clearer?), I've tried to provide a sharp illustration of the WSGI model and a few clear, practical examples. The articles I read that were too simple glossed over nuances that I think should really be grasped from the beginning (and are not that intimidating). In the too-complicated corner is primarily PEP 333 itself, which is fairly well written, but too rigorous for an intro. In addition, I think the example of WSGI middleware in the PEP is very poor. I'm quite proud of the example I crafted for this article, and I hope it helps encourage more people to create middleware.
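
If you just want the flavor of the model before reading the article, here is a bare-bones sketch (my own, not the example from the article, and in present-day Python 3 rather than the Python of 2006) of a WSGI application plus one piece of middleware wrapping it:

```python
# Sketch (not the article's example): a minimal WSGI app and a trivial piece
# of middleware, served with the standard library's wsgiref server.
from wsgiref.simple_server import make_server

def hello_app(environ, start_response):
    """The application: any callable taking (environ, start_response)."""
    start_response('200 OK', [('Content-Type', 'text/plain; charset=utf-8')])
    path = environ.get('PATH_INFO', '/')
    return [('Hello from %s\n' % path).encode('utf-8')]

def shouting_middleware(app):
    """Middleware: takes a WSGI app and returns another WSGI app that
    upper-cases the wrapped app's response body."""
    def wrapper(environ, start_response):
        return [chunk.upper() for chunk in app(environ, start_response)]
    return wrapper

if __name__ == '__main__':
    with make_server('', 8000, shouting_middleware(hello_app)) as httpd:
        print('Serving on http://localhost:8000/ ...')
        httpd.serve_forever()
```

The point is the shape of the contract: anything that accepts an environ mapping and a start_response callable composes with anything else that does, which is exactly what makes loosely coupled stacks of Web components possible.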

I do want to put in a good word for Ian Bicking and Paste. He has put in tireless effort to evangelize WSGI (it was his patient discussion that won me over to WSGI). In his Paste toolkit, he's turned WSGI's theoretical strengths into readily-available code. On the first project I undertook using a Paste-based framework (Pylons), I was amazed at my productivity, even considering that I'm used to productive programming in Python. The experience certainly left me wondering why, BDFL or no BDFL, I would choose a huge mega-framework over a loosely-coupled system of rich components.

[Uche Ogbuji]

via Copia

“Dynamic SVG features for browsers”

“Dynamic SVG features for browsers”

Subtitle: Build on SVG basics to create attractive, dynamic effects in your Web projects
Synopsis: Learn how to use dynamic features of Scalable Vector Graphics (SVG) to provide useful and attractive effects in your Web applications. SVG 1.1, an XML language for describing two-dimensional vector graphics, provides a practical and flexible graphics format in XML. Many SVG features provide for dynamic effects, including features for integration into a Web browser. Build on basic SVG techniques introduced in a previous tutorial.
Lead-in: SVG is a technology positioned for many uses in the Web space. You can use it to present simple graphics (as with JPEG) or complex applications (as with Macromedia Flash). An earlier tutorial from June 2006 introduced the basic features of the format. This tutorial continues to focus on SVG for Web development, as it demonstrates dynamic effects that open up new means of enhancing Web pages. The lessons are built around examples that you can view and experiment with in your favorite browser.
Developed by the W3C, SVG has the remarkable ambition of providing a practical and flexible graphics format in XML, despite the notorious verbosity of XML. It can be developed, processed, and deployed in many different environments -- from mobile systems such as phones and PDAs to print environments. SVG's feature set includes nested transformations, clipping paths, alpha masks, raster filter effects, template objects, and, of course, extensibility. SVG also supports animation, zooming and panning views, a wide variety of graphic primitives, grouping, scripting, hyperlinks, structured metadata, CSS, a specialized DOM superset, and easy embedding in other XML documents. Many of these features allow for dynamic effects in images. Overall, SVG is one of the most widely and warmly embraced XML applications.
Dynamic SVG is a hot topic, and several tutorials and articles are available that include fairly complicated examples of dynamic SVG techniques. This tutorial is different because it focuses on a breadth of very simple examples. You will be able to put together the many techniques you learn in this tutorial to create effects of whatever sophistication you like, but each example in this tutorial is simple, clear, and reasonably self-contained. The tutorial rarely deals with any SVG objects more complex than the circle shape, and it keeps embellishments in scripting and XML to a minimum. The combination of simple, step-by-step development and a focus on real-world browser environment makes this tutorial unique.

Pay attention to that last paragraph. There are many SVG script/animation tutorials out there, including several from IBM developerWorks, but I found most of them don't really suit my learning style, so I set out to write a tutorial that would have been ideal for me when I was first learning dynamic SVG techniques. The tutorial covers CSS animation, other scripting techniques, and SMIL declarative animations. It builds on the earlier tutorial “Create vector graphics in the browser with SVG”.
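
For a tiny taste of the declarative side, and with the caveat that this is my own minimal illustration rather than an example from the tutorial, the following writes out an SVG file whose circle pulses using a SMIL animate element, with no scripting at all:

```python
# Minimal illustration (not from the tutorial): an SVG circle animated
# declaratively with a SMIL <animate> element. Open the output file in an
# SVG-capable browser to watch the radius pulse.
SVG_DOC = """<svg xmlns="http://www.w3.org/2000/svg" width="200" height="200">
  <circle cx="100" cy="100" r="20" fill="steelblue">
    <animate attributeName="r" values="20;60;20"
             dur="2s" repeatCount="indefinite"/>
  </circle>
</svg>
"""

with open('pulse.svg', 'w', encoding='utf-8') as out:
    out.write(SVG_DOC)
```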

[Uche Ogbuji]

via Copia

Is USPTO abandoning XML in its electronic filing system?

I wrote an article a while back, "Thinking XML: Patent filings meet XML", in which I covered, among other things, the various patent agencies' efforts to support electronic filing. Many of these efforts are XML-based. Except that now, perhaps, the USPTO's (EFS-Web) isn't. There were a lot of gnarly aspects of the EFS-Web process, and I had heard from some users who ended up abandoning the system. It looks as if the USPTO is trying to address these problems by chucking the whole approach and just having people upload PDFs (via XML.org Daily Newslink--yeah, I'm way behind). I wonder whether they also considered supporting ODF, at least as an alternative to PDF. It seems to me that what they needed was broader, not narrower, format and tool support.

[Uche Ogbuji]

via Copia

“XML in Firefox 1.5, Part 3: JavaScript meets XML in Firefox”

“XML in Firefox 1.5, Part 3: JavaScript meets XML in Firefox”

Subtitle: Learn how to manipulate XML in the Firefox browser using JavaScript features
Synopsis: In this third article of the XML in Firefox 1.5 series, you learn to manipulate XML with the JavaScript implementation in Mozilla Firefox. In the first two articles, XML in Firefox 1.5, Part 1: Overview of XML features and XML in Firefox 1.5, Part 2: Basic XML processing, you learned about the different XML-related facilities in Mozilla Firefox, and the basics of XML parsing, Cascading Style Sheets (CSS), and XSLT stylesheet invocation.

Continuing with the series, this article provides examples for loading an XML file into Firefox using script, for applying XSLT to XML, and for loading XML with references to scripts. In particular, the latter trick is used to display XML files with rendered hyperlinks, which is unfortunately still a bit of a tricky corner of the XML/Web story. I elaborate more on this trick in my tutorial “Use Cascading Stylesheets to display XML, Part 2”.

[Uche Ogbuji]

via Copia

“Get free stuff for Web design”

"Get free stuff for Web design"

Subtitle: Spice up your Web site with a variety of free resources from fellow designers
Synopsis: Web developers can find many free resources, although some are freer than others. If you design a Web site or Web application, whether static or with all the dynamic Ajax goodness you can conjure up, you might find resources to lighten your load and spice up your content. From free icons to Web layouts and templates to on-line Web page tools, this article demonstrates that a Web architect can also get help these days at little or no cost.

This was a particularly fun article to write. I'm no Web designer (as you can tell by looking at any of my Web sites), but as an architect specializing in Web-fronted systems I often have to throw a Web template together, and I've slowly put together a set of handy resources for free Web raw materials. I discuss most of those in this article. On Copia, Chimezie recently mentioned OpenClipart.org, which I've been using for a while and which is mentioned in the article. I was surprised it was news to as savvy a reader as Sam Ruby, which makes me think many other references in the article will be useful.

[Uche Ogbuji]

via Copia