Agile Web #1: "Google Sitemaps"

"Google Sitemaps"

Uche Ogbuji's new column, "Agile Web," explores the intersection of agile programming languages and Web 2.0. In this first installment he examines Google's Sitemaps schema, as well as Python and XSLT code to generate site maps. [Oct. 26, 2005]

With this article, the "Python and XML" column gives way to a new one titled "Agile Web".

I wrote the Python-XML column for three years, discussing the combination of an agile programming language with an agile data format. It's time to pull the lens back a bit to take in other such technologies. This new column, "Agile Web," will cover the intersection of dynamic programming languages and web technologies, particularly the sorts of dynamic developments on the web for which some use the moniker, "Web 2.0." The primary language focus will still be Python, with some ECMAScript. Occasionally there will be some coverage of other dynamic languages as well.

In this first article I introduce the Google Sitemaps program, its XML format, and related Python tools.
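As a rough illustration of what generating a site map from Python can look like, here is a minimal sketch using only the standard library. It is not the code from the article. The namespace is an assumption: it is the sitemaps.org 0.9 namespace, which postdates the article, and the page list and the build_sitemap name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the sitemaps.org 0.9 namespace, which postdates the article.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return a sitemap document (as a string) for (loc, lastmod, changefreq) tuples."""
    # Serialize the sitemap namespace as the default namespace (no prefix).
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element("{%s}urlset" % SITEMAP_NS)
    for loc, lastmod, changefreq in pages:
        url = ET.SubElement(urlset, "{%s}url" % SITEMAP_NS)
        ET.SubElement(url, "{%s}loc" % SITEMAP_NS).text = loc
        ET.SubElement(url, "{%s}lastmod" % SITEMAP_NS).text = lastmod
        ET.SubElement(url, "{%s}changefreq" % SITEMAP_NS).text = changefreq
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, for illustration only.
sitemap_xml = build_sitemap([
    ("http://example.org/", "2005-10-26", "daily"),
    ("http://example.org/about", "2005-09-14", "monthly"),
])
print(sitemap_xml)
```

Building the tree with ElementTree rather than pasting strings together means that URLs containing characters such as "&" are escaped automatically.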

[Uche Ogbuji]

via Copia

"Tip: Computing word count in XML documents" pubbed

"Tip: Computing word count in XML documents"

XML is text and yet more than just text -- sometimes you want to work with just the content rather than the tags and other markup. In this tip, Uche Ogbuji demonstrates simple techniques for counting the words in XML content using XSLT with or without additional tools.

It was just a few weeks after I sent the manuscript to the editor that this thread started up on XML-DEV. Spooky timing.
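The tip itself uses XSLT. For comparison, the same idea (count the words in the content, ignore the markup) can be sketched in Python with the standard library. The count_words name and the sample document are hypothetical, and "word" here simply means a whitespace-separated token.

```python
import xml.etree.ElementTree as ET


def count_words(xml_text):
    """Count whitespace-separated words in the text content, ignoring markup."""
    root = ET.fromstring(xml_text)
    # itertext() yields every text node in document order; tags and
    # attributes are skipped. Joining with a space keeps words in
    # adjacent elements from running together.
    return len(" ".join(root.itertext()).split())


# Hypothetical sample document, for illustration only.
sample = "<doc><title>Word count</title><p>XML is text and <em>more</em> than text.</p></doc>"
print(count_words(sample))
```

One design caveat: joining text nodes with a space will split a word that has inline markup in the middle of it, so a document that does this needs a different joining rule.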

[Uche Ogbuji]

via Copia

XHTML tutorial pubbed

"XHTML, step-by-step"

Start working with Extensible Hypertext Markup Language. In this tutorial, author Uche Ogbuji shows you how to use XHTML in practical Web sites.

Get started working with Extensible Hypertext Markup Language. XHTML is a language based on HTML, but expressed in well-formed XML. But XHTML is much more than just regularizing tags and characters -- XHTML can alter the way you approach Web design. This tutorial gives step-by-step instruction for developers familiar with HTML who want to learn how to use XHTML in practical Web sites.

In this tutorial

  • Tutorial introduction
  • Anatomy of an XHTML Web page
  • Understand the ground rules
  • Replace common HTML idioms
  • Some practical considerations
  • Wrap up

[Uche Ogbuji]

via Copia

Python/XML column #37 (and out): Processing Atom 1.0

"Processing Atom 1.0"

In his final Python-XML column, Uche Ogbuji shows us three ways to process Atom 1.0 feeds in Python. [Sep. 14, 2005]

I show how to parse Atom 1.0 using minidom (for those who want no additional dependencies), Amara Bindery (for those who want an easier API) and Universal Feed Parser (with a quick hack to bring the support in UFP 3.3 up to Atom 1.0). I also show how to use DateUtil and Python 2.3's datetime to process Atom dates.
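To give a flavor of the dependency-free minidom approach, here is a minimal sketch, not the article's code. The feed is a hypothetical example, the helper names are mine, and the date handling uses the standard library's datetime.fromisoformat in place of DateUtil, so it only covers the common timestamp forms.

```python
from datetime import datetime
from xml.dom import minidom

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical feed, for illustration only.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry>
    <title>First entry</title>
    <updated>2005-09-14T18:30:02Z</updated>
  </entry>
</feed>"""


def text_of(node):
    """Concatenate the text-node children of a DOM element."""
    return "".join(c.data for c in node.childNodes if c.nodeType == c.TEXT_NODE)


def parse_atom_date(value):
    """Parse an Atom timestamp; 'Z' is rewritten as an offset for older Pythons."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


doc = minidom.parseString(FEED)
entries = []
for entry in doc.getElementsByTagNameNS(ATOM_NS, "entry"):
    title = entry.getElementsByTagNameNS(ATOM_NS, "title")[0]
    updated = entry.getElementsByTagNameNS(ATOM_NS, "updated")[0]
    entries.append((text_of(title), parse_atom_date(text_of(updated))))

for title, updated in entries:
    print(title, updated.isoformat())
```

The namespace-aware getElementsByTagNameNS calls matter here: Atom 1.0 elements live in the Atom namespace, so matching on bare tag names would also pick up same-named elements from extensions.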

As the teaser says, we've come to the end of the column in its present form, but it's more of a transition than a termination. From the article:

And with this month's exploration, the Python-XML column has come to an end. After discussions with my editor, I'll replace this column with one with a broader focus. It will cover the intersection of Agile Languages and Web 2.0 technologies. The primary language focus will still be Python, but there will sometimes be coverage of other languages such as Ruby and ECMAScript. I think many of the topics will continue to be of interest to readers of the present column. I look forward to continuing my relationship with the audience.

It is too bad that I won't get to some of the articles that I had in the queue, including coverage of lxml, pygenx, XSLT processing from Python, the role of PEP 342 in XML processing, and more. I can still squeeze some of these topics into the new column, I think, as long as I keep the emphasis on the Web. I'll also try to keep up my coverage of news in the Python/XML community here on Copia.

Speaking of such news, I forgot to mention in the column that I'd found an interesting resource from John Shipman.

[F]or my relatively modest needs, I've written a more Pythonic module that uses minidom. Complete documentation, including the code of the module in 'literate programming' style, is at:

The relevant sections start with section 7, "".

[Uche Ogbuji]

via Copia

Thinking XML #33: Serving up WordNet as XML

"Thinking XML: Serving up WordNet as XML"

Subtitle: Build the basic WordNet/XML facilities into a Web server framework
Synopsis: A few articles back, Uche Ogbuji discussed WordNet 2.0, a Princeton University project that aims to build a database of English words and lexical relationships between them. He showed how to extract XML serializations from the word database. In this article he continues the exploration, demonstrating code to serve up these WordNet/XML documents over Web protocols and showing you how to access these from XSLT.

This is the second part of a mini-series within the column. The previous article is "Querying WordNet as XML," in which I present Python code for processing WordNet 2.0 into XML. This time I use CherryPy to expose the XML on the Web, either in human-readable or in raw form. This seems to be part of a nice trend of CherryPy on developerWorks. I hope people see this as yet another example of how easy and clean CherryPy is.
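The article's code uses CherryPy. To show the general shape of serving XML for a word over HTTP without assuming any CherryPy API, here is a minimal sketch that swaps in a plain WSGI application from the standard library. The SYNSETS dictionary is a hypothetical stand-in for the real WordNet lookup, and the element names are made up for illustration.

```python
from wsgiref.util import setup_testing_defaults
from xml.sax.saxutils import escape, quoteattr

# Hypothetical stand-in for the WordNet/XML lookup described in the article.
SYNSETS = {"agile": ["nimble", "quick", "spry"]}


def wordnet_app(environ, start_response):
    """WSGI app: GET /<word> returns that word's synonyms as raw XML."""
    word = environ.get("PATH_INFO", "/").strip("/")
    if word not in SYNSETS:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"unknown word"]
    items = "".join("<synonym>%s</synonym>" % escape(s) for s in SYNSETS[word])
    body = "<word form=%s>%s</word>" % (quoteattr(word), items)
    start_response("200 OK", [("Content-Type", "application/xml")])
    return [body.encode("utf-8")]


def call(path):
    """Invoke the app in-process, without opening a socket."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    status = []
    body = b"".join(wordnet_app(environ, lambda s, h: status.append(s)))
    return status[0], body


print(call("/agile"))
```

To try it in a browser, the same wordnet_app can be handed to wsgiref.simple_server.make_server and served with serve_forever().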

See other articles in the column. Comments here on Copia or on the column's official discussion forum. Next up in Thinking XML, RDF equivalents for the WordNet/XML.

[Uche Ogbuji]

via Copia

Does XML give away the keys to the data warehouse?

"Does XML give away the keys to the warehouse?"

In this ADT article I reflect on some of the implications of XPath injection attacks, and to what extent XML and open data are a danger to developers.

While I don’t claim to have foreseen XPath injection attacks, it does strike me that this security problem is made possible by practices that I and others have always discouraged. One problem is the phenomenon of production XML as database dump. Developers love to create titanic XML files, often as monolithic dumps from databases. Sometimes they deploy such monsters to servers susceptible to the cleverness of attackers.

If someone does compromise the server, they can pilfer one file and have your information warehouse at their hands.
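For readers who have not seen an XPath injection, here is a minimal sketch of the mechanism. It is not from the ADT article. The user store, function names, and attack string are hypothetical. The standard library does not evaluate full XPath, so the unsafe expression is only built and printed, not run.

```python
import xml.etree.ElementTree as ET

# Hypothetical user store, for illustration only.
USERS = ET.fromstring(
    "<users>"
    "<user><name>alice</name><password>s3cret</password></user>"
    "<user><name>bob</name><password>hunter2</password></user>"
    "</users>"
)


def unsafe_query(name, password):
    """Build an XPath by pasting user input into the expression (do not do this)."""
    return "//user[name='%s' and password='%s']" % (name, password)


def safe_lookup(name, password):
    """Compare values in Python so input is never parsed as XPath."""
    for user in USERS.iter("user"):
        if user.findtext("name") == name and user.findtext("password") == password:
            return user
    return None


attack = "' or '1'='1"
# The injected quotes turn the predicate into one that is true for every user.
print(unsafe_query(attack, attack))
# Treated as plain data, the same input matches nobody.
print(safe_lookup(attack, attack))
print(safe_lookup("alice", "s3cret").findtext("name"))
```

With a full XPath engine, the usual alternative to comparing in application code is to pass user input as XPath variables rather than interpolating it into the expression text.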

I wrote this article a long time ago, and I actually didn't know if it would be published, because of editorial changes at ADT. I just discovered it by accident yesterday. I'm glad to see it "in print".

[Uche Ogbuji]

via Copia

Python/XML column #36: Should Python and XML Coexist?

"Python and XML: Should Python and XML Coexist?"

In his latest Python and XML column, Uche Ogbuji claims that the costs of using XML as a little language in a Python application may outweigh the benefits of doing so. [Aug. 25, 2005]

In this article I discuss some of the recent round of complaints about XML in the Python community, trying to offer the perspective that Python and XML serve very different domains. Treating them as competitors for a particular task usually reflects a misunderstanding of the basic nature of one technology or the other, and it often leads to overstated complaints.

A correspondent already asked one good follow-up question:

My question is simply: do you have a recommendation for an alternative language (or other protocol) that is more suitable for expressing data structures, preferably one that can be coded for reasonably quickly in Python?

YAML seems to be the front-runner for a cross-language data structure format, although JSON is hot these days, courtesy of the AJAX hype. I tend to point to Paul Tchistopolskii's "Alternatives to XML" for a more comprehensive list.
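To make the correspondent's "coded for reasonably quickly in Python" concrete, here is a minimal sketch of round-tripping a data structure through JSON. It assumes a modern Python, where the json module ships in the standard library, and the config dictionary is a hypothetical example.

```python
import json

# Hypothetical configuration structure, for illustration only.
config = {
    "name": "feed-reader",
    "refresh_minutes": 30,
    "feeds": ["http://example.org/atom.xml", "http://example.org/rss.xml"],
    "proxy": None,
}

# Serialize to text, then read it back into plain Python objects.
text = json.dumps(config, indent=2, sort_keys=True)
restored = json.loads(text)

print(text)
print(restored == config)
```

Dictionaries, lists, strings, numbers, and None map directly onto JSON types, which is much of the appeal for plain data structures. YAML support would need a third-party package, so it is not shown here.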

[Uche Ogbuji]

via Copia

Python/XML column #35: EaseXML and more on Unicode

"EaseXML: A Python Data-Binding Tool"

In this month's Python and XML column, Uche Ogbuji examines a new XML data-binding tool for Python: EaseXML. [Jul. 27, 2005]

The main focus of this article is EaseXML, another option for XML data binding. I found the package rather rough around the edges. I also included a section with a bit more on Unicode, which was the topic of the last two articles, "Unicode Secrets" and "More Unicode Secrets". This time I introduced the unicodedata module, which provides useful information about characters from the Unicode standard database.
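As a quick taste of the unicodedata module, here is a minimal sketch (not the article's own listing) that looks up the name and general category of a couple of characters and decomposes an accented letter. The characters chosen are arbitrary examples.

```python
import unicodedata

# U+00E9 (e with acute accent) and U+20AC (euro sign), written as escapes
# so the source file encoding does not matter.
for ch in "\u00e9\u20ac":
    print("U+%04X" % ord(ch), unicodedata.name(ch), unicodedata.category(ch))

# NFD normalization splits the accented letter into base letter + combining mark.
decomposed = unicodedata.normalize("NFD", "\u00e9")
print([unicodedata.name(c) for c in decomposed])
```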

[Uche Ogbuji]

via Copia

Thinking XML #32: Schema annotation for bottom-up semantic transparency

"Thinking XML: Schema annotation for bottom-up semantic transparency"

Subtitle: Pushing schemata beyond syntax into semantics
Synopsis: Learn more about the different approaches to semantic transparency as Uche Ogbuji discusses what they mean to developers using XML. Whether or not you reuse schemata, you might find it valuable to use formal annotations (as opposed to the informal annotations covered earlier). You gain benefits on several levels by doing so. On the most immediately practical level, you can generate better documentation. A more far-sighted benefit is that it gives you an important measure of semantic transparency. This installment discusses semantic anchors, and gives examples. The author also takes a moment to discuss The XTech Conference 2005.

This is the third part of a mini-series within the column. Previous articles are "State of the art in XML modeling" and "Schema standardization for top-down semantic transparency". In this article I discuss formal schema annotations, the most important tool available for semantic transparency. I started off my exploration of the technique in "Use data dictionary links for XML and Web services schemata". I mentioned why I think schema annotations are so important even in rough and ready use of XML in my discussion of XOXO.

See other articles in the column. Comments here on Copia or on the column's official discussion forum. Next up in Thinking XML, back to Python + WordNet.

[Uche Ogbuji]

via Copia

XSLT + CSS tutorial pubbed

"Use Cascading Stylesheets to display XML, Part 3"

CSS isn't just for HTML anymore! Learn to combine the strengths of CSS with those of XSLT and fine-tune your XML presentation in a browser.

In parts 1 and 2 of this tutorial series, Uche Ogbuji has shown how to use Cascading Stylesheets (CSS) to display XML in browsers, presenting basic and advanced techniques. Although some people see XSLT and CSS as opposing technologies, they are actually very complementary. CSS cannot, and is not designed to, handle many XML rendering tasks. You can use XSLT for many such tasks, and even manage the CSS that is still used to fine-tune the presentation. This tutorial covers techniques for using XSLT to process XML in association with CSS.

Anyone who works with XML should take this tutorial. Even if CSS and XSLT don't cover your needs for production Web publishing, they are great tools for general processing, debugging, and experimentation. They offer rich interaction with other XML technologies and you'll be likely to run into CSS and XSLT even when you least expect them.

This tutorial, third in a series, discusses XSLT techniques for working with CSS for HTML or XML output. The approach is very step-by-step, and includes many examples (including expected display in Firefox).

If you want more background on how to use CSS with XML, see the previous tutorials in the series, "Use CSS to display XML, Part 1" and "Use CSS to display XML, Part 2".

The response to the first two has been very positive. Here is what has been posted through the feedback form, according to my editor:

"Use CSS to display XML, Part 1"
Average rating: 4.03 (out of 5.00)
Responses received: 33
"Excellent concise overview. Thanks."
"very nice presentation of the fundamentals with excellent examples"
"Good introduction. Needs a more dramatic example."
"Nicely done thanks .."
"Only had a quick read this time but I'm sure a full going over will improve my XML/CSS skills no end!"
"Hi:After several weeks of reading several books and posting to several list, you very simply provided me with an answer to my very simple question, which was "How can I present xml data using css?"Thanks."
"I already use CSS but this tutorial taught me some new tricks on generating content with CSS."

"Use CSS to display XML, Part 2"
Average rating: 4.00 (out of 5.00)
Responses received: 9
Comments: none

[Uche Ogbuji]

via Copia