Copia

"Create vector graphics in the browser with SVG"

Subtitle: Add two-dimensional vector graphics to your Web pages with the flexible, XML graphics format of Scalable Vector Graphics (SVG) 1.1.
Synopsis: Learn step-by-step how to incorporate Scalable Vector Graphics (SVG) into Web pages using real browser examples. SVG 1.1, an XML language for describing two-dimensional vector graphics, provides a practical and flexible graphics format in XML, despite the language's verbosity. Several browsers recently completed or announced built-in SVG support.

I was early to SVG, exploring it in this 2001 article, but in recent years I haven't had as much time as I'd have liked to work with this fun technology. I was able to put it to use in projects last year, and I think the timing is good, considering the recent inroads SVG has been making in the browser and mobile spaces. I've been lucky to have far fewer problems than Eric has. Most of what I've tried just works, and does so in Firefox, Opera 9 and MSIE/Adobe SVG Viewer.
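
Because SVG 1.1 is ordinary XML in the http://www.w3.org/2000/svg namespace, any XML toolkit can produce it. The following is a minimal sketch, not taken from the article, that uses Python's standard library to build a small drawing; the shapes, sizes and colors are arbitrary choices made for illustration.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Serialize SVG elements in the default namespace rather than with a prefix.
ET.register_namespace("", SVG_NS)


def build_svg():
    """Return a small SVG 1.1 document (a framed circle) as a string."""
    svg = ET.Element(
        f"{{{SVG_NS}}}svg",
        {"version": "1.1", "width": "200", "height": "120",
         "viewBox": "0 0 200 120"},
    )
    # Background rectangle; the values are illustrative only.
    ET.SubElement(
        svg, f"{{{SVG_NS}}}rect",
        {"x": "10", "y": "10", "width": "180", "height": "100",
         "fill": "lightsteelblue", "stroke": "navy"},
    )
    # A circle centered in the rectangle.
    ET.SubElement(
        svg, f"{{{SVG_NS}}}circle",
        {"cx": "100", "cy": "60", "r": "35", "fill": "orange"},
    )
    return ET.tostring(svg, encoding="unicode")


svg_text = build_svg()
```

Saving svg_text to a file with a .svg extension and opening it in a browser with built-in SVG support should display the drawing.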

[Uche Ogbuji]

via Copia

"Tip: Rescue terrible HTML with TagSoup"

Well, since I've so emphatically broken my Weblogging pause for The Cup, I'd better post some professional items.

"Tip: Rescue terrible HTML with TagSoup"

Subtitle: Turn poorly formed HTML into valid XHTML
Synopsis: XHTML is a friendly enough format for parsing and screen-scraping, but the Web still has a lot of messy HTML out there. In this tip Uche Ogbuji demonstrates the use of TagSoup to turn just about any HTML into neat XHTML.

TagSoup is very handy. Even though it's a Java project, I put it to use from Python code fairly often. It also recently reached a full 1.0 release.

[Uche Ogbuji]


"Thinking XML: Good advice for creating XML"

An earlier article (published in January) that I forgot to announce:

"Thinking XML: Good advice for creating XML"

Thinking XML

Subtitle: Principles of XML design from the community at large
Synopsis: The use of XML has become widespread, but much of it is not well formed. When it is well formed, it's often of poor design, which makes processing and maintenance very difficult. And much of the infrastructure for serving XML can compound these problems. In response, there has been some public discussion of XML best practices, such as Henri Sivonen's document, "HOWTO Avoid Being Called a Bozo When Producing XML." Uche Ogbuji frequently discusses XML best practices on IBM developerWorks, and in this column, he gives you his opinion about the main points discussed in such articles. [Also discusses "Monastic XML," by Simon St. Laurent.]
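
As a small illustration of the well-formedness point, here is a sketch (not from the column) that checks whether a string is well-formed XML and contrasts hand-assembled markup with markup produced by a serializer. The function names is_well_formed and make_note are hypothetical, chosen for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML, else False."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


def make_note(text):
    """Build a <note> element with a serializer, which escapes & and <."""
    note = ET.Element("note")
    note.text = text
    return ET.tostring(note, encoding="unicode")


# Pasting raw text between tags breaks as soon as the text contains
# markup-significant characters.
naive = "<note>" + "AT&T <rocks>" + "</note>"
safe = make_note("AT&T <rocks>")
```

Here is_well_formed(naive) is False, because of the bare ampersand, while is_well_formed(safe) is True, because the serializer escaped the text.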

[Uche Ogbuji]


"XML in Firefox 1.5, Part 2: Basic XML processing"

Subtitle: Do a lot with XML in Firefox, but watch out for some basic limitations

Synopsis: This second article in the series, "XML in Firefox 1.5," focuses on basic XML processing. Firefox supports XML parsing, Cascading Stylesheets (CSS), and XSLT stylesheets. You also want to be aware of some limitations. In the first article of this series, "XML in Firefox 1.5, Part 1: Overview of XML features," Uche Ogbuji looked briefly at the different XML-related facilities in Firefox.

I also updated part 1 to reflect the Firefox 1.5 final release.

This article is written at an introductory level. The next articles in the series will be more technically in-depth, as I move from plain old generic XML to fancy stuff such as SVG and E4X.

[Uche Ogbuji]


"Tip: Use the Unicode database to find characters for XML documents"

The Unicode consortium is dedicated to maintaining a character set that allows computers to deal with the vast array of human writing systems. When you think of computers that manage such a large and complex data set, you think databases, and this is precisely what the consortium provides for computer access to versions of the Unicode standard. The Unicode Character Database comprises files that present detailed information for each character and class of character. The strong tie between XML and Unicode means this database is very valuable to XML developers and authors. In this article Uche Ogbuji introduces the Unicode Character Database and shows how XML developers can put it to use.

The summary says it all, really.
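
For a quick taste of the idea without downloading the database files, Python's standard unicodedata module exposes data derived from the Unicode Character Database. The sketch below is an illustration rather than the article's own code; the helper names describe and lookup are hypothetical. It reports a character's name and general category, and formats the numeric character reference you could paste into an XML document.

```python
import unicodedata


def describe(ch):
    """Summarize one character using Unicode Character Database properties."""
    codepoint = ord(ch)
    return {
        "char": ch,
        "codepoint": f"U+{codepoint:04X}",
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),
        # Hexadecimal numeric character reference usable in XML content.
        "xml_ref": f"&#x{codepoint:X};",
    }


def lookup(name):
    """Find a character by its official Unicode name."""
    return unicodedata.lookup(name)


info = describe("\u00e9")       # LATIN SMALL LETTER E WITH ACUTE
euro = lookup("EURO SIGN")      # the character U+20AC
```

Which characters and properties are available depends on the Unicode version bundled with your Python build, reported by unicodedata.unidata_version.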

[Uche Ogbuji]


"Tip: Use data URIs to include media in XML"

There are many ways to link to non-XML content within XML, including binary content. Sometimes you need to roll all such external content directly into the XML. Data scheme URIs are one way to specify a full resource within a URI, which you can then use in XML constructs. In this tip, Uche Ogbuji shows how to use this to bundle related media into a single file.

I also touch a bit on unparsed entities and notations in this brief article.

Side note: Of course URLs are a subset of URIs, but I did want to mention that I prefer to use the term "URI" for the data scheme because it feels to me much more of an identifier-by-value than a locator. (I suppose it could be considered a trivial locator.)
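
To make the mechanics concrete, here is a minimal sketch (not from the tip itself) that packs binary content into a base64 data URI, places it in an attribute of an XML element, and unpacks it again. The element and attribute names (media, image, href) and the payload bytes are placeholders invented for this example, and the decoder assumes the base64 form of the scheme only.

```python
import base64
import xml.etree.ElementTree as ET


def to_data_uri(payload, media_type):
    """Encode bytes as a base64 data URI with the given media type."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{media_type};base64,{encoded}"


def from_data_uri(uri):
    """Decode a base64 data URI back into (media_type, bytes)."""
    header, _, encoded = uri.partition(",")
    media_type = header[len("data:"):].split(";")[0]
    return media_type, base64.b64decode(encoded)


# Placeholder bytes standing in for real image data.
payload = b"\x00\x01binary payload\xff"

# Hypothetical vocabulary: a <media> wrapper holding one <image>.
doc = ET.Element("media")
ET.SubElement(doc, "image", {"href": to_data_uri(payload, "image/png")})
xml_text = ET.tostring(doc, encoding="unicode")

# Reading it back recovers the original bytes.
href = ET.fromstring(xml_text).find("image").get("href")
media_type, recovered = from_data_uri(href)
```

Base64 inflates the content by roughly a third, so this approach suits small media better than large files.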

[Uche Ogbuji]


Agile Web #3: "Scripting Flickr with Python and REST"

"Scripting Flickr with Python and REST"

In his latest Agile Web column, Uche Ogbuji shows us how to use Python to interact with Flickr as a lightweight web service.

This Agile Web installment is fairly straightforward. I look at several Python libraries for accessing Flickr from programs. They range from low-level, thin veneers over the official Flickr API to the one higher-level, more Pythonic library. And of course there's the obligatory package I just can't get to work.
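
For a sense of what the thin-veneer libraries wrap, here is a sketch of a raw REST request and response using only the standard library. It is not code from the column: the API key is a placeholder, the endpoint and the flickr.photos.search method reflect Flickr's public REST interface as commonly documented and should be checked against the current docs, and SAMPLE_RESPONSE is a made-up document in the general shape of a REST reply. No network call is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed endpoint; verify against Flickr's current API documentation.
REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_search_url(api_key, tags, per_page=5):
    """Build a REST URL for a tag search (the request is not sent here)."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "tags": ",".join(tags),
        "per_page": str(per_page),
        "format": "rest",
    }
    return REST_ENDPOINT + "?" + urlencode(params)


# Illustrative reply only; IDs, owners and titles are invented.
SAMPLE_RESPONSE = """<rsp stat="ok">
  <photos page="1" pages="1" perpage="5" total="2">
    <photo id="1001" owner="12345@N00" title="Sunset" />
    <photo id="1002" owner="12345@N00" title="Harbor" />
  </photos>
</rsp>"""


def photo_titles(response_xml):
    """Extract photo titles from a REST reply, raising on a failed call."""
    root = ET.fromstring(response_xml)
    if root.get("stat") != "ok":
        raise ValueError("Flickr reported an error")
    return [photo.get("title") for photo in root.iter("photo")]


url = build_search_url("YOUR_API_KEY", ["python", "xml"])
titles = photo_titles(SAMPLE_RESPONSE)
```

To try it against the live service you would fetch url with urllib.request.urlopen, using a real API key, and pass the body to photo_titles.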

[Uche Ogbuji]


Thinking XML #34: Search engine enhancement using the XML WordNet server system

Updated—Fixed link to "Serving up WordNet as XML"

"Thinking XML: Search engine enhancement using the XML WordNet server system"

Subtitle: Also, use XSLT to create an RDF/XML representation of the WordNet data
Synopsis: In previous installments of this column, Uche Ogbuji introduced the WordNet natural language database, and showed how to represent database nodes as XML and serve this XML through the Web. In this article, he shows how to convert this XML to an RDF representation, and how to use the WordNet XML server to enrich search engine technology.

This is the final part of a mini-series within the column. The previous articles are:

"Querying WordNet as XML", in which I present Python code for processing WordNet 2.0 into XML.
"Serving up WordNet as XML", in which I use CherryPy to expose the XML on the Web, either in HTML or in raw XML form.

In this article I write my own flavor of RDF schema for WordNet, a transform for conversion from the XML format presented previously, and a little demo app that shows how you can use WordNet to enhance search with synonym capabilities (and this time it's a much faster approach).
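
The synonym idea can be sketched in a few lines. The following toy is not the article's demo app: the SYNONYMS table is a hand-written, hypothetical stand-in for lookups against the WordNet XML server, and the matching is a naive whole-word comparison.

```python
# Hypothetical synonym sets; a real version would query WordNet instead.
SYNONYMS = {
    "car": {"auto", "automobile", "motorcar"},
    "bread": {"loaf"},
}


def expand_query(terms, synonyms=SYNONYMS):
    """Return the query terms plus any known synonyms, lowercased."""
    expanded = set()
    for term in terms:
        term = term.lower()
        expanded.add(term)
        expanded |= synonyms.get(term, set())
    return expanded


def search(documents, query):
    """Return IDs of documents sharing at least one word with the
    synonym-expanded query."""
    terms = expand_query(query.split())
    return [
        doc_id
        for doc_id, text in documents.items()
        if terms & set(text.lower().split())
    ]


docs = {
    1: "An automobile repair guide",
    2: "Baking bread at home",
}
hits = search(docs, "car")
```

A search for "car" finds document 1 even though the word itself never appears there, which is the kind of enhancement the synonym data makes possible.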

I hope to publicly host the WordNet server I've developed in this series once I get my home page's CherryPy setup updated for 2.2.

See other articles in the column. Comments are welcome here on Copia or on the column's official discussion forum. Next up in Thinking XML, RDF equivalents for the WordNet/XML.

[Uche Ogbuji]


"Process Atom 1.0 with XSLT"

"Process Atom 1.0 with XSLT"

Learn XSLT techniques for processing Atom documents. In this tutorial, author Uche Ogbuji shows how with real-world use cases. (free registration required)

Atom 1.0 is [the] Internet Engineering Task Force (IETF) standard for Web feeds -- information updates on Web site contents. Since Atom is an XML format, XSLT is a powerful tool for processing it. In this tutorial, Uche Ogbuji looks at XSLT techniques for processing Atom documents, addressing real-life use cases.

This tutorial shows you how to:

Navigate the basic structure of Atom 1.0 documents using XPath expressions

Use these expressions to drive XSLT transformations of Atom source files

Deal with the complications of text and markup embedded in Atom files

You will also learn how to use XSLT templates to generate valid Atom files, and how to check the validity of the results.
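
The tutorial itself works in XSLT. As a rough analogue of the first and third points above, here is a sketch in Python, using the limited XPath subset of the standard library's ElementTree in place of XSLT. The sample feed is invented for illustration; only the Atom namespace URI and the title/content type attribute come from the Atom 1.0 format.

```python
import xml.etree.ElementTree as ET

# Prefix mapping for path expressions; "atom" is an arbitrary local prefix.
NS = {"atom": "http://www.w3.org/2005/Atom"}

# Invented sample feed with one entry carrying XHTML content.
ATOM_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <id>urn:example:feed</id>
  <updated>2006-01-01T00:00:00Z</updated>
  <entry>
    <title type="text">First post</title>
    <id>urn:example:entry:1</id>
    <updated>2006-01-01T00:00:00Z</updated>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">Hello, <em>Atom</em>!</div>
    </content>
  </entry>
</feed>"""


def entry_summaries(feed_xml):
    """Return (title, content type, plain text) for each entry."""
    feed = ET.fromstring(feed_xml)
    results = []
    for entry in feed.findall("atom:entry", NS):
        title = entry.findtext("atom:title", default="", namespaces=NS)
        content = entry.find("atom:content", NS)
        if content is None:
            results.append((title, None, ""))
            continue
        # Atom treats a missing type attribute as plain text.
        ctype = content.get("type", "text")
        # itertext() flattens embedded XHTML markup down to its text.
        text = "".join(content.itertext()).strip()
        results.append((title, ctype, text))
    return results


summaries = entry_summaries(ATOM_FEED)
```

Note that content with type "html" holds escaped markup as a string, so its flattened text would still contain tags; handling that case properly needs an extra HTML parsing step that this sketch leaves out.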

A companion piece to my recent XML.com article "Handling Atom Text and Content Constructs", this is a task-driven tutorial, taking a more deliberate pace and focusing on XSLT.

developerWorks has had a lot to say about Atom lately, courtesy of James Snell (who is also writing a lot of useful Atom extension drafts).

So how do you celebrate Atom's promotion to RFC 4287? Why, by cooking up even more reading material.

[Uche Ogbuji]


"Tip: Use the right pattern for simple text in RELAX NG"

The RELAX NG XML schema language allows you to say "permit some text here" in a variety of ways. Whether you're writing patterns for elements or attributes, it is important to understand the nuances between the different patterns for character data. In this tip, Uche Ogbuji discusses the basic foundations for text in RELAX NG.

Several times while working on RELAX NG in mentoring roles with clients I've had to explain some of the nuances in the various ways to express simple text patterns. In this article I lay out some of the most common distinctions I've had to make. I should say that much of what I know about RELAX NG nuances I learned from Eric van der Vlist and a lot of that wisdom is gathered in his RELAX NG book (in print or online). I recommend the print book because it has some nice additions not in the online version, and because Eric deserves to eat.

[Uche Ogbuji]

