Copia housecleaning

I've finally had some time today, as I prepare for the holidays, to fix some things on Copia that have been broken for too long. Some of the highlights, especially concerning issues mentioned by readers (thanks, guys), are:

RSS 1.0 feed body fix. Added an rss:description field to the RSS 1.0 feed, which fixes missing post bodies in readers such as Bloglines that don't support content:encoded. I do truncate the field to 500 characters, according to the recommendation in the spec.

Single entry view title fix. Added entry titles for single entry pages. Before today, if you viewed this entry through the perma-link, the title would just say "Copia"; now it says "Copia ✏Copia housecleaning". I've wanted to do this for a while, but I was having the devil of a time figuring out how to do it with PyBlosxom. A scolding from Dan Connolly forced me to chase down a fix. For other PyBlosxom users the trick is to use the comments plug-in, copy the head.* flavor file to comment-head.*, and then update the copy to use the $title variable, which is the title of the entry itself ($blog_title is the title of the entire blog). In my case the updated HTML header template looks like:

<title>$blog_title &#x270F;$title</title>

I did get a report that Copia is incorrectly sending the Content-Type header `text/html;charset=ISO-8859-1`, but when I check using the LiveHTTPHeaders extension for Firefox on Linux it reports the correct charset=UTF-8 from the server. If anyone else can corroborate this issue, please leave a comment with the specific URL from which you noticed the error, your platform and browser, and the HTTP sniffing tool you were using. Thanks.
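For readers who would rather check from a script than from a browser extension, here is a minimal sketch in Python 3 using only the standard library. The function names charset_of and reported_charset are made up for this illustration, and the second one needs network access, so it is defined but not called here.

```python
from email.message import Message
from urllib.request import urlopen


def charset_of(content_type):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def reported_charset(url):
    """Fetch url and return the charset the server declares (needs network)."""
    with urlopen(url) as response:
        return charset_of(response.headers.get("Content-Type", ""))


# The two header values discussed above, parsed without any network access.
print(charset_of("text/html;charset=ISO-8859-1"))
print(charset_of("text/html; charset=UTF-8"))
```

Calling reported_charset with the URL you were viewing when you saw the problem gives a value you can paste straight into a comment.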

[Uche Ogbuji]

via Copia

XOXO versus Atom versus XBEL for Web feed lists?

Very useful response to my quest for XOXO understanding. Wow. In fact, I've had so much good discussion here and on the #atom IRC that I'm not sure where to start.

I'll start with what my current leaning is, after all that discussion. I think I want to use XBEL for managing my bookmarks, and I'll advocate the necessary extensions, and hope they are as well received as XBEL itself has been.

On to my general thoughts.

XOXO. I was pointed to a thread in which my use of rel="webfeed" was rightly deprecated. I was just grasping for something appropriate in XHTML space, and I don't think there really is anything. I got into an argument about whether type="application/atom+xml" is the way to go, and I'm still not fully convinced it is. The way it works for me to think about it is that for each item I have zero or more links I want to consider human-readable content links, and zero or more I want to consider feed links. It's my arbitrary decision how to decide which is which, but media type alone doesn't tell the tale. As an example, what if someone has a Weblog that is served as custom XML with XSLT to render it in my browser? The media type would be application/xml and yet it would be a content link. I see the weakness in my own argument: after all, a user agent can also make an Atom document nicely rendered for a human reader, so the distinction I'm trying to make might be an impossible one regardless of method. Maybe I'll have to cave in to the "use type" argument. I'll try it out in this entry.

So back to my first format example for XOXO. If I throw in examples of "folders", I come up with:

<ol class="xoxo">
  <li>
    <p>Technology</p>
    <ol>
      <li>
        <ul>
          <li>
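            <a href="http://example.com/weblog/" type="text/html">Weblog home</a>
            <a href="http://example.com/weblog/atom" type="application/atom+xml">Weblog feed</a>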
            <dl>
              <dt>description</dt>
              <dd>That good ole Weblog</dd>
            </dl>
          </li>
        </ul>
      </li>
    </ol>
  </li>
</ol>

Umm. That's a round mouthful of tags. I'm not sure I really want to have to squint at that. I also don't like how <description>...</description> becomes <dl><dt>description</dt><dd>...</dd></dl>. That's too much markup indirection for me. It's even worse than the usual reductio ad absurdum of <element name="description">....

Atom. Aristotle and then Mark suggested using Atom for Web feed list exchange. It's one of those "DUH!" moments. I can't believe it didn't occur to me. The main problem I have with Atom is that it really only offers one level of hierarchy: feed/entry. My Web feed list is hierarchical. The ready solution is to use categories to simulate hierarchies.

<entry>
  <id>http://example.com/weblog/</id>
  <updated>2005-05-23T15:38:00-08:00</updated>
  <title>Example Weblog</title>
  <link rel="self" href="http://example.com/weblog/" type="text/html">Weblog home</link>
  <link rel="alternate" href="http://example.com/weblog/atom" type="application/atom+xml">Weblog feed</link>
  <summary>That good ole Weblog</summary>
  <category term="Technology"/>
</entry>

I'd also need some way to separately express the hierarchy of categories. I might also have to use the ranking extension (see for example this article) to preserve item order, if I care about it. I don't know. This looks like a bit of a stretch. It certainly wouldn't be very friendly to edit by hand. The fact that the updated element is mandated, for one thing, tends to push Atom toward machine-generated-only use. This is OK for a lot of Atom's use cases, but I think it's a real bummer for my present one.
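To make the category idea concrete, here is a minimal sketch in Python 3 of how a consumer might regroup such entries by category. It assumes the entry above is wrapped in a feed element in the Atom 1.0 namespace, and that any link with type="application/atom+xml" counts as the feed link; the name feeds_by_category is invented for this illustration. Note that it only recovers one flat level of grouping, which is exactly the limitation described above.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

ATOM = "{http://www.w3.org/2005/Atom}"

# The entry from above, wrapped in a feed element (an assumption of this sketch).
FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://example.com/weblog/</id>
    <updated>2005-05-23T15:38:00-08:00</updated>
    <title>Example Weblog</title>
    <link rel="self" href="http://example.com/weblog/" type="text/html">Weblog home</link>
    <link rel="alternate" href="http://example.com/weblog/atom" type="application/atom+xml">Weblog feed</link>
    <summary>That good ole Weblog</summary>
    <category term="Technology"/>
  </entry>
</feed>"""


def feeds_by_category(atom_text):
    """Map each category term to a list of (entry title, feed URL) pairs."""
    groups = defaultdict(list)
    root = ET.fromstring(atom_text)
    for entry in root.iter(ATOM + "entry"):
        title = entry.findtext(ATOM + "title")
        # Treat links typed as Atom as feed links; everything else is content.
        feed_urls = [
            link.get("href")
            for link in entry.findall(ATOM + "link")
            if link.get("type") == "application/atom+xml"
        ]
        for category in entry.findall(ATOM + "category"):
            for url in feed_urls:
                groups[category.get("term")].append((title, url))
    return dict(groups)


print(feeds_by_category(FEED))
```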

XBEL. XBEL is just the little XML format engine that could. Despite its great age and recent neglect, it's possibly the most widely deployed of these options, because of its use in browser bookmark formats. Not that I think that's any reason for preferring XBEL. I do like how it looks, though:

<folder>
  <title>Technology</title>
  <bookmark href="http://example.com/weblog/">
    <title>Example Weblog</title>
    <info>
      <metadata owner="webfeed">
        <link href="http://example.com/weblog/atom" type="application/atom+xml"/>
        <description>That good ole Weblog</description>
      </metadata>
    </info>
  </bookmark>
</folder>

Even with the required info/metadata layer, it's much cleaner than the other two options. I think that we could at least get native elements for alternate links and for bookmark descriptions into XBEL 1.1 (I think these were already proposed by others), so we could perhaps eliminate the info/metadata layer in XBEL 1.1. Regardless, I think I'll manage things for myself in XBEL as above and just use XSLT to convert to whatever starts to emerge as a viable option for export to other tools (or as a way to export to OPML if I have to).
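As a rough illustration of how little work it takes to get the feed list back out of this structure, here is a minimal sketch in Python 3 rather than XSLT. It assumes the folder above sits under an xbel root element and that the owner="webfeed" metadata convention is used exactly as shown; the function name walk is made up for this example.

```python
import xml.etree.ElementTree as ET

# The folder from above, wrapped in an xbel root (an assumption of this sketch).
XBEL = """<xbel>
  <folder>
    <title>Technology</title>
    <bookmark href="http://example.com/weblog/">
      <title>Example Weblog</title>
      <info>
        <metadata owner="webfeed">
          <link href="http://example.com/weblog/atom" type="application/atom+xml"/>
          <description>That good ole Weblog</description>
        </metadata>
      </info>
    </bookmark>
  </folder>
</xbel>"""


def walk(node, path=()):
    """Yield (folder path, title, home URL, feed URL) for each bookmark."""
    for child in node:
        if child.tag == "folder":
            # Recurse, extending the path with this folder's title.
            yield from walk(child, path + (child.findtext("title"),))
        elif child.tag == "bookmark":
            link = child.find("info/metadata[@owner='webfeed']/link")
            feed = link.get("href") if link is not None else None
            yield path, child.findtext("title"), child.get("href"), feed


for row in walk(ET.fromstring(XBEL)):
    print(row)
```

Because folders nest, the path tuple carries the full hierarchy, which is the part the Atom approach has to simulate with categories.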

[Uche Ogbuji]

via Copia

Trying to revive the #atom IRC channel

John L Clark and I separately needed a place to hang out and discuss Atom. The Atom Wikis talk of "#atom on freenode IRC network", but that channel was empty when John and I separately checked. In the hopes of reviving such a useful forum he and I now stay logged on there. If you're interested in Atom, consider joining us. I'm currently working on several Atom projects, both billable and non-billable, so I expect I'll be knee-deep in the format, and maybe the protocol, for a while.

[Uche Ogbuji]

via Copia

Python/XML column #37 (and out): Processing Atom 1.0

"Processing Atom 1.0"

In his final Python-XML column, Uche Ogbuji shows us three ways to process Atom 1.0 feeds in Python. [Sep. 14, 2005]

I show how to parse Atom 1.0 using minidom (for those who want no additional dependencies), Amara Bindery (for those who want an easier API) and Universal Feed Parser (with a quick hack to bring the support in UFP 3.3 up to Atom 1.0). I also show how to use DateUtil and Python 2.3's datetime to process Atom dates.
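To give a flavor of the no-dependencies route, here is a minimal sketch (not the code from the column) that reads entry titles and dates with minidom and the standard datetime module. It is written for Python 3.7 or later, where datetime.fromisoformat accepts offsets such as -08:00; a trailing "Z" is only accepted from Python 3.11. The sample feed, its URL, and the helper names text_of and entries are invented for this illustration.

```python
from datetime import datetime
from xml.dom import minidom

ATOM_NS = "http://www.w3.org/2005/Atom"

# A tiny made-up Atom 1.0 feed, just enough to exercise the code.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Weblog</title>
  <entry>
    <id>http://example.com/weblog/1</id>
    <title>First post</title>
    <updated>2005-09-14T15:38:00-08:00</updated>
  </entry>
</feed>"""


def text_of(parent, name):
    """Return the text of the first Atom descendant element called name."""
    node = parent.getElementsByTagNameNS(ATOM_NS, name)[0]
    return "".join(c.data for c in node.childNodes if c.nodeType == c.TEXT_NODE)


def entries(atom_text):
    """Return a list of (title, updated datetime) pairs, one per entry."""
    doc = minidom.parseString(atom_text)
    result = []
    for entry in doc.getElementsByTagNameNS(ATOM_NS, "entry"):
        updated = datetime.fromisoformat(text_of(entry, "updated"))
        result.append((text_of(entry, "title"), updated))
    return result


for title, updated in entries(FEED):
    print(title, updated.isoformat())
```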

As the teaser says, we've come to the end of the column in its present form, but it's more of a transition than a termination. From the article:

And with this month's exploration, the Python-XML column has come to an end. After discussions with my editor, I'll replace this column with one with a broader focus. It will cover the intersection of Agile Languages and Web 2.0 technologies. The primary language focus will still be Python, but there will sometimes be coverage of other languages such as Ruby and ECMAScript. I think many of the topics will continue to be of interest to readers of the present column. I look forward to continuing my relationship with the XML.com audience.

It is too bad that I won't get to some of the articles that I had in the queue, including coverage of lxml, pygenx, XSLT processing from Python, the role of PEP 342 in XML processing, and more. I can still squeeze some of these topics into the new column, I think, as long as I keep the emphasis on the Web. I'll also try to keep up my coverage of news in the Python/XML community here on Copia.

Speaking of such news, I forgot to mention in the column that I'd found an interesting resource from John Shipman.

[F]or my relatively modest needs, I've written a more Pythonic module that uses minidom. Complete documentation, including the code of the module in 'literate programming' style, is at:

http://www.nmt.edu/tcc/help/pubs/pyxml/

The relevant sections start with section 7, "xmlcreate.py".

[Uche Ogbuji]

via Copia