XML recursive directory listing, part 4

"Its hard to finish; or that Pythons tail's a long way away!", by Dave Pawson

Well, I posted Dave's dirlist.py as a little example, in part, of how quickly an XML expert/Python newbie could get something useful whipped up in 4Suite. Based on the very detail-oriented comments, it seems people in general have found it useful, and have run into limitations from the Python newbie side of that equation. Another example of people taking the code very seriously is Lars Trieloff's posting, "Your filesystem is an XML document".

As I mentioned in the posting, I have not put Dave's code through proper code review: I merely tweaked the command line code a bit to get it to work on my Linux box well enough for me to post an example of its workings. Dave has taken it all a bit to heart, but he shouldn't. He got very far in a short amount of time, and it's always the case in learning any new language or platform that the last 10% of polish is very hard won, and yet worth the experience.

I'm passing on to Dave all the comments on the other posting, and he's already sent me an updated version that fixes some issues. I'll post his version if he wishes, but I'll also give his code a proper, full review this weekend, and post that, for the folks who seem to want to use the code practically. The first thing I'll do is make it conform to PEP 8.
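For readers who just want the general shape of the task, here is a minimal sketch of a recursive directory listing rendered as XML. It is not Dave's dirlist.py and does not use 4Suite: it assumes only the Python standard library, and the element names (`directory`, `file`) and the function names are made up for illustration.

```python
import os
import xml.etree.ElementTree as ET


def dir_to_element(path):
    """Return a <directory> element describing path and everything below it."""
    name = os.path.basename(os.path.abspath(path))
    elem = ET.Element("directory", name=name)
    for entry in sorted(os.listdir(path)):
        full = os.path.join(path, entry)
        # Recurse into real directories only; symlinks are listed as files
        # so that a link cycle cannot cause endless recursion.
        if os.path.isdir(full) and not os.path.islink(full):
            elem.append(dir_to_element(full))
        else:
            ET.SubElement(elem, "file", name=entry)
    return elem


def dir_to_xml(path):
    """Serialize the listing for path as an XML string."""
    return ET.tostring(dir_to_element(path), encoding="unicode")
```

Calling `dir_to_xml(".")` returns the listing for the current directory as a single string. Error handling (unreadable directories, for instance) is deliberately left out.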

[Uche Ogbuji]

via Copia

XSLT 2.0 might be worth a second look, if...

XSLT 2.0 Is Way Cool, by Micah Dubinko

Micah. Kimber. Pawson. A handful of the folks who have, like me, turned up their nose at XSLT 2.0, are starting to reconsider. This is not a massive drugging campaign by XSLT 2.0 boosters: it seems all these folks still don't want anything to do with the oppressive type system of XPath and XSLT 2.0, and all balk at the stupendous complexity of the specifications. The key to me is that they see these specs as usable without choking on the types mess. Some folks were claiming this was possible 2 years ago or so, but when I checked, I wasn't convinced. Perhaps things have improved since then.

So I may be up for reconsidering my shunning of XSLT 2.0, but as Micah mentions, I'm not about to wade into 9 documents to work on implementation. (OK, so it would really be 4 or so, but those are 4 huge documents, compared to the 1.0 series, which was 2 modestly sized documents). If someone comes up with a coherent spec that omits the type info, it could somehow make its way into the 4Suite post 1.0.

Micah says, "XSLT 2.0 is a power tool. I don't think it will displace XSLT 1.0, which is remarkable for its power in a small package." For a while I've wanted to write a series of comparisons between XSLT 2.0 and Amara code (which includes XPath 1.0 support). Amara is my power tool, for when XSLT 1.0 + EXSLT is not enough, and I find it hard to imagine XSLT 2.0 as offering more power.

And I really need to get back to work on EXSLT. Folks are getting very restless with the fact that work on EXSLT has been fallow for most of 2005. I just wish I could count on some help. Part of what impedes me is a shrinking back from all the demands of the EXSLT community without many offers of help.

[Uche Ogbuji]

via Copia

Python/XML community:

lxml 0.6.0
Picket (updated)

lxml 0.6.0 is an alternative, more Pythonic binding for the libxml2 and libxslt XML processing libraries. Martijn Faassen says "lxml 0.6 contains important bugfixes, in particular better namespace support while handling attributes, as well as a fix for what turned out to be totally broken behavior for etree.tostring(). An upgrade is recommended."

Sylvain Hellegouarch updated Picket, a simple CherryPy filter for processing XSLT as a template language. It uses 4Suite to do the job. He incorporated feedback, including my own thoughts on Processor object management. A CherryPy "filter is an object that has a chance to work on a request as it goes through the usual CherryPy processing chain."

[Uche Ogbuji]

via Copia

Rewriting Source Content Descriptions as Versa Queries

I recently read Morten Frederiksen's blog entry about implementing Source Content Descriptions as SPARQL queries in Redland and was quite interested, especially by the consideration that such queries could be automatically generated and that the set of queries you would want to ask is small and straightforward. Even more interesting was Morten's step-by-step walk-through of how such queries would be translated to SQL queries on a Redland triple store sitting on top of MySQL (my favorite RDBMS deployment for 4RDF as well).

However, I couldn't help but wonder how such a set of queries would be expressed in Versa (in my opinion, a language more aligned with the data model it queries than its SQL-RDQL counterparts). So below is my attempt to port the queries into Versa:

Classes used in the store

SPARQL
SELECT DISTINCT ?Class
WHERE { ?R rdf:type ?Class }
Versa
set(all() - rdf:type -> *)

Predicates that are used with instances of each class

SPARQL
SELECT DISTINCT ?Class, ?Property
  WHERE { ?R rdf:type ?Class .
        OPTIONAL { ?R ?Property ?Object .
                   FILTER ?Property != rdf:type } }
Versa
difference(
  properties(set(all() - rdf:type -> *)),
  set(rdf:type)
)
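
To make concrete what the first two questions are asking, here is a rough sketch over a plain Python list of triples. It does not use Versa, SPARQL, Redland, or 4RDF: the toy data, the `ex:` names, and both function names are hypothetical, and the code follows the intent of the SPARQL versions rather than translating either query exactly.

```python
RDF_TYPE = "rdf:type"

# Hypothetical toy data: (subject, predicate, object) triples.
triples = [
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", RDF_TYPE, "ex:Person"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:doc1", RDF_TYPE, "ex:Document"),
    ("ex:doc1", "ex:title", "Notes"),
]


def classes_used(triples):
    """Distinct objects of rdf:type statements."""
    return {o for (s, p, o) in triples if p == RDF_TYPE}


def predicates_by_class(triples):
    """Map each class to the non-rdf:type predicates used on its instances."""
    # First pass: record the class(es) of every typed subject.
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    # Second pass: credit each predicate to the classes of its subject.
    result = {cls: set() for cls in classes_used(triples)}
    for s, p, o in triples:
        if p != RDF_TYPE:
            for cls in types.get(s, ()):
                result[cls].add(p)
    return result
```

Using Python sets plays the same role as `set()` and `DISTINCT` in the queries: duplicates collapse automatically.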

Do all instances of each class have a statement with each predicate?

It wasn't clear to me whether the intent was to check if all classes have a statement with each predicate, as specified by an ontology, or just to count how many properties each class instance has. The latter interpretation is the one I went with (it's also simpler). This particular query will return a list of lists, each inner list consisting of two values: the URI of a distinct class instance and the number of distinct properties described in statements about it (except rdf:type).

Versa
distribute(
  set(all() |- rdf:type -> *),
  '.',
  'length(
    difference(
      properties(.),
      set(rdf:type)
    )
  )'
)
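
The shape of that result (a list of two-item lists) can be sketched in plain Python as follows. This illustrates the interpretation described above and is not how a Versa engine evaluates `distribute`; the function name and the toy triples are assumptions for the example.

```python
RDF_TYPE = "rdf:type"


def property_counts(triples):
    """Return [resource, count] pairs for every resource that has an rdf:type.

    count is the number of distinct predicates, other than rdf:type,
    used in statements about that resource.
    """
    typed = {s for (s, p, o) in triples if p == RDF_TYPE}
    return sorted(
        [res, len({p for (s, p, o) in triples if s == res and p != RDF_TYPE})]
        for res in typed
    )


# Hypothetical toy data, for illustration only.
example = [
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:knows", "ex:carol"),
    ("ex:bob", RDF_TYPE, "ex:Person"),
    ("ex:bob", "ex:name", "Bob"),
]

print(property_counts(example))  # [['ex:alice', 2], ['ex:bob', 1]]
```

Note that `ex:knows` is counted once for `ex:alice` even though it appears in two statements, which matches the "distinct properties" wording.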

Is the type of object in a statement with each class/predicate combination always the same?

I wasn't clear on the intent of this query, either. I wasn't sure if he meant to ask this of the combination with all predicates defined in an ontology or all predicates on class instances in the graph being queried.

But there you have it.

NOTE: The use of the set function was in order to guarantee that only distinct values were returned and may have been used redundantly with functions and expressions that already account for duplication.

[Uche Ogbuji]

via Copia

SPARQL versus Versa

Booyakasha! In a few simple examples, Chime illustrates just why I was so annoyed when I read the SPARQL spec drafts. Eric also has some good words on the matter. Sure, I'm biased as one of the inventors of Versa, but my reaction has more to do with SPARQL than Versa. Frankly, SPARQL bends my brain and twists my gut. Before I continue with my rant, I should say that I'm not blameless in this matter. I have a huge respect for the people working on SPARQL, and a lot of them (Dan Brickley, Libby Miller and Kendall Clark come to mind) were very polite in trying to get me more directly engaged in the standardization process. I just never had the time for more than the informal discussions I had with these folks, and apparently those who prefer SQLish syntax ended up dominating the important discussion or decisions.

It has never been Versa or the highway for me, but I was never going to swallow an RDF query language that used SQLish syntax. I always wanted a path-like language, preferably with a very "composable" syntax (which is why I went with such a functional language flavor in Versa). I'm far from alone in this. There have been many other respectable "pathy" RDF query proposals, and the feedback on Versa has been almost universally positive.

Apparently some people are very tied to their "SELECT"s. Isn't there room for those of us who just find it way too much of a conceptual mismatch from SQL conventions to RDF graphs? I have no choice but to make my own room. I'll continue working on Versa: it's time to start gathering my Versa 2.0 thoughts together. I'll implement Versa 2.0 for 4Suite, and help anyone who wants to implement it for any other tool (I hope that encourages Eric a bit). I may work on a Versa to SPARQL converter, but honestly, that's as much as I expect to ever have to do with SPARQL. No offense to any of the fine people involved. It just doesn't come close to fitting my head.

Chimezie Ogbuji

via Copia

X-Chat and irc:// URLs

I wanted to get to the CherryPy IRC channel and I had the handy URL irc://irc.oftc.net/cherrypy. I could fiddle with the URL bits in the server dialog box, but I wondered why I couldn't just get X-Chat to swallow the URL whole. Turns out this is not a capability that X-Chat is all that eager to present to users. Nothing in the FAQs, nothing in the menus, and nothing without a lot of digging in the online help. There is the command line: `xchat irc://irc.oftc.net/cherrypy`, but this doesn't seem very useful in practice if you already have IRC sessions set up for autoload. It basically just launches a new X-Chat session with all your defaults, and seems to ignore the URL you specified on the command line (I'm using X-Chat version 2.4.0). I've seen others voicing similar complaints.

Anyway, the recipe I found that works for me is:

  • Create a new tab (Ctrl-T)
  • Type /server [irc-url] in the entry line (e.g. `/server irc://irc.oftc.net/cherrypy`)
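
If you do end up fiddling with the URL bits by hand (for the server dialog, say), this small sketch shows one way to pull them apart. It has nothing to do with X-Chat's own code: it assumes Python's standard `urllib.parse`, the function name is made up, and the fallback port of 6667 is the conventional plain-text IRC port rather than anything stated in the URL.

```python
from urllib.parse import urlsplit

DEFAULT_IRC_PORT = 6667  # assumption: conventional port when the URL names none


def split_irc_url(url):
    """Split an irc:// URL into (host, port, channel)."""
    parts = urlsplit(url)
    if parts.scheme != "irc":
        raise ValueError("not an irc:// URL: %r" % url)
    channel = parts.path.lstrip("/")
    # Channel names in these URLs usually omit the leading '#'.
    if channel and not channel.startswith(("#", "&")):
        channel = "#" + channel
    return parts.hostname, parts.port or DEFAULT_IRC_PORT, channel


print(split_irc_url("irc://irc.oftc.net/cherrypy"))
# ('irc.oftc.net', 6667, '#cherrypy')
```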

Works well, but shouldn't be such a deep secret. What X-Chat really needs is an "Open URL" menu item. Simple as that. And it really needs its command line behavior cleaned up if it wants to be the default handler for irc:// URLs clicked in other apps such as browsers.

[Uche Ogbuji]

via Copia

Yet another pretty damn cool Wiki

Wikinews

Maybe these are all old hat, but not too long after discovering Wiktionary, I've come across another cool Wiki version of a classic resource. Wikinews is news by open community contribution. Terrifying or edifying prospect, based on your point of view. In my view, it's a nice opening for me to find stuff, and then do my own research to check up on it.

The coverage seems a bit laggy and a bit spotty (a search for "Nigeria" brought up only one story all month). All this is as expected for a volunteer-run site. It's a neat idea and neat execution, natheless.

[Uche Ogbuji]

via Copia

NOAA's arcs: the auroral oval

Auroral Activity extrapolated from the NOAA POES satellite

The only times I can remember seeing the Aurora Borealis (I've never been south of the equator, so no chance of having seen the Aurora Australis) were once near Ashland, Wisconsin, and, surprisingly (to me), once here in Colorado. Lori and I were driving along and thought there was a strange shimmer in the sky. We then saw many other cars pulled over to get a better look. We followed suit and watched the show. On the NOAA page I can wistfully look from time to time to see whether there is any chance of catching the show again without having to venture polewards of either 45th parallel in winter.

[Uche Ogbuji]

via Copia

Quotīdiē

Il me faut le cacher au plus intime de mes veines
L’Ancêtre à la peau d’orage sillonnée d’éclairs et de foudre
Mon animal gardien, il me faut le cacher
Que je ne rompe le barrage des scandales.
Il est mon sang fidèle qui requiert fidélité
Protégeant mon orgueil nu contre
Moi-même et la superbe des races heureuses…

Léopold Sédar Senghor, "Le Totem"

When the late, great Senghor expressed a sentiment, it stayed expressed. Founding president of la République du Sénégal (after an exile for revolutionary activities), and a poet of the Négritude movement, Senghor was one of West Africa's most astounding minds. "Le Totem" (above is the complete poem) is one of very few French poems I've memorized. It expresses a sentiment that I don't know that I feel directly, but that I can well imagine based on knowing so many Africans in the diaspora (and older ones, in particular). Here is my poor student's translation:

I'm forced to hide in my most intimate veins
The ancestor with the hide of storms streaked and burned with lightning
My guardian animal, I must hide it
So that I do not breach the barrier of scandal.
It is my faithful blood that requires faithfulness
Protecting my inborn pride against
My very self and the superb among the happy races

This poem has a lot that is difficult to render faithfully into English, and in some cases, I've preferred a somewhat unidiomatic transliteration to an anglophone translation that would lose too much of the nuance (the last line is the main example).

Another place where I could barely approximate is "sillonnée d’éclairs et de foudre". I've always had a vague feeling of the distinction between these two French words for "lightning", the first being more a display of lightning and the second being more of a thunderclap, such as Zeus would have hurled at impudent mortals. See below for more on this distinction. I always think of Senghor's line as juxtaposing the white flash of "éclairs" with the blackened result of "foudre", using the apt verb "sillonner", which means "plough" as well as "streak".

I have the same attitude towards the Négritude movement as Nigerian Nobel Laureate and great playwright Wole Soyinka. Soyinka said to Senghor: "A tiger does not proclaim its tigritude. It acts." The quote is from Myth, Literature and the African World (which is coming up on my re-reading list). I agree with complaints about Négritude as an overall notion. For my part, being an avid student of Western classics has never made me feel I cannot also soak myself in my own rich West African heritage. Négritude taken too carelessly can lead to a dangerous combination of anomie and chauvinism. But it's very hard to accuse Senghor and Césaire, the patrons of the movement, of themselves falling into such a trap. They and their colleagues, through hard work and masterly writing, carved into the world's consciousness a testament to the vast intellectual resources of their native land. They didn't just proclaim. They did act. And the value of their legacy is immeasurable.

I have the poem in Selected Poems of Senghor, edited by Abiola Irele (Cambridge University Press, 1977), which, according to my notes, I bought at Nsukka in 1989 (for ₦10.00). Incredibly, I can't find a good in-print source of Senghor's poetry in French. I must just not be searching rightly. If anyone can recommend one for fellow readers (I'm all set with my Irele edition), please leave a comment.

But appropriately enough, since it's Tuesday again, I have a bit more on "éclairs" versus "foudre". The visceral nature of this distinction was brought home to me a couple of weeks ago at la Table Francophone when Karen was explaining to me how a freak spring thunder and hailstorm had ruined her garden. She said something like "et partout des éclairs", gesturing upwards with both open palms. In my response I said something about "foudre", using the word I'm more familiar with for "lightning" ("fouldre" in Villon's (Old French) L'Épitaph, which I worked from in an earlier Quotīdiē). Karen gave me an odd look, and clarified: "éclairs". Since I'm there to improve my French I asked her for a detailed explanation of the difference. She explained that "éclairs" is lightning with the connotation of distant flashes in the sky, and that "foudre" is lightning with the connotation of striking the ground (or someone), with violent accompaniment of thunder, and the whole bit. Basically, the former is nature's display, and the latter is nature's vengeance. Makes sense given that "foudroyer" means to blast or strike (as with lightning), and "foudroyant!" is an exclamation (based on the participle) mixing terror and excitement. I had to stop using "foudroyant!" when my francophone friends would tease me about its quaintness (I suppose my beloved "donnerwetter" sounds just as quaint to a contemporary German).

[Uche Ogbuji]

via Copia

Contexts, and Scopes, and Provenance, Oh My!

Being the KR theory masochist I am, I've lately been wrestling with the concept of RDF Graph Contexts - yeah, ouch! The motivation is to determine the optimal RDF statement vector size or database configuration for representing RDF sufficiently and in a scalable way. Graph / SubGraph identification seems advantageous for query optimization for large (> 5 Million Triples) RDF stores - especially those built on RDBMSs. At any rate, below are some good references I found on the subject(s):

I'm sure I've missed some other useful ones
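
To give a concrete picture of what "statement vector size" means here, the sketch below stores each statement as a quad (subject, predicate, object, context) and indexes by context, so that a query scoped to one graph never touches statements in the others. It is a toy in-memory illustration, not 4RDF's or any RDBMS-backed store's design; the class and method names are hypothetical.

```python
from collections import defaultdict


class QuadStore:
    """Toy statement store where each statement carries a context (graph) id."""

    def __init__(self):
        # context id -> set of (subject, predicate, object) triples
        self._by_context = defaultdict(set)

    def add(self, subject, predicate, obj, context):
        self._by_context[context].add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None, context=None):
        """Yield (s, p, o, context) quads; None acts as a wildcard."""
        # Naming a context restricts the scan to that one subgraph.
        contexts = [context] if context is not None else list(self._by_context)
        for ctx in contexts:
            for s, p, o in self._by_context.get(ctx, ()):
                if subject in (None, s) and predicate in (None, p) and obj in (None, o):
                    yield (s, p, o, ctx)


store = QuadStore()
store.add("ex:alice", "ex:name", "Alice", "ex:graph1")
store.add("ex:alice", "ex:name", "Alice A.", "ex:graph2")
store.add("ex:bob", "ex:name", "Bob", "ex:graph2")

# Only statements asserted in ex:graph2 are scanned and returned.
in_graph2 = sorted(store.match(predicate="ex:name", context="ex:graph2"))
```

In a relational layout the analogous choice is whether the statements table carries a fourth, indexed context column, which is the kind of configuration question raised above.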

[Uche Ogbuji]

via Copia