Another small 4Suite MarkupWriter example: XHTML 1.1

I was writing code to emit XHTML 1.1 using 4Suite and just to double-check the doc types I looked at the spec. I thought it might be useful to write up a small MarkupWriter example for emitting the example in the spec.

from Ft.Xml.MarkupWriter import MarkupWriter
from xml.dom import XHTML_NAMESPACE, XML_NAMESPACE

XHTML_NS = unicode(XHTML_NAMESPACE)
XML_NS = unicode(XML_NAMESPACE)
XHTML11_SYSID = u"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"
XHTML11_PUBID = u"-//W3C//DTD XHTML 1.1//EN"

writer = MarkupWriter(indent=u"yes", doctypeSystem=XHTML11_SYSID,
                      doctypePublic=XHTML11_PUBID)
writer.startDocument()
writer.startElement(u'html', XHTML_NS, attributes={(u'xml:lang', XML_NS): u'en'})
writer.startElement(u'head', XHTML_NS)
writer.simpleElement(u'title', XHTML_NS, content=u'Virtual Library')
writer.endElement(u'head', XHTML_NS)
writer.startElement(u'body', XHTML_NS)
writer.startElement(u'p', XHTML_NS)
writer.text(u'Moved to ')
writer.simpleElement(u'a', XHTML_NS,
                     attributes={u'href': u'http://vlib.org/'},
                     content=u'vlib.org')
writer.text(u'.')
writer.endElement(u'p', XHTML_NS)
writer.endElement(u'body', XHTML_NS)
writer.endElement(u'html', XHTML_NS)
writer.endDocument()

It's worth mentioning that this example would be even simpler with template output facilities I've proposed for Amara.
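
For readers without 4Suite installed, here is a rough standard-library counterpart of the MarkupWriter code above. It is only a sketch, written for Python 3 rather than the Python 2 of the original listing. It assumes xml.sax.saxutils.XMLGenerator as the streaming writer; since XMLGenerator has no doctype support, the DOCTYPE line is written by hand, and the output is not indented. The helper names (ns_attrs, element, build_xhtml11) are made up for illustration.

```python
import io
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesNSImpl

XHTML_NS = "http://www.w3.org/1999/xhtml"
XML_NS = "http://www.w3.org/XML/1998/namespace"
XHTML11_SYSID = "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"
XHTML11_PUBID = "-//W3C//DTD XHTML 1.1//EN"


def ns_attrs(mapping):
    # Keys are (namespace, local name) pairs. The qname dict is left empty
    # because XMLGenerator works out the prefixes itself.
    return AttributesNSImpl(mapping, {})


def element(gen, name, text, attrs=None):
    # Hypothetical helper, playing the role of MarkupWriter.simpleElement.
    gen.startElementNS((XHTML_NS, name), None, ns_attrs(attrs or {}))
    gen.characters(text)
    gen.endElementNS((XHTML_NS, name), None)


def build_xhtml11():
    out = io.StringIO()
    gen = XMLGenerator(out, encoding="utf-8")
    gen.startDocument()
    # No doctype API on XMLGenerator, so write the declaration directly.
    out.write('<!DOCTYPE html PUBLIC "%s" "%s">\n' % (XHTML11_PUBID, XHTML11_SYSID))
    gen.startPrefixMapping(None, XHTML_NS)  # XHTML as the default namespace
    gen.startElementNS((XHTML_NS, "html"), None,
                       ns_attrs({(XML_NS, "lang"): "en"}))
    gen.startElementNS((XHTML_NS, "head"), None, ns_attrs({}))
    element(gen, "title", "Virtual Library")
    gen.endElementNS((XHTML_NS, "head"), None)
    gen.startElementNS((XHTML_NS, "body"), None, ns_attrs({}))
    gen.startElementNS((XHTML_NS, "p"), None, ns_attrs({}))
    gen.characters("Moved to ")
    element(gen, "a", "vlib.org", {(None, "href"): "http://vlib.org/"})
    gen.characters(".")
    gen.endElementNS((XHTML_NS, "p"), None)
    gen.endElementNS((XHTML_NS, "body"), None)
    gen.endElementNS((XHTML_NS, "html"), None)
    gen.endPrefixMapping(None)
    gen.endDocument()
    return out.getvalue()


xhtml = build_xhtml11()
print(xhtml)
```

The call sequence mirrors the MarkupWriter version almost line for line; the main differences are the (namespace, local name) tuple order and the hand-written doctype.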

[Uche Ogbuji]

via Copia

Python/XML column #35: EaseXML and more on Unicode

"EaseXML: A Python Data-Binding Tool"

In this month's Python and XML column, Uche Ogbuji examines a new XML data-binding tool for Python: EaseXML. [Jul. 27, 2005]

The main focus of this article is EaseXML, another option for XML data binding. I found the package rather rough around the edges. I also included a section with a bit more on Unicode, which was the topic of the last two articles "Unicode Secrets" and "More Unicode Secrets". This time I introduced the unicodedata module, which provides useful information about characters from the Unicode standard database.
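
To give a flavor of what unicodedata offers, here is a minimal sketch (written for Python 3, and not taken from the article itself) that looks up the name and general category of a character and decomposes it with NFD normalization. The describe helper is a made-up name for illustration.

```python
import unicodedata


def describe(ch):
    # Returns (character name, general category) from the Unicode database.
    # The second argument to name() is a fallback for unnamed code points.
    return unicodedata.name(ch, "<unnamed>"), unicodedata.category(ch)


ch = "\u012b"  # the i-with-macron used in "Quotīdiē"
name, category = describe(ch)
print(name, category)

# Canonical decomposition splits the letter into a base character
# plus a combining mark.
decomposed = unicodedata.normalize("NFD", ch)
print([unicodedata.name(c) for c in decomposed])
```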

[Uche Ogbuji]


Is it coz I is not black?

James Governor pointed me to "Blacks Only!" and I thought my reaction was worth a Copia entry.

When I was in college in Milwaukee I sort of joined the National Society of Black Engineers (NSBE), as in, I played for their intramural basketball team, attended a few meetings, and (independent matter) made friends with a number of the members. I don't think I even joined formally, but anyway my then roommate and current business partner Mike Olson challenged me on it. He put up the usual counterexample of the horrified response if he'd started up a white engineers' club. As I recall, I started with a half-hearted defense, before admitting that I was uncomfortable with the idea. I'd started out being friends with NSBE members, and never made an explicit, personal, moral stand about the club.

I do think the general idea of exclusion on the basis of race is dangerous, regardless of what past injustices you think you're trying to redress. It's also confusing. I'm raising three mixed race children and where do they fit in with such boundaries? Lori and I generally respond indignantly whenever we're supposed to classify the kids as one race or another. Luckily the census has a mixed race category these days. When they grow up, they can choose to associate as they please, but right now, we have no intention of disrespect to any branch of their rich heritage.

But I'm not a fundamentalist on integration. I understand the occasional motivation for exclusionary clubs. Women's networking groups spring up because even now it's hard for women to find equitable general fora for business. No doubt some other disadvantaged groups such as Black Americans have the same problem, so whereas I think the idea behind NSBE can be dangerous, you won't catch me entirely condemning it. I think some of the most disturbing aspects of the case in the linked article are specific to that case.

For one thing, I read that this blacks-only golf club sees itself as a charity. This beggars common sense considering that they would happily accept "a young, black, successful third-generation, Oxford educated Brit". When you insult the intelligence of those whom you exclude rather than engaging with them to honestly discuss the practical need for exclusion, you're asking for trouble, and you can't expect sympathy.

In South Africa, I think this sort of exclusion is especially problematic because it tarnishes the extraordinary success of the fall of Apartheid. I know and respect a lot of white South Africans, and based on these associations and my following of current events in South Africa, I believe that a gratifying number of the white population in that country is horrified at their racist legacy. Sure, they might not have come to such reform if not for the forceful realities of the freedom movements (much more important than even the infamously leaky sanctions), but all that matters is that they did the right thing in the end, for whatever reasons, and are now largely committed to justice. In turn Mandela, Tutu, etc. showed the most unbelievable courage in fostering an atmosphere of reconciliation. I think the likes of the black golf club cause very dangerous and unnecessary rifts in this peace. Even if it doesn't cause bloody conflict, it will continue the flight of white South Africans out of the country, and I think this is a tremendous loss.

[Uche Ogbuji]


Quotīdiē

Rap snitches
Telling all their business
Sit in the court and be their own star witness
"Do you see the perpetrator"
"Yeah, I'm right here"
Fuck around, get the whole label sent up for years
True. There's rules to this shit
Fools dare care
Everybody wants to rule the world with tears for fear
Yeah yeah tell 'em—tell it on the mountain hill
Running up their mouth bill
Everybody doubting still.
Informer, keep it up and get tested
Pop pleated bubble vest or double breasted.
He keep a lab down south in the little beast
So much heat you would have thought it was the Middle East.
A little grease always keeps the wheels a spinning
Like sitting on 23s to get the squealers grinning,
Hitting on many trees feel real linen,
Spitting on enemies enemies get the skill for ten men.
With no brains but gum flap
You said there's gun clap,
Then you fled after one slap
Son, shut your trap save it for the bitches
Mmmm. Delicious. Rap snitch knishes.

MF Doom—from "Rapp Snitch Knishes"—MM..Food?

[Note: "with tears for fear" is Doom's pun, not my typo]

This song is nothing but wicked, but in a sly way. I get an image of Berry Gordy leading his famous Motown quality control sessions. "Rapp Snitch Knishes" drops on the platter. The focus group looks around at each other and wonders "what is this weirdness?" What's with the over-tightened electric guitar loop offset by the sauntering bass riff? What's with the staccato flow? Maybe? Maybe? Nah. Dump it. Then later on as they're on their way to the car, they all realize that one song from the day's session is firmly lodged in their heads. And they're still replaying it to themselves the next day, and all that week. What do you know? They should not have dumped that song. I think that's the major label attitude to a lot of genius of the Metal Face Doom sort. Fly but too risky.

"Rapp Snitch Knishes" sounds as if it shouldn't be any good, but it's actually a mini masterpiece of abstract hip-hop. And I just love the subject matter. MF Doom is mocking all the superbadass MCs who like to boast on how many people they've killed, how much drugs they've sold, and how many hoes they've pimped. Any sensible person figures that:

  1. Either they're fronting Vanilla Ice type punks or
  2. They're frank but stupid, saving the feds a lot of investigative budget to build a case against them

A lot of MCs make fun of category 1, but it took Doom's audacity to pull cards on category 2. And if people don't believe he has a case, they need look no further than Murder Inc. and Death Row, both record labels that had to change their names because of the effects of Rap Snitching. And while 50 Cent's G Unit is busy reveling in Murder Inc.'s misfortune, their fans should reflect on the fact that there is no bigger Rap Snitch Knish right now than 50 Cent (well, The Game is making quite a run at that title).

Mr. Fantastik (whose 'dro is the stickiest, he says) guests lovely on the track, and the playful back and forth is enough fun that I hope Doom and Fantastik (whom I'd never heard from before) team up more often in future. It's always worth checking for MF Doom, a true hip-hop veteran, one third, as "Zev Love X", of classic group KMD, which also included Doom's brother Subroc. KMD met their demise because they said exactly what they thought, and MF Doom continues the tradition. His all time classic is Madvillainy which is ingenious lunacy. I also recommend Spitkicker's The Next Spit, volume 3, a mix CD hosted by Doom, and featuring a couple of tracks from MM..Food?

MM..Food? and Madvillainy are two of the top ten albums of 2004. If you've been sleeping, wake up and cop that underground goodness...early. And the next time you hear some MC killing hundreds of victims on wax, just think quietly to yourself. Mmmm. Delicious. Rap snitch knishes.

[Uche Ogbuji]


Quotīdiē

Benediction

God banish from your house
The fly, the roach, the mouse

That riots in the walls
Until the plaster falls;

Admonish from your door
The hypocrite and liar;

No shy, soft tigrish fear
Permit upon your stair,

Nor agents of your doubt.
God drive them whistling out.

Let nothing touched with evil,
Let nothing that can shrivel

Heart's tenderest frond, intrude
Upon your still deep blood.

Against the drip of night
God keep all windows tight,

Protect your mirrors from
Surprise, delirium,

Admit no trailing wind
Into your shuttered mind

To plume the lake of sleep
With dreams. If you must weep

God give you tears, but leave
You secrecy to grieve,

And islands for your pride,
And love to nest in your side.

God grant that, to the bone,
Yourself may be your own;

God grant that I may be
(my sweet) sweet company.

Stanley Kunitz—"Benediction"

Stanley Kunitz turns 100 today. I can't say that he ranks among my favorite poets, but in the above he certainly wrote a piece that ranks among my favorite poems. And to write one great poem in a lifetime is quite an achievement. Many people are celebrating Kunitz's milestone, but as an NPR fan, I'll naturally wave at the coverage on All Things Considered, which is almost entirely taken up by what sounds like a new poem of his, "The Long Boat". It's a nice piece (the text is on the page I just linked to) using the Viking funereal boat as its central metaphor, offering some very palpable images and a finely balanced ending that can only come from the quiet wisdom of long years contemplating that sepulchral voyage.

As I said in my piece on Richard Eberhart:

I meant to link to "Benediction" but I can't find a respectable transcription of it on-line. It deserves its own entry, so some other day I'll type it in for Quotīdiē. But I do want to mention that I found "A Young Greek, Killed in the Wars", "The Fury of Aerial Bombardment" and "Benediction" all in my favorite small poetry book, John Wain's Anthology of Modern Poetry (Hutchinson, 1963), ISBN 0090671317. It's out of print and not easy to find, even used (here are the listings on Amazon UK Marketplace). I bought it in 1988 at the University of Nigeria and it has been one of my most treasured books all this time. It's a superb collection, and if you can lay your hands on a copy, I suggest you do so.

Well, I've put in that promised labor, and here is "Benediction" on line. And yes, I've gone on about that Wain book, mentioning it yesterday as well. What can I say? It's worth all the repeated mention, except that you can't buy it anymore, it seems. But I had an idea yesterday. Soon, I'll post the table of contents here, with links to on-line versions of the poems where possible. This way you can at least enjoy Wain's marvelous selection without suffering through my Quotīdiē ramblings into the fathomless future.

Here is Wain on Kunitz:

"Benediction" and "The War against the Trees" are good examples of Stanley Kunitz's open, lyrical style, and need no comment....

And surely you agree, reading "Benediction". Who says good poetry has to be inscrutable?

Watch this space for more from Wain. And read a Kunitz poem or two this weekend. It's quiet and very intelligent entertainment.

[Uche Ogbuji]


Quotīdiē ❧ Udoka Julian Ogbuji

Morning Song
Love set you going like a fat gold watch.
The midwife slapped your footsoles, and your bald cry
Took its place among the elements.

Our voices echo, magnifying your arrival. New statue.
In a drafty museum, your nakedness
Shadows our safety. We stand round blankly as walls.

I'm no more your mother
Than the cloud that distils a mirror to reflect its own slow
Effacement at the wind's hand.

All night your moth-breath
Flickers among the flat pink roses. I wake to listen:
A far sea moves in my ear.

One cry, and I stumble from bed, cow-heavy and floral
In my Victorian nightgown.
Your mouth opens clean as a cat's. The window square

Whitens and swallows its dull stars. And now you try
Your handful of notes;
The clear vowels rise like balloons.

Sylvia Plath—"Morning Song"

The child has a name now. Udoka Julian Melayo Ogbuji. Udoka means roughly "peace reigns". As with many Igbo names, it has a couple of levels of meaning for us, mostly as a hope for unlikely peace in a household with three boys, and partly as an imprecation for peace in troubled times, globally. It's shortened to "Udo", pronounced "oo-doh" with stress on the second syllable. Julian follows from the month (I suppose Jide could have been "August", but we preferred "Maxwell"). Melayo means roughly "relax", and is my father's contribution. We were all hoping for a girl, and even though it's a boy, we're all easy like Sunday morning.

And so speaking of mornings, what better poem for the mood than one of my favorite Plath pieces, another discovery from my favorite small poetry book, John Wain's Anthology of Modern Poetry (Hutchinson, 1963), ISBN 0090671317, which I've mentioned before. I love reciting "Morning Song" to my children at bedtime, and doubly so with the roseate memory of Udoka's birth still fresh. One thing about reciting it is that I cannot bring myself to say "New statue. In a drafty museum,...". I always end up saying: "New statue in a drafty museum,..." Another thing is the lovely, last metaphor, the vivid synaesthesia that is so typical of Plath's keen sensibility. It's a very romanticized fallacy of a newborn baby's very nasal cry, but also a very crafty expression of the fact that this sound is music to any parent's ears. And the images in this poem just keep coming at you like, well, like purple pila. I'm not one for image for the sake of image, but Plath is one of the few with the craft to pull it off, as I discussed earlier.

And in honor of Lori, who brought Udoka forth to the world, here's another poem in the genus.

Metaphors
I'm a riddle in nine syllables,
An elephant, a ponderous house,
A melon strolling on two tendrils.
O red fruit, ivory, fine timbers!
This loaf's big with its yeasty rising.
Money's new-minted in this fat purse.
I'm a means, a stage, a cow in calf.
I've eaten a bag of green apples,
Boarded the train there's no getting off.

—Sylvia Plath—"Metaphors"

[Uche Ogbuji]


'Tis a boy

If I've disappeared from correspondence over the past few days, here's why:

Lori did her amazingly quick labor miracle thingie and brought the new boy into the world at our home at 21:16, Monday, 25 July, 2005. (It was a planned home birth, expert midwife in attendance, and all that). We haven't named him yet, in part because we had hoped for a girl and really didn't give the other 50% contingency much thought, and partly because I, and surprisingly Lori, tend towards the Igbo tradition of waiting to meet the little one before you burden him with a life-long name (Jide wasn't named until a few days after his birth).  Update: he is Udoka Julian Melayo Ogbuji.

Mother and child are doing quite well. Eldest boy Osita is very excited to have another baby brother, and middle son Jide is largely concerned with his own affairs (we do have a few pictures of him playing with his new brother). I appear to be continuing evidence of the Ogbuji Y chromosome entrenchment: My parents had three children, all boys, and now they have 5 grandchildren, all boys. But putting my three sons side by side, I find quite marvelous in fact what Lori and I had considered scary in imagination. We are most emphatically not trading this batch in, least of all the newest household terror.

More pictures on my Flickr photo stream.

[Uche Ogbuji]


XML data bindings, static languages, dynamic languages

A discussion about the brokenness of W3C XML Schema (WXS) on XML-DEV turned interestingly to the topic of the limitations of XML data bindings. This thread crystallized into a truly bizarre subthread where we had Mike Champion and Paul Downey actually trying to argue that the silly WXS wart xsi:nil might be more important in XML than mixed content (honestly the arrogance of some of the XML gentry just takes my breath away). As usual it was Eric van der Vlist and Elliotte Harold patiently arguing common sense, and at one point Pete Cordell asked them:

How do you think a data binding app should handle mixed content? We lump a complex types mixed content into a string and stop there, which I don't think is ideal (although it is a common approach). Another approach could be to have strings in your language binding classes (in our case C++) interleaved with the data elements that would store the CDATA parts. Would this be better? Is there a need for both?

Of course as author of Amara Bindery, a Python data binding, my response to this is "it's easy to handle mixed content." Moving on in the thread he elaborates:

Being guilty of being a code-head (and a binding one at that - can it get worse!), I'm keen to know how you'd like us to make a better fist of it. One way of binding the example of "<p>This is <strong>very</strong> important</p>" might be to have a class structure that (with any unused elements ignored) looks like:-

class p
{
    string cdata1;        // = "This is "
    class strong strong;
    string cdata2;        // = " important"
};

class strong
{
    string cdata1;        // = "very"
};

as opposed to (ignoring the CDATA):

class p
{
    class strong strong;
};

class strong
{
};

or (lumping all the mixed text together):

class p
{
    string mixedContent;    // = "<p>This is <strong>very</strong> important</p>"
};

Or do you just decide that binding isn't the right solution in this case, or a hybrid is required?

It looks to me like a problem with poor expressiveness in a statically, strongly typed language. Of course, static versus dynamic is a hot topic these days, and has been since the "scripting language" diss started to wear thin. But the simple fact is that Amara doesn't even blink at this, and needs a lot less superstructure:

>>> from amara.binderytools import bind_string
>>> doc = bind_string("<p>This is <strong>very</strong> important</p>")
>>> doc.p
<amara.bindery.p object at 0xb7bab0ec>
>>> doc.p.xml()
'<p>This is <strong>very</strong> important</p>'
>>> doc.p.strong
<amara.bindery.strong object at 0xb7bab14c>
>>> doc.p.strong.xml()
'<strong>very</strong>'
>>> doc.p.xml_children
[u'This is ', <amara.bindery.strong object at 0xb7bab14c>, u' important']

There's the magic. All the XML data is there; it uses the vocabulary of the XML itself in the object model (as expected for a data binding); it maintains the full structure of the mixed content in a very easy way for the user to process. And if we ever decide we just want the content, unmixed, we can just use the usual XPath technique:

>>> doc.p.xml_xpath(u"string(.)")
u'This is very important'
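
For readers who don't have Amara to hand, the shape of that result can be approximated with the standard library. This is only a sketch of the idea (Python 3, xml.dom.minidom), not how Amara works internally; mixed_children and string_value are made-up helper names standing in for xml_children and the XPath string(.) call.

```python
from xml.dom import minidom


def mixed_children(element):
    # Text nodes come back as plain strings and child elements as DOM
    # nodes, in document order, much like Amara's xml_children list.
    children = []
    for node in element.childNodes:
        if node.nodeType == node.TEXT_NODE:
            children.append(node.data)
        elif node.nodeType == node.ELEMENT_NODE:
            children.append(node)
    return children


def string_value(element):
    # Concatenates all descendant text, like XPath's string(.).
    parts = []
    for node in element.childNodes:
        if node.nodeType == node.TEXT_NODE:
            parts.append(node.data)
        elif node.nodeType == node.ELEMENT_NODE:
            parts.append(string_value(node))
    return "".join(parts)


doc = minidom.parseString("<p>This is <strong>very</strong> important</p>")
p = doc.documentElement
children = mixed_children(p)
print(children)
print(string_value(p))
```

What the DOM version lacks is the vocabulary-driven access (doc.p.strong), which is the part a data binding adds on top.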

So there. Mixed content easily handled. Imagine my disappointment at the despairing responses of Paul Downey and even Elliotte Harold:

Personally I'd stay away from data binding for use cases like this. Dealing with mixed content is hardly the only problem. You also have to deal with repeated elements, omitted elements, and order. Child elements just don't work well as fields. You can of course fix all this, but then you end up with something about as complicated as DOM.

Data binding is a plausible solution for going from objects and classes to XML documents and schemas; but it's a one-way ride. Going the other direction: from documents and schemas to objects and classes is much more complicated and generally not worth the hassle.

As I hope my Amara example shows, you do not need to end up with anything nearly as complex as DOM, and it's hardly a one-way ride. I think it should be made clear that a lot of the difficulties that seem to stem from Java's own limitations are not general XML processing problems, and thus I do not think they should properly inform a problem such as the emphasis of an XML schema language. In fact, I've always argued that it's the very marrying of XML technology to the limitations of other technologies such as statically-typed OO languages and relational DBMSes that results in horrors such as WXS and XQuery. When designers focus on XML qua XML, as the RELAX NG folks did and the XPath folks did, for example, the results tend to be quite superior.

Eric did point out Amara in the thread.

An interesting side note—a question about non-XHTML use cases of mixed content (one even needs to ask?!) led once again to mention of the most widely underestimated XML modeling problem of all time: the structure of personal names. Peter Gerstbach provided the reminder this time. I've done my bit in the past.

[Uche Ogbuji]


Beyond HTML tidy, or "Are you a chef? 'Cause you keep feeding me soup."

In my last entry I presented a bit of code to turn the Amara XML toolkit into a super duper HTML slurper creating XHTML data binding objects. Tidy was the weapon. Well, y'all readers wasted no time pimping me the Soups. First John Cowan mentioned his TagSoup. I hadn't considered it because it's a Java tool, and I was working in Python. But I'd ended up using Tidy through the command line anyway, so TagSoup should be worth a look.

And hells yeah, it is! It's easy to use, mad fast, and handles all the pages that were tripping up Tidy for me. I was able to very easily update Amara's tidy.py demo to use TagSoup, if available. Making it available on my Linux box was a simple matter of:

wget http://mercury.ccil.org/~cowan/XML/tagsoup/tagsoup-1.0rc3.jar
ln -s tagsoup-1.0rc3.jar tagsoup.jar

That's all. Thanks, John.

Next up Dethe Elza asked about BeautifulSoup. As I mentioned in "Wrestling HTML", I haven't done much with this package because it's more of a pull/scrape approach, and I tend to prefer having a fully cleaned up XHTML to work with. But to be fair, my last extract-the-mp3-links example was precisely the sort of case where pull/scrape is OK, so I thought I'd get my feet wet with BeautifulSoup by writing an equivalent to that code snippet.

import re
import urllib
from BeautifulSoup import BeautifulSoup
url = "http://webjay.org/by/chromegat/theclassicnaijajukebox2823229"
stream = urllib.urlopen(url)
soup = BeautifulSoup(stream)
for incident in soup('a', {'href' : re.compile('\\..*mp3$')}):
    print incident['href']

Very nice. I wonder how far that little XPath-like convention goes.
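
For comparison, the same pull/scrape idea can be sketched with nothing but the standard library. This is a Python 3 sketch using html.parser rather than BeautifulSoup, run against an inline sample (the example.org URLs are invented) instead of the live Webjay page. LinkCollector is a made-up class name, and the regular expression is the one from the snippet above, applied with search.

```python
import re
from html.parser import HTMLParser

MP3_HREF = re.compile(r'\..*mp3$')  # same pattern as the BeautifulSoup call


class LinkCollector(HTMLParser):
    """Collects href values of <a> start tags that match a pattern."""

    def __init__(self, pattern):
        super().__init__()
        self.pattern = pattern
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and self.pattern.search(value):
                self.links.append(value)


# Deliberately sloppy markup: unclosed tags, mixed quoting.
SAMPLE = """<html><body>
<p><a href="http://example.org/music/track1.mp3">One</a>
<a href="about.html">About
<a href='http://example.org/music/track2.mp3'>Two</a>
</body>"""

collector = LinkCollector(MP3_HREF)
collector.feed(SAMPLE)
collector.close()
for href in collector.links:
    print(href)
```

It is more typing than the BeautifulSoup one-liner, which is rather the point of having the Soups around.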

In a preëmptive move, I'll mention Danny's own brand of soup, psoup. Maybe I'll have some time to give that a whirl, soon.

It's good to have alternatives, especially when dealing with madness on the order of our Web of tag soup.

And BTW, for the non-hip-hop headz, the title quote is by the female player in the old Positive K hit "I Got a Man" ("What's your man gotta do with me?..."):

I gotta ask you a question, troop:
Are you a chef? 'Cause you keep feeding me soup.

Hmm. Does that count as a Quotīdiē?

[Uche Ogbuji]


Versa: Pattern Matching (article)

My Versa article (Versa: Path-Based RDF Query Language) is up, but I've recently been tinkering with Emeka and haven't been able to post about it. I wanted Emeka functional so people could familiarize themselves with the language by example instead of by deciphering the specification. Simply saying ".help" in a channel where he is located (#swig, #swhack, #4suite, #foaf) should be sufficient. If his commands interfere with an existing bot's, please let me know.

The article is based (in part) on an earlier paper I wrote on Versa. I reworked it to focus more on the use patterns in common with other existing query languages (SPARQL primarily) to make the point that RDF querying is truly not in its infancy any more. I also wanted to use it as a springboard to suggest some possible enhancements to an already (IMHO) expressive syntax (mostly borrowed from N3).

My hope is to spark some conversation across the opposing ends as well as get people familiar with the language for the betterment of RDF and RDF querying.

See an exchange between Dave Beckett and myself on the #swig scratchpad.

[Uche Ogbuji]
