New shopping cart features for Sun.com

Weblogging has been pretty thin for me lately, as has everything else. For the past few months I've been working on a large XML-driven integration project at Sun Microsystems. I consult as a data architect to the group that drives the main www.sun.com Web site, as well as product marketing pages and other data-driven venues. That's been a large part of my day job for the last four years, and in the most recent project Sun has been working a versatile new e-commerce engine into the site. They put a lot of care into the analysis for integrating this engine into existing product pages, so I found myself waist deep in XML pipeline architecture and data flows from numerous source systems (some XML, some ERP, some CMS and every other TLA you can fathom). The XML pipeline aggregates the sources according to a formal data model, and the result feeds normalized XML data into the commerce back end. A veritable enterprise mash-up. It's been a lot of work, leavened by collaboration with a top-notch team, and with the launch last week of the new system I've found palpable reward.
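
If you'll forgive a purely illustrative sketch (none of Sun's actual code, schemas or system names; every identifier below is invented), the heart of such a pipeline stage is mapping each source's local vocabulary onto the formal data model:

import xml.etree.ElementTree as ET

def normalize(sources):
    """Merge per-source product feeds into one normalized catalog document."""
    catalog = ET.Element("catalog")
    for system, feed in sources.items():
        for rec in ET.fromstring(feed).iter("product"):
            product = ET.SubElement(catalog, "product",
                                    id=rec.get("id"), source=system)
            # Map the source's local field names onto the formal model
            ET.SubElement(product, "name").text = rec.findtext("name")
    return ET.tostring(catalog, encoding="unicode")

feeds = {"erp": "<feed><product id='p1'><name>Server X</name></product></feed>"}
print(normalize(feeds))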

Web Usability guru Martin Hardee, whose team put together the stringent design parameters for the project, mentioned the new feature this week.

We're already off building on this success, and it's more enterprise-grade (yeah, buzzword, sue me) XML modeling and pipeline-driven architecture with a global flavor for a good while to come, I expect. And probably not all that much time for Weblogging.

[Uche Ogbuji]

via Copia

Today's (non-XML) WTF, or High stakes in the SOA sweeps

So here's InfoWorld's daily nugget of wisdom, posed as a provocation: "Should you fire your enterprise architect in 2007? Take the test."

The largest and most disturbing issue ... is the fact that there seems to be a huge chasm between the traditional enterprise architecture crowd, and those looking at the value of SOA. Indeed, enterprise architecture, as a notion, has morphed from an approach for the betterment of corporate IT to a management practice, at least for some. Thus, the person that is needed to understand and implement the value of SOA is sometimes not the current enterprise architect in charge. -- David Linthicum.

So the SOA wars are heating up. More and more smart people are pointing out that the emperor has no clothes; but stakes is still crazy high. Some folks haven't yet made all their money from SOA. So how do the stakeholders respond? With cold-blooded threats.

"So your architect isn't all bought into SOA, eh? Well fire him, dammit."

And oh, isn't it delicious irony that this dude is claiming it's the experienced architects cautious on SOA who are establishing a pet management practice within IT. Oh, there's no way the SOA sellers could be guilty of that. Noooo. Never. Never. Never. Neeeever!

[Uche Ogbuji]

via Copia

Thinking “Thinking XML: The XML decade”

I've been very pleased at the response to my article “Thinking XML: The XML decade”. I like to write two types of articles: straightforward expositions of running code, and pieces providing more general perspective on some aspect of technology. I've felt some pressure in the publishing world lately to dispense with some of the latter in favor of puff pieces for the hottest technology fad; right now, having "SOA", "AJAX" or "podcast" in the title is the formula for selling a book or article. I've been lucky throughout my career to build relationships with editors who trust my judgment, even if my writing is so often out of the mainstream. As such, whenever an article of no obvious mainstream appeal touches a chord and provokes intelligent response, I welcome it as some more ammunition for following the road less trampled in my future writing.

Karl Dubost of the W3C especially appreciated my effort to bring perspective to the inevitable tensions that emerged as XML took hold in various communities.

Uche Ogbuji has written an excellent article The XML Decade. The author is going through XML development history as well as tensions between technological choices in XML communities. One of the fundamental choices of creating XML was to remove the data jail created by some application or programming languages.

Mike Champion, with whom I've often jousted (we have very different ideas of what's practical), was very kind in "People are reflecting on XML after 10 years".

A more balanced assessment of the special [IBM Systems Journal issue] is from Uche Ogbuji. There is quite a nice summary of the very different points of view about what XML is good for and how it can be used. He reminds us that today's blog debates about simplicity/complexity, tight/loose coupling, static/dynamic typing, etc. reflect debates that go back to the very beginning. I particularly like his pushback on one article's assertion that XML leverages the value of "information hiding" in OO design.

It was a really big leap for me from OO (and structured programming/abstract data type) orthodoxy to embracing XML's open data philosophy (more on that in “Objects. Encapsulation. XML?”). It did help that I'd watched application interface wars from numerous angles in my career: RPC granularity, mixins and implementation versus interface inheritance, SQL/C interface coupling, etc. It started to become apparent to me that something was wrong when we were translating natural business domain problems into such patently artificial forms. RDBMS purists have been making a similar point for ages, but in my opinion they just want to replace N-tier application artifice with their own brand of artifice. XML is far from some magic mix for natural expression of the business domain, but I believe that XML artifacts tend to be fundamentally more transparent than other approaches to computing. In my experience, it's easier to maintain even a poorly-designed XML-driven system than a well-designed system where the programming is preeminent.

Mike's entry goes on to analyze the usefulness, in perspective, of the 10 guiding principles for XML. When he speaks of "a more balanced view" he's contrasting the Slashdot thread on the article, which is mostly filled with the sort of half-educated nonsense that drove me from that site a few years ago (these days I find the most respectable discussion on reddit.com). Poor Liam Quin spent an awful lot of patience on a gang of inveterate flame-throwers. Besides his calm explanations, the best bits in that thread were on HL7 and ASN.1.

Supposedly the new version 3 [HL7] standard (which uses the "modeling approach") will be much more firm with the implementors, which will hopefully mean that every now and then one implementation will actually be compatible with another implementation. I've looked over their "models" and they've modelled a lot of the business use-case stuff for patient data, but not a lot of the actual data itself. Hopefully when it's done, it'll come out a bit better baked than previous versions.

That does not sound encouraging. I know Chimezie, my brother and Copia co-conspirator, is doing some really exciting work on patient data records. More RDF than XML, but I know he has a good head for the structured/unstructured data continuum, so I hope his work propagates more widely than just the Cleveland Clinic Foundation.

J. Andrew Rogers made the point that Simon St. Laurent and I (among others) have made about the many cases where people misuse XML rather than something more suitable, such as ASN.1.

The "slow processing" is caused by more than taking a lot of space. XML is basically a document markup but is frequently and regular used as a wire protocol, which has very different design requirements if you want a good standard. And in fact we already have a good standard for this kind of thing called "ASN.1", which was actually engineered to be extremely efficient as a wire protocol standard. (There is also an ITU standard for encoding XML as ASN.1 called XER, which solves many of the performance problems.)

Of course, I think he goes a bit too far.

The only real advantage XML has is that it is (sort of) human readable. Raw TLV formatted documents are a bit opaque, but they can be trivially converted into an XML-like format with no loss (and back) without giving software parsers headaches. There is buckets of irony that the deficiencies of XML are being fixed by essentially converting it to ASN.1 style formats so that machines can parse them with maximum efficiency. Yet another case of computer science history repeating itself. XML is not useful for much more than a presentation layer, and the fact that it is often treated as far more is ridiculous.

I'd actually argue that XML is suited for a (semi-structured) model layer, not a presentation layer. For one thing, wire efficiency often counts in presentation as well. But his essential point is correct that XML is an awful substitute for ASN.1 as a wire protocol. By the same token, the Web services stack is an awful substitute for CORBA/OMA and even Microsoft's answers to same. It seems the industry is slowly beginning to realize this. I love all the many articles I see with titles such as "We know your SOA investment is stuck firmly in the toilet, but honest, believe us, there is an effective way to use this stuff. Really."

Anyway, later on in that sub-thread:

The company I work for has had a lot of success with XML, and are planning to move the internal data structure for our application from maps to XML. There is one simple reason for our sucess with it: XSLT. A customer asks for output in a specific format? Write a template. Want to display the data on a web page? Write a template that converts to HTML. Want to print to PDF? Write a template that converts to XSL, and use one of many available XSL->PDF processors. Want to use PDF forms to input data? Write a template to convert XFDF to our format. Want to import data from a competitor and steal their customer? You get the picture.

Bingo! The secret to XML's value is transformation. Pass it on.
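
To illustrate the commenter's point (my own invented example, not their code), the entire "write a template" workflow fits in a few lines of Python with lxml:

from lxml import etree

# One source document, one small XSLT template, HTML out
order = etree.XML("<order><item name='widget' qty='3'/></order>")

to_html = etree.XSLT(etree.XML("""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/order">
    <ul><xsl:apply-templates/></ul>
  </xsl:template>
  <xsl:template match="item">
    <li><xsl:value-of select="@qty"/> x <xsl:value-of select="@name"/></li>
  </xsl:template>
</xsl:stylesheet>
"""))

print(etree.tostring(to_html(order)).decode())  # <ul><li>3 x widget</li></ul>

Swap in a different template and the same source document comes out as XSL-FO for PDF, or whatever format the next customer asks for.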

In "XML at 10: Big or Little?" Mark writes:

What the article ultimately ends up being about is the "Big" idea of XML vs. the oftentimes "Little" implementation of it. The Big idea is that XML can be used in a bottom-up fashion to model the grammar of a particular problem domain in an application- and context-independent manner. The little implementation is when XML is essentially used as a more verbose protocol for data interchange between existing applications. I would guess that is 90+ percent of what it is currently used for.

He's right, and I think this overuse is one of the reasons XML so often elicits reflex hostility. Then again, anything that doesn't elicit such hostility in some quarters is, of course, entirely irrelevant. I think it's a good thing that cannot be said of XML.

[Uche Ogbuji]

via Copia

Serendip's 1920x1200 wallpapers

Maybe this is a big fat DUH! to everyone else, and it sure is to me in retrospect, but I'm not always able to find a nice diversity of fly wallpapers that suit my Dell Inspiron 8600's resolution of 1920x1200. Tonight I did a Google search because I'm looking for an LCD desktop monitor with that resolution. Lazily, I just Googled "1920x1200" and was greeted with three really snazzy hits from Google Images: one nature shot, one astronomical feature and one neat, abstract graphic. Clicking into Google Images revealed a ton of really nice wallpaper options. Some are on the seamy side (I Google with SafeSearch off) but it's all good.

Meanwhile, I understand that 1920x1200 (WUXGA) TFT manufacture leads to crazy yield problems, but it is a bit of a drag that WSXGA+ (1680x1050) LCDs can be had for under $300 whereas the next step up sets you back a minimum of $700. I'll have to figure out how high to ride the exponential price/pixel curve.

[Uche Ogbuji]

via Copia

“Thinking XML: The XML decade”

Subtitle: Thoughts on IBM Systems Journal's retrospective of XML at ten years (or so)
Synopsis: IBM Systems Journal recently published an issue dedicated to XML's 10th anniversary. It is primarily a collection of interesting papers for XML application techniques, but some of its articles offer general discussion of the technical, economic and even cultural effects of XML. There is a lot in these papers to draw from in thinking about why XML has been successful, and what it would take for XML to continue its success. This article expands on some of these topics that are especially relevant to readers of this column.

In this article I touch on points from what the XML community can learn from the COBOL boom of the 90s, to why it's dangerous to use XML as a basis for traditional application modeling systems. It's a bit of a gestalt approach to analyzing some of the key issues facing XML technology at this milestone, and what it might take to ensure XML is still relevant and valuable after another decade. And by that I mean valuable in itself, and not just as a legacy format with valuable data.

[Uche Ogbuji]

via Copia

Ubuntu Edgy, Firefox and Flash

One of the things I noticed upon upgrading to Edgy is that I'd lost Flash support in Firefox. I wish I could say that I had no need for Adobe/Macromedia's crap, but I can't. For one thing, I often make musical discoveries on MySpace pages--my musical interests tend towards the underground. I set about getting Flash working again this evening, and it was harder than I expected. Recently MySpace switched to Flash 9 (punks!) and I decided to use the version 9 beta bone that Adobe tossed to the Linux community. I found that the nonfree virtual Flash package had been removed, so I restored it:

sudo apt-get install flashplugin-nonfree

The plan would be to just copy the libflashplayer.so from Flash 9 beta over the installed version 7. BTW the Flash 7 file was 2MB and the Flash 9 Beta file is almost 7MB. Might be debugging info or something, but yeesh!

Anyway, even before copying the Flash 9 beta over, I found that Firefox was crashing every time it loaded a Flash site. After some hunting I found a bug with Ubuntu Edgy and either version of the Flash plugin. There are several workarounds mentioned in that thread, but one of them is simply to bump up the color depth to 24 bits. I hadn't even known Edgy had limited me to 16 bits, but sure enough, I checked my /etc/X11/xorg.conf:

Section "Screen"
    Identifier      "Default Screen"
    Device          "NVIDIA Corporation NV34M [GeForce FX Go 5200]"
    Monitor         "Generic Monitor"
    DefaultDepth    16

I changed that from 16 to 24, restarted X, and all has been well since then. Fingers crossed.

Meanwhile, speaking of music on MySpace, here's my brother's crew, StrawHat BentLow.

...your best advice is look for the hat...

See also: "On Huck, Hip Hop, and Expression"

[Uche Ogbuji]

via Copia

Progress on two Reference Implementations for RETE and GRDDL

Whew! During the moments I was able to spare while at ISWC (I was only there on Monday and Tuesday, unfortunately), I finished up two 'reference' implementations I've been enthusiastically hacking on quite recently. By reference implementation, I mean an implementation that attempts to follow a specification verbatim, as an exercise to get a better understanding of it.

I still need to write up the engaging conversations I had at ISWC (it was really an impressive conference, even from just the two days' worth I got to see), as well as the presentation I gave on using GRDDL and Web architecture to meet the requirements of Computer-based Patient Records.

FuXi and DLP

The first milestone was with FuXi, which I ended up rewriting completely based on some exchanges with Peter Lin and Charles Young.

This has probably been the most challenging piece of software I've ever written, and I was quite naive in the beginning about my level of understanding of the nuances of RETE. Anyone interested in the formal intersection of Notation 3 / RDF syntax and the RETE algorithm(s) will find the exchanges in the comments of the above post very instructive - and Peter Lin's blog in general. Though he and I have our differences on the value of mapping RETE to RDF/N3, his insights into my efforts have been very helpful.

In the process, I discovered Robert Doorenbos' PhD thesis "Production Matching for Large Learning Systems", which was incredibly valuable in giving me a comprehensive picture of how RETE (and RETE/UL) could be 'ported' to accommodate Notation 3 reasoning.
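
For anyone who hasn't dug into RETE: the essence is that facts are matched incrementally, with pattern nodes remembering what they've seen so joins never have to be recomputed from scratch. Here's a toy sketch of that idea applied to triples (in no way FuXi's actual code; the rule and vocabulary are invented):

def is_var(term):
    return isinstance(term, str) and term.startswith("?")

class AlphaNode:
    """Filters incoming triples against one pattern; remembers bindings."""
    def __init__(self, pattern):
        self.pattern = pattern
        self.memory = []  # list of variable-binding dicts

    def activate(self, triple):
        bindings = {}
        for p, t in zip(self.pattern, triple):
            if is_var(p):
                bindings[p] = t
            elif p != t:
                return None  # constant mismatch: triple filtered out
        self.memory.append(bindings)
        return bindings

def join(b1, b2):
    """Merge two binding sets if their shared variables agree."""
    merged = dict(b1)
    for var, val in b2.items():
        if merged.get(var, val) != val:
            return None
        merged[var] = val
    return merged

# Rule: (?C subClassOf ?D) and (?x type ?C) => (?x type ?D)
a1 = AlphaNode(("?C", "subClassOf", "?D"))
a2 = AlphaNode(("?x", "type", "?C"))

facts = [("Opera", "subClassOf", "MusicalWork"),
         ("DonGiovanni", "type", "Opera")]
inferred = []
for fact in facts:  # facts trickle in one at a time, RETE-style
    for node, other in ((a1, a2), (a2, a1)):
        b = node.activate(fact)
        if b is None:
            continue
        for prior in other.memory:
            m = join(b, prior)
            if m is not None:
                inferred.append((m["?x"], "type", m["?D"]))
print(inferred)  # [('DonGiovanni', 'type', 'MusicalWork')]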

The primary motivation (which has led to many sleepless nights and late-night hackery) is what I see as a lack of understanding, within the community of semantic web developers, of the limitations of Tableaux-based reasoning, and the common misconception that the major Tableaux-based reasoners (FaCT++, Racer, Pellet, etc.) represent the ceiling of DL reasoning capability.

The reality is that logic programming has been around the block much longer than DL and has much more mature algorithms available (the primary one being RETE). I've invested quite a bit of effort in what I believe will (eventually) demonstrate very large-scale DL reasoning performance that will put Tableaux-based reasoning to shame - if only to make it clear that more investigation into the intersection of LP and DL is crucial for making the dream of the SW a reality.

Of course, there is a limit to what aspects of DL can be expressed as LP rules (this subset is called Description Logic Programming). The 'original' DLP paper does well to explain this limitation, but I believe this subset represents the more commonly used portions of OWL, and the portions of OWL 1.0 (and OWL 1.1, for that matter) left out by such an intersection will not be missed.
Ivan Herman pointed me to a paper by Herman J. ter Horst which is quite comprehensive in outlining how this explicit intersection can be expressed axiomatically, along with the computational consequences of such an axiomatization. Ivan used this as a guide for his RDFSClosure module.
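
To give the flavor (my paraphrase of standard RDFS/OWL entailment rules, not a quotation from the paper), rules in this LP-expressible subset are plain Horn clauses over triple patterns, for example:

\[
(?C,\ \mathrm{rdfs{:}subClassOf},\ ?D) \land (?x,\ \mathrm{rdf{:}type},\ ?C) \rightarrow (?x,\ \mathrm{rdf{:}type},\ ?D)
\]
\[
(?P,\ \mathrm{rdf{:}type},\ \mathrm{owl{:}TransitiveProperty}) \land (?x,\ ?P,\ ?y) \land (?y,\ ?P,\ ?z) \rightarrow (?x,\ ?P,\ ?z)
\]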

Not enough (IMHO) has been done to explore this intersection because people are comfy with the confines of non-LP algorithms. The trail (currently somewhat cold) left by the Mindswap work on Pychinko needs to be picked up, followed and improved.

So, I rolled up my sleeves, dug deep and did my best to familiarize myself with the nuances of production system optimization. Most of the hard work has already been done, thanks to Doorenbos' subsetting (and extension) of the original Charles Forgy algorithm. FuXi gets through a large majority of the OWL tests using a ruleset that closely implements what ter Horst lays out in his paper, and does so with impressive times - even with more optimizations pending.

The most recent changes include a command-line interface for launching it:

chimezie@Zion:~/devel/Fuxi$ python Fuxi.py --out=n3 --ruleFacts
--ns=owl=http://www.w3.org/2002/07/owl#
--ns=test=http://metacognition.info/FuXi/DL-SHIOF-test.n3#
--rules=test/DL-SHIOF-test.n3
Time to build production rule (RDFLib): 0.0172629356384 seconds
Time to calculate closure on working memory: 224.906921387 m seconds

@prefix owl: <http://www.w3.org/2002/07/owl#>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix test: <http://metacognition.info/FuXi/DL-SHIOF-test.n3#>.

 test:Animal test:in _:XRfZIlKy56,
        _:XRfZIlKy57.

 test:Cosi_fan_tutte a test:DaPonteOperaOfMozart;
    test:in _:XRfZIlKy47,
        _:XRfZIlKy48,
        [].

 test:Don_Giovanni a test:DaPonteOperaOfMozart;
    test:in _:XRfZIlKy47,
        _:XRfZIlKy48.

 .... snip ...

FuXi still doesn't support 'built-ins' (or custom comparisons), but luckily the thesis includes a section on how to implement non-equality testing of rule constraints, which should be (relatively) easy to add. The thesis also includes a section on how negated conditions can be implemented (probably the most glaring axiom missing from DLP). Finally, Robert's paper includes a clever mechanism for hashing the alpha network that has yet to be implemented (and is simple enough to add) and should contribute significant performance gains.

There are other pleasant surprises in the current codebase. The rule compiler can be used to identify inefficiencies in rule patterns, the command-line program can be used to serialize the closure delta (i.e., only the triples inferred from the ruleset and facts), and (my favorite) a Notation 3 ruleset can be exported as a Graphviz diagram in order to visualize the rule network. Having 'browsed' various rulesets in this way, I must say it helps in understanding the nuances of optimization when you can see the discrimination network that the triples are propagated through.

I don't have a permanent home for FuXi yet, but have applied for a SourceForge project (especially since SF now supports SVN, apparently). So, until then, FuXi can be downloaded from:

GRDDL Client for 4Suite and RDFLib

During the same period, I've also been working on a 'reference' implementation of GRDDL (per the recently released Working Draft) for 4Suite and RDFLib. There's a bit of irony in doing so, since the 4Suite repository framework has essentially been using GRDDL-like content management mechanisms since its inception (sometime in 2001).
However, I thought doing so would be the perfect opportunity to:

  • Demonstrate how 4Suite can be used with RDFLib (as preparation for the pending deprecation of 4Suite RDF in favor of RDFLib)
  • Build a framework for composing illustrative test cases for the Working Group (of which I'm a member)
  • Completely familiarize myself with the various GRDDL mechanisms

I posted this to the Working Group mailing list and plan to continue working on it. In particular, the nice thing about the convergence of these two projects of mine is that I've been able to think a bit about how both GRDDL and FuXi could be used to implement efficient, opportunistic programs that extract RDF via GRDDL and explicit links (rdfs:seeAlso, owl:imports, rdfs:isDefinedBy) and perform incremental web closures by adding the triples discovered in this way, one at a time, to a live RETE network.

The RETE algorithm is specifically designed to absorb incremental changes to a production system efficiently, so crawling the web, extracting RDF (a triple at a time) and reasoning over it in this way (the ultimate semantic web scenario) becomes a very real scenario, with sub-second response times to boot. At the very least, reasoning should cease to be much of a bottleneck compared to the actual dereferencing and parsing of RDF from distributed locations. Very exciting prospects. Watch this space...
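
A rough sketch of the shape this could take (simplified, and not the reference implementation itself; network.add_fact below is a hypothetical stand-in for whatever incremental assertion call the RETE engine exposes):

from urllib.request import urlopen
from lxml import etree
from rdflib import Graph

XHTML = "{http://www.w3.org/1999/xhtml}"

def grddl_extract(url):
    """Yield RDF triples produced by a page's GRDDL transformation link."""
    doc = etree.parse(urlopen(url))
    for link in doc.iter(XHTML + "link"):
        if link.get("rel") == "transformation":
            # Fetch and apply the transform the page itself nominates
            xslt = etree.XSLT(etree.parse(urlopen(link.get("href"))))
            rdf_xml = etree.tostring(xslt(doc))
            yield from Graph().parse(data=rdf_xml, format="xml")

def incremental_closure(urls, network):
    for url in urls:
        for triple in grddl_extract(url):
            network.add_fact(triple)  # hypothetical RETE assertion API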

Chimezie Ogbuji

via Copia

“The professional architect, Part 1: How developers become architects”

Just published:

“The professional architect, Part 1: How developers become architects”

Synopsis: Many architects come from the ranks of good developers, but not every good developer wants to be an architect, nor are all of them suited for the role. Whether you're a developer contemplating a career shift or a manager looking for suitable candidates for an architectural responsibility, it's important to have a well-informed perspective on this transition. This article discusses the journey from implementation specialization to architecture.

The article is just a collection of my observations of developers and architects, as well as my own experiences in both roles. Here's the introductory section:

When you look for a good musical conductor, you start by looking for a good musician. But not every good musician makes a good conductor. The situation is similar in the professional development of architects. More and more IT organizations are understanding the importance of sound software architecture, and the architect profession is rapidly emerging as a separate discipline within IT. This presents some fresh challenges to management as they look to recruit architects from a fairly small labor pool.[...] The fastest way over these hurdles is to understand that most good architects are also good developers, so probably the first place to look for architect talent is among the ranks of regular developers.[...]

This article outlines what it takes for a developer to become an architect. I'll present the perspective of a developer who might be considering such a move, as well as that of a manager assessing developers for such a transition. I'll also provide a series of factors to consider when making these decisions.

I'll be writing a few more articles on the architect's profession for IBM developerWorks. A few years ago I wrote “A custom-fit career in app development”, an article discussing how developers can build a career without being enslaved to the mainstream of technology trends.

[Uche Ogbuji]

via Copia

Thank-you notes and informal debt tracking with IOU Note

A few of my friends have launched the beta of a Web site from which you can send informal notes to track debts (in cash, goods or deeds). Say a business colleague buys you lunch and you want to combine a thank-you note with an offer to reciprocate. IOU Note makes it easy to do so. Interesting idea, and I think they've executed it well, although I'm of course biased. I did send them a bunch of suggestions and I know they're furiously working to improve the service. It will be fun to watch their progress.

[Uche Ogbuji]

via Copia

Voting. Responsibility. Sucks.

So I carried mi rumpe to my local poll for the desultory vote. Politics are especially foul right now, and I just cannot energize myself for any vote. Never mind the constant campaign phone calls and relentless attack ads. I ignore those. (I did really enjoy NPR's clever take on the matter: John Jacob Jingleheimer Schmidt. His name is my name too. Whenever we go out. The people always shout: "Hey! What about Iraq?"). It's all more fundamental than that.

The number one political issue for me is the alarming grab of executive power by the Bush administration. I could talk at length about this one issue, but I'll save that for another time (others have expounded on this as well as I can, but the problem is that no one is really listening). My vote can't do very much to make a difference there. I can vote to bring about a congressional (but not Senate, this year) majority in opposition to Bush, but I don't know if that works in an atmosphere where politicians on both sides seem to be cowed by Bush's claims that not giving him a king's grip on liberty will threaten us all with annihilation at the hands of terrorists. Heck, even Ken Salazar, whom I generally like, and who is a Juris Doctor, and so should understand the implications, voted in favor of the landmark Military Commissions Act. Honestly, if Bush maneuvered to suspend presidential elections in 2008 in the name of national security, I'd be only mildly surprised. And I'm not confident we have an opposition party with the character to check such excess.

And party is the touchstone of discontent in U.S. politics. I presently lean Democratic, despite being in many ways what Americans call a "conservative" (and what everyone else calls a liberal). It's less of an ideological matter than a matter of being appalled at the conduct and policy of those in power right now. We desperately need a credible opposition to keep the brigands more honest. We have two truly bad political parties at present. Bad for different reasons. I don't know what worries me more, the prospect that Republicans might hold on to the House, or that Nancy Pelosi of all clumsy figures might become Speaker. I actually like Pelosi's voting record (there are some demerits, but there are such for most congress-peeps). She's not as left wing as her enemies make her out to be, but she is often far from her own most effective advocate, and I think she's as capable of hurting the opposition cause as her predecessor Gephardt. It's the same problem with Hillary Clinton, who is not as left wing as she is made out to be by the right wing, but who could nevertheless create a hazard for the key goal of changing the order of the executive branch so that some of the executive power grab might be reversed. Then again, I do wonder if even a Democrat, finding himself in the presidency, would have the integrity to shrink the government's power.

And who would that Democrat be? The only figure in that party I think I could get excited about as a presidential candidate is Barack Obama. On the other hand, there are two Republicans I would easily back for the presidency: John McCain and Colin Powell. I would have supported McCain over Gore in 2000 (not that I was then eligible to vote), despite the fact that I thought Clinton was our best president since Roosevelt. I think McCain would have been more likely to carry on that excellence than Gore. Yet McCain is lacking honor even in his own party because they don't consider him sufficiently right wing. Powell, of course, has done the sane if unfortunate thing in removing himself from the fray. I am sooooo not looking forward to 2008.

Meanwhile most of our ballot here in Colorado is dominated by constitutional amendments. People here just don't seem to get that a constitution is the foundation, not the edifice, of laws. Well, to be fair, they probably get that, but they also know that such ballot initiatives are the most effective way to run an end around the legislature. I pretty much struck everything down (except for a provision to remove obsolete clauses), even stuff I agree with in principle. If we want all these laws we should properly elect representatives who will enact them. I signed up for representative democracy, not mob rule.

Callooh! Callay! I voted. How fricking frabjous.

[Uche Ogbuji]

via Copia