Moving FuXi onto the Development Track

[by Chimezie Ogbuji]

I was recently prompted to consider updating FuXi to use the more recent CVS versions of both Pychinko and rdflib. In particular, I've been itching to get Pychinko working with the new rdflib API, which (as I've mentioned) has been updated significantly to support, amongst other things, Notation 3 persistence.

Currently, FuXi works with frozen versions of cwm, rdflib, and Pychinko.

I personally find it more effective to work with reasoning capabilities within the context of a query language than as a third-party software library. This was the original motivation for creating FuXi. Specifically, the process of adding inferred statements, dispatching a prospective query, and returning the knowledge base to its original state is a perfect compromise between classic backward and forward chaining.

It frees up both the query processor and persistence layer from the drudgery of logical inference – a daunting software requirement in its own right. Of course, the price paid in this case is the cumbersome software requirements.

It's well worth noting that such on-demand reasoning also provides a practical way to combat the scalability limitations of RDF persistence.
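Here is a minimal sketch of that add-infer-query-restore pattern in rdflib terms; a SPARQL query stands in for the Versa expression, and compute_inferences is a placeholder for the rule engine (Pychinko, in FuXi's case) rather than a real library call:

from rdflib import Graph

def query_with_inference(graph: Graph, query: str, compute_inferences):
    # Add only the inferred statements that are genuinely new, so that we can
    # remove exactly those afterwards and leave the graph as we found it.
    added = [t for t in compute_inferences(graph) if t not in graph]
    for triple in added:
        graph.add(triple)
    try:
        return list(graph.query(query))
    finally:
        for triple in added:
            graph.remove(triple)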

To these ends, I've updated FuXi to work with the current (CVS) versions of rdflib, 4Suite RDF, and Pychinko. It's essentially a rewrite and provides three major modules:

  • FuXi.py (the core component – a means to fire the pychinko interpreter with facts and rules from rdflib graphs)
  • AgentTools.py (provides utility functions for the parsing and scuttering of remote graphs)
  • VersaFuXiExtensions.py (defines Versa extension functions which provide scutter / reasoning capabilities)

Versa Functions:

reason(expr)

This function takes a Versa expression as a string and evaluates it after executing FuXi using any rules associated with the current graph (via a fuxi:ruleBase property). FuXi (and Pychinko, consequently) use the current graph (and any graphs associated by rdfs:isDefinedBy or rdfs:seeAlso) as the set of facts against which the rules are fired.
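As a rough illustration (this is not FuXi's actual code), the fact set described above can be assembled with rdflib by starting from the current graph and pulling in any graphs it references via rdfs:isDefinedBy or rdfs:seeAlso:

from rdflib import Graph, RDFS

def collect_facts(current: Graph) -> Graph:
    facts = Graph()
    facts += current  # start with the statements in the current graph
    associated = set(current.objects(None, RDFS.isDefinedBy)) | set(current.objects(None, RDFS.seeAlso))
    for uri in associated:
        facts.parse(str(uri))  # fetch and merge each associated graph
    return facts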

class(instances)

This function returns the class(es) – rdfs:Class or owl:Class – of the given list of resources. If the current graph has already been extended to include inferred statements (via the reason function, perhaps), it simply returns the objects of all rdf:type statements made against the resources. Otherwise, it registers, compiles, and fires a set of OWL/RDFS rules (a reasonable subset of owl-rules.n3 and rdfs-rules.n3 bundled with Euler) against the current graph (and any associated graphs) before matching classes to return.
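For a sense of what those rules do, here is a toy illustration (not the Euler rule files themselves, and not FuXi's code) of the most relevant RDFS entailment: propagate rdf:type statements up rdfs:subClassOf links until a fixed point is reached, after which the classes of a resource can simply be read off its rdf:type objects:

from rdflib import Graph, RDF, RDFS

def expand_types(graph: Graph) -> None:
    changed = True
    while changed:
        changed = False
        for instance, klass in list(graph.subject_objects(RDF.type)):
            for super_class in list(graph.objects(klass, RDFS.subClassOf)):
                if (instance, RDF.type, super_class) not in graph:
                    graph.add((instance, RDF.type, super_class))
                    changed = True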

type(klasses)

This essentially overrides the default 4Suite RDF implementation of this 'built-in' Versa function which attempts to apply RDFS entailment rules in brute force fashion. It behaves just like class with the exception that it returns instances of the given classes instead (essentially it performs the reverse operation).
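In rdflib terms (again, only a sketch of the idea rather than the 4Suite implementation), the reverse lookup amounts to reading off the subjects of the expanded rdf:type statements:

from rdflib import Graph, RDF

def instances_of(graph: Graph, klass):
    # Assumes the graph's rdf:type statements have already been expanded.
    return set(graph.subjects(RDF.type, klass))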

scutter(url,expr,steps=5)

This function attempts to apply some best practices in the interpretation of a network of remote RDF graphs. In particular, it uses content negotiation and Scutter principles to parse linked RDF graphs (expressed in either RDF/XML or Notation 3). The main use case for this function (and the primary motivation for writing it) is identity reasoning within a remotely hosted set of RDF graphs (FOAF smushing, for example).
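The following is a rough, hypothetical sketch of the idea (it is not the AgentTools implementation): fetch a graph with an Accept header covering RDF/XML and Notation 3, parse according to the returned content type, and follow rdfs:seeAlso links for a bounded number of steps:

from urllib.request import Request, urlopen
from rdflib import Graph, RDFS

def scutter(url, steps=5):
    graph, queue, seen = Graph(), [url], set()
    while queue and len(seen) < steps:  # "steps" bounds the number of graphs fetched
        target = queue.pop(0)
        if target in seen:
            continue
        seen.add(target)
        request = Request(target, headers={"Accept": "application/rdf+xml, text/n3;q=0.9"})
        with urlopen(request) as response:
            content_type = response.headers.get("Content-Type") or ""
            data = response.read()
        graph.parse(data=data, format="n3" if "n3" in content_type else "xml")
        queue.extend(str(obj) for obj in graph.objects(None, RDFS.seeAlso))
    return graph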

The FuXi software bundle includes a short ontology documenting the two RDF terms: one is used to manage the automated association of a rule base with a graph and the other identifies a graph that has been expanded by inference.

I have yet to write documentation, so this piece essentially attempts to serve that purpose. However, the bundle does include unit test cases for each of the above functions, which work off a small set of initial facts.

Unfortunately, a majority of the aforementioned software requirement liability has to do with Pychinko's reliance on the SWAP code base. Initially, I began looking for a functional subset to bundle but later decided it was against the spirit of the combined body of work. So, until a better solution surfaces, the SWAP code can be checked out from CVS like so (taken from ):

$ cvs -d:pserver:anonymous@dev.w3.org:/sources/public login
password? anonymous
$ cvs -d:pserver:anonymous@dev.w3.org:/sources/public get 2000/10/swap

The latest 4Suite CVS snapshot can be downloaded from ftp://ftp.4suite.org/pub/cvs-snapshots/4Suite-CVS.tar.gz,
Pychinko can be retrieved from the Mindswap svn repository, and rdflib can also be retrieved from its svn repository.

Chimezie Ogbuji

via Copia

Thinking XML #34: Search engine enhancement using the XML WordNet server system

Updated—Fixed link to "Serving up WordNet as XML"

"Thinking XML: Search engine enhancement using the XML WordNet server system"

Subtitle: Also, use XSLT to create an RDF/XML representation of the WordNet data
Synopsis: In previous installments of this column, Uche Ogbuji introduced the WordNet natural language database and showed how to represent database nodes as XML and serve this XML through the Web. In this article, he shows how to convert this XML to an RDF representation and how to use the WordNet XML server to enrich search engine technology.

This is the final part of a mini-series within the column. The previous articles are:

In this article I write my own flavor of RDF schema for WordNet, a transform for conversion from the XML format presented previously, and a little demo app that shows how you can use WordNet to enhance search with synonym capabilities (and this time it's a much faster approach).

I hope to publicly host the WordNet server I've developed in this series once I get my home page's CherryPy setup updated for 2.2.

See other articles in the column. Comments here on Copia or on the column's official discussion forum. Next up in Thinking XML, RDF equivalents for the WordNet/XML.

[Uche Ogbuji]

via Copia

4Suite XML 1.0b3

I posted the 4Suite XML 1.0b3 announcement today. This was supposed to be 1.0rc1 but then Jeremy went and added this little feature. Yeah, 4Suite now has full DTD validation, written in C. Just use the ValidatingReader. PyXML is no longer necessary for any 4Suite feature. I just need to figure out whether Jeremy ever sleeps. I hope to move quickly on a 1.0rc1. Perhaps even in January. We'll see.

I've updated my on-line manual.

[Uche Ogbuji]

via Copia

Getting Some Mileage out of Semantic Works

Well, I recently had a need to write up an OWL ontology describing the components of a 4Suite repository configuration file (which is expressed as an RDF graph, hence the use of OWL to formalize the format). There has been some mention (with regard to the long-term roadmap of the 4Suite repository component) of the possibility of moving to a pure XML format.

Anyway, below is a diagram of the model produced by Semantic Works. I still think Protege and SWOOP provide much more bang for your buck (when you consider that they are free and Semantic Works isn't) and produce much more concise OWL/RDFS XML. But the ability to produce diagrams of this quality from a complex OWL ontology is definitely a plus.

Semantic Works Diagram of 4Suite Repository Configuration Ontology

[Chimezie Ogbuji]

via Copia

CVS log since tag?

My usual trick for creating a "What's changed" summary in my projects is to check CVS for commits since the previous release. So if the previous release was on 24 October 2005, I run

cvs log -NSd ">2005/10/24"

It would be nice if I could do the same thing while specifying the last revision, rather than a date. I wish I could do:

cvs log -NSr<last-rev>::HEAD

but that seems to work only for numerical revisions rather than tags. Does anyone know of any neat hacks to achieve this? Note: if you prefer to advocate Subversion, that's OK, but at least be sure to specify the precise command to do this with SVN so that others can benefit from the example.

Note: this is coming up for me now because I'm wrapping up the packaging for 4Suite 1.0b3 release. One huge new feature: Full DTD support for all the parsers (written in C by the indefatigable Jeremy). One big fix: build support for 64 bit Intel architecture machines.

[Uche Ogbuji]

via Copia

XSLT l10n in 2 tags? 20 says the gentleman over there? Ah, the lady in green has 200.

I was pretty well ROTFL after reading this Daily WTF (via XSLT Blog). OK, so that code isn't even really doing l10n. I'm not sure what the coder thinks it's doing. It's a complete exercise in useless cut and paste. But it's worth noting that you can do the task of competent l10n in a tenth of the tag load used in the WTF example (see Docbook XSL), and you can even do it using a tenth of the tag load used in Docbook, if you don't mind using an XSLT extension module.

[Uche Ogbuji]

via Copia

Programmatic Access to Repository (SOAP/FtRPC)

This is meant to be a follow-up to my last entry, covering the programmatic remote protocols supported by the 4Suite repository.

FtRPC

The 4Suite repository supports an internal RPC protocol with a Python implementation that provides programmatic access to the repository. The automated 4Suite build process has recently changed significantly, so you can browse the Python API documentation (courtesy of John L Clark's build). Each repository can serve its own instance (the default port is 8803).

SOAP

The 4Suite repository supports a SOAP mapping that essentially attempts to serve as a translation mechanism between SOAP and the internal repository API. A repository instance can manage a SOAP server instance (the default port is 8090).

SOAP Service Namespace

The namespace URI associated with the SOAP service is:

http://xmlns.4suite.org/reserved#services

Authentication

Each SOAP message can have authentication information in the SOAP Header. The format is:

<SOAP-ENV:Header>
  <ftsoap:authenticationHeader>
    <ftsoap:sessionId>..</ftsoap:sessionId>
    <ftsoap:sessionKey>..</ftsoap:sessionKey>
    <ftsoap:authenticatingUser>.. user ..</ftsoap:authenticatingUser>
    <ftsoap:authenticatingPassword></ftsoap:authenticatingPassword>
  </ftsoap:authenticationHeader>
</SOAP-ENV:Header>

where the SOAP-ENV prefix is bound to the SOAP envelope namespace:

http://schemas.xmlsoap.org/soap/envelope/

Session authentication is supported with the first two header entries. The other two are for simple/basic authentication (very similar to the HTTP basic authentication scenario).

Message-to-Repo API Mapping

SOAP messages are invoked against repository resources. The local name of the SOAP-ENV:Body child element (in the ftsoap namespace) is mapped to the name of the method to invoke, and its child elements are mapped to the method's parameters. There are certain special parameters:

  • scrpath (the repository path of the resource to execute the method against)
  • base64 (a boolean value indicating whether or not the content is Base 64 encoded – false by default)
  • src (the content – transmitted as pure text or Base 64 encoded)
  • updateSrc (used by the xUpdate method as the XUpdate document – transmitted as pure text or Base 64 encoded)

The method is invoked with one of the following as the response:

  • ftsoap:successResponse (if there is nothing returned)
  • ftsoap:valueResponse (the value returned – its string representation)
  • ftsoap:Resource (if a resource itself is returned)
  • SOAP-ENV:Fault (includes Base 64 encoded traceback string)

ftsoap:Resource diagram
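Putting the pieces together, a request against the xUpdate method mentioned above might look like the following. This is only a hypothetical sketch: the repository path, credentials, and XUpdate payload are made-up values, and binding the ftsoap prefix to the service namespace given earlier is an assumption on my part.

from urllib.request import Request, urlopen

envelope = """<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:ftsoap="http://xmlns.4suite.org/reserved#services">
  <SOAP-ENV:Header>
    <ftsoap:authenticationHeader>
      <ftsoap:authenticatingUser>manager</ftsoap:authenticatingUser>
      <ftsoap:authenticatingPassword>secret</ftsoap:authenticatingPassword>
    </ftsoap:authenticationHeader>
  </SOAP-ENV:Header>
  <SOAP-ENV:Body>
    <ftsoap:xUpdate>
      <ftsoap:scrpath>/docs/example.xml</ftsoap:scrpath>
      <ftsoap:base64>false</ftsoap:base64>
      <ftsoap:updateSrc>&lt;xupdate:modifications ... /&gt;</ftsoap:updateSrc>
    </ftsoap:xUpdate>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""

# Post to the default SOAP port (8090) of a locally running repository instance.
request = Request(
    "http://localhost:8090/",
    data=envelope.encode("utf-8"),
    headers={"Content-Type": "text/xml; charset=utf-8"},
)
print(urlopen(request).read().decode("utf-8"))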

[Uche Ogbuji]

via Copia

Amara API quick reference, and Windows packages

I forgot to mention in the Amara 1.1.6 announcement that I drafted an API quick reference. I've put a link to it on the Amara home page.

I've also added a Windows installer created by Sylvain Hellegouarch, with some help from Jeremy Kloth. It's an installer for Amara "allinone", so all you need is Python 2.4 for Windows installed; run this installer and you should be all set.

[Uche Ogbuji]

via Copia