4Suite and Rdflib - Advancing the State of The Art in Python / RDF

A few days ago I checked in a 4RDF Driver that wraps the rdflib persistence API. 4RDF has a standard API for abstracting the persistence of RDF that sits directly below the Model interface and allows an author to implement a mechanism for persisting an RDF graph in whatever database, filesystem, etc.. they choose. rdflib has a similar seperation but the actual interfaces differ. Daniel Krech and I have been working rather diligently on formalizing a universal persistence API that allows an implementation to support a graduated set of features:

  • Core RDF Store
  • Context-aware RDF Store
  • Notation 3 RDF Store (Formula-aware RDF Store)

It's still a work in progress (with regards to Notation 3 persistence, mostly) but at least those interfaces/method signatures needed by a context-aware RDF store are well spec'd out.

I won't bore you with the details (and there are plenty - as we've covered alot of ground) but you can dive in here. This driver, which allows 4Suite to use rdflib for persistence of it's RDF data, is the first step an an agreed migration that will phase out 4RDF and replace it with rdflib, eventually. This module, at the very least, allows for the dispatching of Versa queries on an rdflib Graph, is the first step in allowing a 4Suite repository to it's RDF graphs in an rdflib store - I think there is alot of synthesis worth exploring with redfoot, provides rdflib with access to a rather voluminous 4RDF test suite, and demonstrates how existing applications that use 4RDF could be ported to use rdflib instead.

Outstanding / Possible Issues:

1) the current rdflib Graph interfaces do not account for RDF reification. These are somewhat covered by support for Notation 3 quoted/hypothetical contexts. The only visible difference is in the test cases that match by statementUri

2) This driver has only been tested against the MySQL implementation of an rdflib store. This is mainly because it's currently the only rdflib store implementation that supports matching arbitrary triple / statement terms by REGEX patterns and/or the production of quads instead of triples (i.e. the name of the context of each resulting statement in addition). This is only an issue for RDF stores that are at least context-aware, but an interface mismatch at most.

I plan to do some more experimentation on the possiblities that this synthesis provides (surprise, surprise). The timing is rather appropriate given the on-going development on the next generation Versa query specification, the concurrent effort to graduate the 4Suite code base to 1.0, and my recent pleasant surprise regarding Versa, Sparta, and rdflib.

If you are interested in helping or learning more about the roadmap, you can pay #redfoot (on irc.freenode.net) a visit. That's where Daniel Kreck and the other rdflib folks have been burning braincells as of late.

Chimezie Ogbuji

via Copia