N3 Deserialization in 4RDF (and other possibilities)

Motivated by the idea that 4RDF (and 4Suite) could benefit greatly from being able to parse Notation 3 documents (see bottom), I attempted to write an N3 Deserializer for 4RDF that makes use of Sean B. Palmer's n3processor.

Ft.Rdf.Serializers.N3

The module simply needs to be added to 4RDF's Ft/Rdf/Serializers directory. I hesitate to check it in, since 4Suite is now in a feature-frozen beta release cycle. It implements a sink which captures the generated triples and adds them to a 4RDF model:

class FtRDFSink:
  """
  An n3proc sink that captures the statements produced by
  processing an N3 document
  """
  def __init__(self, scope, model):
     self.stmtTuples = []
     self.scope = scope
     self.model = model
     self.bnodes = {}
     self.resources = {}

  def start(self, root):
     self.root = root

  def statement(self, s, p, o, f):
     #First check if the subject is a bnode (via n3proc convention)
     #If so, use 4RDF's bnode convention instead
     #Use self.bnodes as a map from n3proc bnode uris -> 4RDF bnode urns
     if s[:2] == '_:':
        if s in self.bnodes:
           s = self.bnodes[s]
        else:
           newBNode = self.model.generateBnode()
           self.bnodes[s] = newBNode
           s = newBNode

     #Make the same check for the statement's object
     if o[:2] == '_:':
        if o in self.bnodes:
           o = self.bnodes[o]
        else:
           newBNode = self.model.generateBnode()
           self.bnodes[o] = newBNode
           o = newBNode

     #Mark the statement's subject as a resource (used later for objectType)
     self.resources[s] = None

     if f == self.root:
        #Regular, in scope statement
        stmt=(s,p,o,self.scope)
        self.stmtTuples.append(stmt)
     else:
        #Special case
        #This is where the features of N3 beyond standard RDF can be harvested
        #In particular, a statement with a different scope / context than
        #that of the containing N3 document is a 'hypothetical statement'
        #Such statements are mostly used to specify implication via log:implies
        #Such implication rules can be persisted (by flattening the formulae)
        #and later interpreted by a backward-chaining inference process
        #triggered from Versa or from within the 4RDF Model retrieval interfaces
        #Formulae are assigned a bnode uri by n3proc which needs to be mapped
        #to a 4RDF bnode urn
        if f in self.bnodes:
           f = self.bnodes[f]
        else:
           newBNode = self.model.generateBnode()
           self.bnodes[f] = newBNode
           f = newBNode

        self.resources[f] = None

        self.flatten(s, p, o, f)

  def flatten(self, s, p, o, f):
     """
     Adds a 'reified' hypothetical statement (associated with the formula f)
     N3R is a namespace object defined or imported elsewhere in the module (not shown here)
     """
     fs = self.model.generateUri()
     self.stmtTuples.append((f,
                             N3R.statement,
                             fs,
                             self.scope))
     self.stmtTuples.append((fs,
                             N3R.subject,
                             s,
                             self.scope))
     self.stmtTuples.append((fs,
                             N3R.predicate,
                             p,
                             self.scope))
     self.stmtTuples.append((fs,
                             N3R.object,
                             o,
                             self.scope))
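
To make the flattening step concrete, here is a minimal, self-contained sketch of the four tuples that flatten produces for one hypothetical statement. StubModel, the example URNs, and the N3R_NS placeholder namespace are assumptions made purely for illustration; the real N3R terms and the 4RDF model are not reproduced here.

```python
# Placeholder namespace: the real N3R URIs are not shown in this post.
N3R_NS = "http://example.org/n3r#"


class StubModel:
    """Hypothetical stand-in for a 4RDF model; it only mints identifiers."""

    def __init__(self):
        self._count = 0

    def generateUri(self):
        self._count += 1
        return "urn:example:stmt-%d" % self._count


def flatten(model, s, p, o, f, scope):
    """Return the four tuples that reify (s, p, o) under the formula f."""
    fs = model.generateUri()  # identifier for the reified statement node
    return [
        (f, N3R_NS + "statement", fs, scope),
        (fs, N3R_NS + "subject", s, scope),
        (fs, N3R_NS + "predicate", p, scope),
        (fs, N3R_NS + "object", o, scope),
    ]


tuples = flatten(
    StubModel(),
    "urn:example:a",        # subject
    "urn:example:p",        # predicate
    "urn:example:b",        # object
    "urn:example:formula",  # formula the statement belongs to
    "urn:example:doc",      # scope of the containing document
)
for t in tuples:
    print(t)
```

Each hypothetical statement therefore costs four ordinary statements in the document's scope, all hanging off one freshly generated statement node.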

In addition, I made a patch to the 4RDF command that adds 'n3' as an input format. See my previous blog entry for an example of using this command to generate diagrams of 4RDF graphs.

For example, this diagram of rdfs-rules was rendered via the 4rdf command line (patched to be able to deserialize N3 documents).

Advantages

First, deserializing N3 will almost always be faster than deserializing from RDF/XML (especially for larger graphs), since it is a plain-text parse rather than an XML parse. So, if 4Suite repository XSLT Document Definitions are augmented to be able to deserialize into the model via N3, repository operations on documents with such a Document Definition will be significantly faster.

Finally, deserializing SWAP constructs such as log:implies, formula reification, and existential and universal variables allows reasoners capable of interpreting N3 rule semantics (such as Sean's pyrple Graph class - see a demonstration of its inference capabilities) to perform inference externally, without having to build it into 4RDF or Versa. Such a reasoner can work on a 4RDF store containing RDF deserialized from N3 documents with appropriate rules.
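
As a rough illustration of how an external reasoner might consume the flattened form, the sketch below regroups reified tuples into a list of statements per formula. The N3R_NS placeholder namespace, the unflatten function, and the sample identifiers are all hypothetical; none of them are part of 4RDF or pyrple.

```python
from collections import defaultdict

N3R_NS = "http://example.org/n3r#"  # placeholder, not the real N3R namespace


def unflatten(stmt_tuples):
    """Group reified (s, p, o, scope) tuples into {formula: [(s, p, o), ...]}."""
    parts = defaultdict(dict)    # statement node -> {"subject": ..., ...}
    members = defaultdict(list)  # formula -> [statement node, ...]
    for s, p, o, _scope in stmt_tuples:
        if p == N3R_NS + "statement":
            members[s].append(o)
        elif p.startswith(N3R_NS):
            parts[s][p[len(N3R_NS):]] = o
    # Assumes every statement node carries all three of subject/predicate/object.
    return {
        f: [(parts[n]["subject"], parts[n]["predicate"], parts[n]["object"])
            for n in nodes]
        for f, nodes in members.items()
    }


flat = [
    ("_:f1", N3R_NS + "statement", "urn:example:st1", "urn:example:doc"),
    ("urn:example:st1", N3R_NS + "subject", "urn:example:a", "urn:example:doc"),
    ("urn:example:st1", N3R_NS + "predicate", "urn:example:p", "urn:example:doc"),
    ("urn:example:st1", N3R_NS + "object", "urn:example:b", "urn:example:doc"),
]
formulae = unflatten(flat)
print(formulae)
```

A real reasoner would additionally need to find the log:implies statements that link one formula to another; that part is left out here.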

One thing to note about this implementation is that the default baseUri of N3 documents is http://nowhere when the specified scope is a URN (since the n3processor is unable to handle URNs). Otherwise, the given scope is used as the baseUri.
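
The baseUri rule can be sketched in a few lines. The function name choose_base_uri and the prefix check used to detect a URN are my own assumptions for illustration; the module itself may make this decision differently.

```python
def choose_base_uri(scope, fallback="http://nowhere"):
    """Pick the baseUri handed to the N3 processor for a given scope."""
    # URN scopes cannot be used as a base, so substitute the fallback.
    if scope.lower().startswith("urn:"):
        return fallback
    return scope


print(choose_base_uri("urn:uuid:1234"))
print(choose_base_uri("http://example.org/rules.n3"))
```

In practice this means relative URI references in an N3 document deserialized into a URN scope resolve against http://nowhere rather than against the scope.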

[Chimezie Ogbuji]

via Copia