Mapping Rete algorithm to FOL and then to RDF/N3

Well, I was hoping to hold off on discussing the ongoing work I've been doing with FuXi until I could get a decent test suite working, but I've been engaged in several (older) threads that left me wanting to elaborate a bit.

There is already a well-established precedent with Python/N3/RDF reasoners (Euler, CWM, and Pychinko). FuXi used to rely on Pychinko, but I decided to write a Rete implementation for N3/RDF from scratch - trying to leverage the host language idioms (hashing, mappings, containers, etc.) as much as possible in areas where they could make a difference in rule evaluation and compilation.

What I have so far is more Rete-based than a pure Rete implementation, but the difference comes mostly from the impedance between the representation components in the original algorithm (which are very influenced by FOL and Knowledge Representation in general) and those in the semantic web technology stack.

Often with RDF/OWL, there is more talk than necessary, so I'll get right to the meat of the semantic mapping I've been using. This assumes some familiarity with the original algorithm. Some references:

Tokens

The working memory of the network is fed by an N3 graph. Tokens represent the propagation of RDF triples (no variables or formulae identifiers) from the source graph through the rule network. These can represent token additions (where the triples are added to the graph - in which case the live network can be associated with a live RDF graph) and token removals (where triples are removed from the source graph). When a token passes an alpha node's intra-element test (see below), it is passed on with a substitution / mapping of variables in the pattern to the corresponding terms in the triple. This variable substitution is used to check for consistent variable bindings across beta nodes.
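As a rough illustration of the paragraph above, a token can be modelled as a ground triple, an add/remove flag, and the variable substitution picked up at an alpha node. The names here (Token, apply_token, bind) are hypothetical and are not FuXi's actual classes; terms are plain strings and variables are assumed to start with '?'.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Token:
    """Hypothetical token: a ground triple moving through the network."""
    triple: tuple          # (subject, predicate, object), no variables
    added: bool = True     # True for an addition, False for a removal
    bindings: tuple = ()   # sorted (variable, term) pairs from an alpha node

def apply_token(graph, token):
    """Mirror the token on a 'live' graph, modelled here as a set of triples."""
    if token.added:
        graph.add(token.triple)
    else:
        graph.discard(token.triple)
    return graph

def bind(token, pattern):
    """Attach the substitution for a pattern the triple is assumed to match."""
    pairs = {p: t for p, t in zip(pattern, token.triple) if p.startswith("?")}
    return Token(token.triple, token.added, tuple(sorted(pairs.items())))
```

Keeping the bindings as a sorted tuple makes tokens hashable, so in this sketch they could be stored in the sets or dictionaries that back node memories.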

ObjectType Nodes and Working Memory

ObjectType nodes can be considered equivalent to the test for concept subsumption (in Description Logics), and therefore equivalent to the alpha node RDF pattern:

?member rdf:type ?klass.

'Classic' Alpha node patterns (the one below is taken directly from the original paper) map to multiple RDF-triple alpha node patterns:

(Expression ^Op X ^Arg2 Y)

This would be equivalent to the following triple patterns (the multiplicative factor arises because RDF assertions are limited to binary predicates):

  • ?Obj rdf:type Expression
  • ?Obj Op ?X
  • ?Obj Arg2 ?Y
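The expansion above is mechanical enough to sketch in a few lines of Python. This is only an illustration of the mapping described here: expand_classic_pattern is a hypothetical helper, terms are plain strings, and every attribute value is treated as a variable, as in the example.

```python
def expand_classic_pattern(object_type, attributes, subject_var="?Obj"):
    """Turn a classic (Type ^attr Value ...) pattern into RDF triple patterns.

    One rdf:type pattern stands in for the ObjectType test, followed by
    one binary pattern per attribute.
    """
    patterns = [(subject_var, "rdf:type", object_type)]
    for attribute, value in attributes:
        patterns.append((subject_var, attribute, "?" + value))
    return patterns

patterns = expand_classic_pattern("Expression", [("Op", "X"), ("Arg2", "Y")])
```

A classic pattern with n attributes therefore yields n + 1 triple patterns, all sharing the same subject variable.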

Alpha Nodes

Alpha nodes correspond to patterns in rules, which can be

  1. Triple patterns in N3 rules
  2. N-ary functions.

Alpha node intra-element tests either use a 'default' mechanism for matching triple patterns or exhibit the behavior associated with a registered set of N-ary functions. The core set coincides with those used by CWM/Euler/Pychinko (often called N3 built-ins). FuXi will support an extension mechanism for registering additional N-ary N3 functions by name, associating each with a Python function that implements the constraint, in a similar fashion to SPARQL extension functions. N-ary functions are automatically propagated through the network so they can participate in Beta node activation (inter-element testing in Rete parlance) alongside regular triple patterns, using the bindings to determine the arguments for the functions.

The default mechanism is for equality of non-variable terms (URIs).
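A minimal sketch of both kinds of test follows, assuming terms are plain Python values and variables are strings starting with '?'. The BUILTINS registry and the function names are hypothetical and do not reflect FuXi's extension API; math:greaterThan is used only as a familiar example of an N3 built-in.

```python
import operator

# Hypothetical registry: built-in name -> Python function implementing it.
BUILTINS = {"math:greaterThan": operator.gt}

def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def alpha_test(pattern, triple):
    """Default intra-element test: non-variable terms must be equal.

    Returns the variable bindings on success and None on failure.
    """
    bindings = {}
    for p, t in zip(pattern, triple):
        if is_var(p):
            # A variable repeated within one pattern must bind consistently.
            if bindings.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return bindings

def builtin_test(pattern, bindings):
    """Evaluate a registered binary function, taking arguments from bindings."""
    subj, pred, obj = pattern
    args = [bindings[a] if is_var(a) else a for a in (subj, obj)]
    return BUILTINS[pred](*args)
```

In this sketch a built-in pattern cannot be evaluated until its variables have been bound by some triple pattern, which is why such functions need to take part in the later join step.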

Beta Nodes

Beta nodes are pretty much verbatim Rete, in that they check for consistent variable substitution between their left and right memories. This can be considered similar to the unification routine common to both forward- and backward-chaining algorithms that make use of the Generalized Modus Ponens rule. The difference is that the sentences aren't being made to look the same; instead, the existing variable substitutions are checked for consistency. Perhaps there is some merit in this similarity that would make using a Rete network to facilitate backward chaining and proof generation an interesting possibility, but that has yet to be seen.
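The consistency check can be sketched as a join over two lists of substitutions, each a dictionary from variable to term. This is a naive nested-loop illustration with hypothetical names; it says nothing about how FuXi indexes its memories.

```python
def consistent(left, right):
    """True if every variable the two substitutions share has the same term."""
    return all(right[var] == term for var, term in left.items() if var in right)

def beta_join(left_memory, right_memory):
    """Pair up substitutions from both memories that agree on shared variables."""
    joined = []
    for left in left_memory:
        for right in right_memory:
            if consistent(left, right):
                merged = dict(left)
                merged.update(right)
                joined.append(merged)
    return joined

left_memory = [{"?x": "ex:a", "?y": "ex:b"}]
right_memory = [{"?y": "ex:b", "?z": "ex:c"}, {"?y": "ex:q", "?z": "ex:d"}]
joined = beta_join(left_memory, right_memory)
```

Only the first right-hand substitution agrees on ?y, so a single merged substitution is passed on. Nothing is rewritten to make the patterns look alike; the bindings already computed are simply compared.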

Terminal Nodes

These correspond to the end of the LHS of an N3 rule and are associated with its RHS. When 'activated' they 'fire' the rule: they apply the propagated variable substitution to the RHS and add the newly inferred triples to the network and to the source graph.
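Firing can be sketched as instantiating the RHS patterns with the substitution that reached the terminal node. The fire function below is hypothetical, and the graph is modelled as a plain set of triples; feeding the returned triples back into the network as new tokens is left out.

```python
def fire(rhs_patterns, bindings, graph):
    """Instantiate the RHS with the bindings and add unseen triples to the graph.

    Returns the triples that were newly inferred, which in a full network
    would re-enter it as addition tokens.
    """
    inferred = []
    for pattern in rhs_patterns:
        # Variables are replaced by their bound terms; constants pass through.
        triple = tuple(bindings.get(term, term) for term in pattern)
        if triple not in graph:
            graph.add(triple)
            inferred.append(triple)
    return inferred

graph = {("ex:chimezie", "ex:is", "ex:snoring")}
rhs = [("?x", "rdf:type", "ex:SleepingPerson")]
new_triples = fire(rhs, {"?x": "ex:chimezie"}, graph)
```

Skipping triples that are already present is what lets repeated firing reach a fixed point in this sketch.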

Testing and Visualizing RDF/N3 Rete Networks

I've been able to adequately test the compilation process (the first of two parts in the original algorithm), using a visual aid. I've been developing a library for generating Boost Graph Library DiGraphs from RDFLib graphs - called Kaleidos. The value being in generating GraphViz diagrams, as well as access to a whole slew of graph heuristics / algorithms that could be infinitely useful for RDF graph analysis and N3 rule network analysis:

  • Breadth First Search
  • Depth First Search
  • Uniform Cost Search
  • Dijkstra's Shortest Paths
  • Bellman-Ford Shortest Paths
  • Johnson's All-Pairs Shortest Paths
  • Kruskal's Minimum Spanning Tree
  • Prim's Minimum Spanning Tree
  • Connected Components
  • Strongly Connected Components
  • Dynamic Connected Components (using Disjoint Sets)
  • Topological Sort
  • Transpose
  • Reverse Cuthill-McKee Ordering
  • Smallest Last Vertex Ordering
  • Sequential Vertex Coloring

Using Kaleidos, I'm able to generate visual diagrams of Rete networks compiled from RDF/OWL/N3 rule sets.

However, the heavy cost of using BGL is the compilation of BGL and its Python bindings, which is an involved process when building from source.

Chimezie Ogbuji

via Copia

FuXi v0.6 - Rearchitected / Repackaged

I've been experimenting with the use of FuXi as an alternative in situations where I had been manipulating application-specific RDF content using Versa within a host language (XSLT). In some cases I've been able to reduce a very complex set of XSLT logic to 1-2 queries on RDF models extended via a handful of very concise rules (expressed as N3). I'm hoping to build some use cases to demonstrate this later.

The result is that I've rearchitected FuXi to work as a blackbox directly with a 4RDF Model instance (it is now query agnostic, so it can be plugged in as an extension library to any other/future RDF querying language bound to a 4RDF model). Prior to this version, it was extracting formulae statements by Versa queries instead of directly through the Model interfaces.

Right now I primarily use it through a single Versa function, prospective-query. Below is an excerpt from the README.txt describing its parameters:

prospective-query

prospective-query( ruleGraph, targetGraph, expr, qScope=None)

Using FuXi, it takes all the facts from the current query context (which may or may not be scoped) and the rules from the <ruleGraph> scope, and invokes/executes the Rete reasoner. It adds the inferred statements to the <targetGraph> scope. Then, it performs the query <expr> within the <qScope> (or the entire model if None), removing the inferred statements upon exit.
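The add-query-remove lifecycle described in the excerpt can be sketched in plain Python. This is not the Versa function itself and does not touch a 4RDF model: scopes are modelled as sets of triples, and the reasoner and the query are passed in as hypothetical callables.

```python
def prospective_query(facts, infer, target, query):
    """Sketch of the lifecycle: infer, add to target, query, then clean up.

    facts  -- set of triples from the current query context
    infer  -- callable mapping facts to inferred triples (stand-in for the reasoner)
    target -- set standing in for the targetGraph scope
    query  -- callable evaluated over the facts plus the target scope
    """
    inferred = set(infer(facts)) - facts - target
    target.update(inferred)
    try:
        return query(facts | target)
    finally:
        # Remove only what this call added, even if the query raises.
        target.difference_update(inferred)

facts = {("ex:chimezie", "ex:is", "ex:snoring")}
target = set()
result = prospective_query(
    facts,
    lambda fs: {(s, "rdf:type", "ex:SleepingPerson") for s, p, o in fs if o == "ex:snoring"},
    target,
    lambda g: sorted(s for s, p, o in g if o == "ex:SleepingPerson"),
)
```

Using try/finally is one way to guarantee that the inferred statements are removed upon exit, which is the property the README excerpt emphasizes.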


FuXi is now a proper Python package (with a setup.py) and I've moved it (permanently - I hope) to: http://copia.ogbuji.net/files/FuXi

I was a little unclear on Pychinko's specific dependencies with rdflib and cwm in my previous post, but Yarden Katz cleared up the confusion in his comments (thanks).

The installation and use of FuXi should be significantly easier than before with the recent inclusion of the N3 deserializer/parser into 4Suite.


FuXi - Versa / N3 / Rete Expert System

Pychinko is a Python implementation of the classic Rete algorithm which provides the inferencing capabilities needed by an Expert System. Part of Pychinko works on top of cwm / afon out of the box. However, its Interpreter only relies on rdflib to formally represent the terms of an RDF statement.

FuXi only relies on Pychinko itself, the N3 deserializer for persisting N3 rules, and rdflib's Literal and URIRef to formally represent the corresponding terms of a Pychinko Fact. FuXi consists of 3 components (in addition to a 4RDF model for Versa queries):

I. FtRdfReteReasoner

Uses Pychinko and N3RuleExtractor to reason over a scoped 4RDF model.

II. N3RuleExtractor

Extracts Pychinko rules from a scoped model with statements deserialized from an N3 rule document

III. 4RDF N3 Deserializer

see: N3 Deserializer

The rule extractor reverses the reification of statements contained in formulae/contexts as performed by the N3 processor. It uses three Versa queries for this.

Using the namespace mappings:

Extract antecedent statements of logical implications

distribute(
  all() |- log:implies -> *,
  '.',
  '. - n3r:statement -> *'
)

Extract implied / consequent statements of logical implications

distribute(
  all() - log:implies -> *,
  '.',
  '. - n3r:statement -> *'
)

Extract the terms of an N3 reified statement

distribute(
  <statement>,
  '. - n3r:subject -> *',
  '. - n3r:predicate -> *',
  '. - n3r:object -> *'
)
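To make the intent of the three queries concrete, here is a rough pure-Python equivalent over a list of triples. It is not how N3RuleExtractor is implemented: prefixed names are kept as plain strings, the blank node labels are made up, and the n3r property names are simply taken from the queries above.

```python
def unreify(triples, statement):
    """Collect the subject, predicate and object of one reified statement."""
    parts = {p: o for s, p, o in triples if s == statement}
    return (parts["n3r:subject"], parts["n3r:predicate"], parts["n3r:object"])

def statements_of(triples, formula):
    """Un-reify every statement attached to a formula via n3r:statement."""
    return [unreify(triples, o) for s, p, o in triples
            if s == formula and p == "n3r:statement"]

def extract_rules(triples):
    """Return (antecedent, consequent) pairs for each log:implies triple."""
    return [(statements_of(triples, lhs), statements_of(triples, rhs))
            for lhs, pred, rhs in triples if pred == "log:implies"]

# Reified form of {?x ex:is ex:snoring} => {?x a ex:SleepingPerson}
reified = [
    ("_:f1", "log:implies", "_:f2"),
    ("_:f1", "n3r:statement", "_:s1"),
    ("_:s1", "n3r:subject", "?x"),
    ("_:s1", "n3r:predicate", "ex:is"),
    ("_:s1", "n3r:object", "ex:snoring"),
    ("_:f2", "n3r:statement", "_:s2"),
    ("_:s2", "n3r:subject", "?x"),
    ("_:s2", "n3r:predicate", "rdf:type"),
    ("_:s2", "n3r:object", "ex:SleepingPerson"),
]
rules = extract_rules(reified)
```

The subjects of log:implies give the antecedent formulae and the objects give the consequents, mirroring the first two queries; unreify plays the role of the third.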

The FuXi class provides methods for performing scoped Versa queries on a model extended via inference or on just the inferred statements:

For example, take the following fact document deserialized into a model:

@prefix : <http://foo/bar#> .
:chimezie :is :snoring .

Now consider the following rule:

@prefix ex: <http://foo/bar#> .
{?x ex:is ex:snoring} => {?x a ex:SleepingPerson} .
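Independent of FuXi and Versa, the inference this rule licenses over the fact document can be traced in a few lines of plain Python. The sketch below handles only a single-pattern antecedent, uses full URIs as strings, and the helper names are hypothetical.

```python
EX = "http://foo/bar#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

facts = {(EX + "chimezie", EX + "is", EX + "snoring")}
rule = {
    "if": [("?x", EX + "is", EX + "snoring")],
    "then": [("?x", RDF_TYPE, EX + "SleepingPerson")],
}

def match(pattern, triple):
    """Return variable bindings if the triple matches the pattern, else None."""
    bindings = {}
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if bindings.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return bindings

def infer(facts, rule):
    """Apply a rule with a single antecedent pattern once over the facts."""
    (pattern,) = rule["if"]
    inferred = set()
    for triple in facts:
        bindings = match(pattern, triple)
        if bindings is not None:
            for head in rule["then"]:
                inferred.add(tuple(bindings.get(term, term) for term in head))
    return inferred - facts

extended = facts | infer(facts, rule)
# Plain-Python stand-in for the Versa query type(ex:SleepingPerson)
sleepers = sorted(s for s, p, o in extended
                  if p == RDF_TYPE and o == EX + "SleepingPerson")
```

Here sleepers contains only http://foo/bar#chimezie, which is the resource the type query would be expected to return on the extended model.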

Below is a snapshot of FuXi performing the Versa query “type(ex:SleepingPerson)” on a model extended by inference using the above rule:

Who was FuXi? Author of the predecessor to the King Wen Sequence
