Rete-inspired N3 Rule Network Finished

See: previous

I called it quits (for now) in trying to retrofit the RETE-based algorithm to handle N3 functions & filters. I've checked in a rewritten Rete module with tests for SHIOF Description Logic axioms.

Tracking dependencies between filter / function variables, became hairy.

An SVG diagram of these rules compiled into a RETE-like network was generated using an included Boost Graph Library utility function.

It's unit test output:

testRules (__main__.TestEvaluateNetwork) ... Time to build production rule (RDFLib): 0.0118989944458 seconds
Time to calculate closure on working memory: 0.478726148605 seconds
ok
----------------------------------------------------------------------
Ran 1 test in 0.751s
OK

I also integrated Python iterator algebra implementation of a relational hash join (for the Beta Nodes).

It workes with RDFLib, so I'd eventually like to integrate the ability to dispatch SPARQL queries over an N3 Closure Graph - generated from the network. The speed in which it is able to render rules graphs rendered the option of serializing a compiled network into persistence a non-issue.

From the README.txt:

Fu Xi (pronounced foo-see) is a forward-chaining inferencing expert system for RDFLib Notation 3 graphs. It is named after the first mythical soveriegn of ancient china who supposedly, 'looked upward and contemplated the images in the heavens, and looked downward and contemplated the occurrences on earth.'.

Originally, it was an idea to express the formalisms of the Yi Jing / I Ching in Description & First Order Logic.

It relies on Charles Forgy's Rete algorithm for the many pattern/many object match problem - which builds a triple pattern reasoning network. It can be used as a reasoner with capabilities for certain expressive Description Logics (via OWL/RDFS axioms in N3) or a general N3 production / expert system. It uses Python hash / set / list idioms to maximize matching efficiency of the network. In particular, it uses an Iterator algebra implementation for the join activation mechanism

An example of its use:

from FuXi.Rete.Util import generateTokenSet
        from FuXi.Rete import *
        from rdflib.Graph import Graph,ReadOnlyGraphAggregate
        from rdflib import plugin
        from FuXi.Rete.RuleStore import N3RuleStore
        store = plugin.get('IOMemory',Store)()
        store.open('')
        ruleGraph = Graph(N3RuleStore())
        ruleGraph.parse(open('some-rule-file.n3'),format='n3')             
        factGraph = Graph(store)
        factGraph.parse(open('fact-file'),format='..')
        closureDeltaGraph = Graph(store)            
        network = ReteNetwork(ruleStore,
                              initialWorkingMemory=generateTokenSet(factGraph),
                              inferredTarget = closureDeltaGraph)            
        for s,p,o in network.closureGraph(factGraph):
            .. do something with triple ..

To check out from cvs:

cvs -d:pserver:anonymous@cvs.4suite.org:/var/local/cvsroot login
cvs -d:pserver:anonymous@cvs.4suite.org:/var/local/cvsroot get Fuxi

Chimezie Ogbuji

via Copia

4 responses

"foo-shee" would be a better pronunciation guide.

— John Cowan

Interesting approach, but using a algebraic isn't going to be faster than strict RETE implementation. If I understand the code in the article linked in the blog, it's still going to evaluate more nodes than a normal RETE implementation. There's several papers that explain it mathematically, but I don't remember them off the top of my head.

In Forgy's original Phd thesis, i believe went over the benefits of RETE from a mathematic perspective. I haven't read the paper in a long time, since it's rather long. The only place that explains this stuff is iLog JRules documentation, but I think they no longer make it available on their website.

the down side of using the algebra iterator approach is this. Is it compatable with adding Hash indexing to the BetaNode like the one I describe in my blog? I don't know if it is, but I suspect it might make it much harder. With a good design, adding Hash indexing to the betaNodes "can be easy". If you compare the implementation between drools3 and Sumatra, you'll see Sumatra is much simpler and slightly faster.

— Peter Lin

Interesting approach, but using a algebraic isn't going to be faster than strict RETE implementation. If I understand the code in the article linked in the blog, it's still going to evaluate more nodes than a normal RETE implementation. There's several papers that explain it mathematically, but I don't remember them off the top of my head.

Interesting you mention that, Peter, because I actually revisited the iterator algebra code yesterday with some tweaks. As FuXi is currently, the iterator-algebraic join works over all the tokens in the left and right memories using a predicate function that returns the value(s) associated with the variables common to both the left and the right.

the down side of using the algebra iterator approach is this. Is it compatable with adding Hash indexing to the BetaNode like the one I describe in my blog?

I'll take a look, but it probably is. The hashing method I've been using with betanodes is to have the memory associated with each betanode have a dictionary where the keys are the values of the common variables (between both memories) and the values are a list of all the tokens with those values.

So, my second attempt has the iterator join work instead over the keys of both memories (effectively reducing the total iterations by collapsing tokens with the same values for the common variables into one entry in the dictionary)

And yes, hashing the Beta nodes has been critical performance-wise for me.

I still need to fully reconcile the join algebra with Forgy's description of inter-element features. As it stands (and I discovered this yesterday), I was only including variables that occured in both the left and right memories as being 'inter-element features' rather than any variable that occured twice or more (regardless of which memory its containing token was in ).

So, if I had in my left memory (using Forgy's OPL syntax):

(SomeType ^father <F>)

and in the right memory:

(SomeType ^parent <F> ^child1 <X> ^child2 <X>)

It would join on F only and output partial instanciations where there were differing values for X

— Chimezie

Little doubt, the dude is completely just.
comprar tablet pc

— loacarpuemec1978