Versa: Pattern Matching (article)

My Versa article (Versa: Path-Based RDF Query Language) is up, but I've recently been tinkering with Emeka and haven't been able to post about it. I wanted Emeka functional so people could familiarize themselves with the language by example instead of by deciphering the specification. Simply saying ".help" in a channel where he is located (#swig, #swhack, #4suite, #foaf) should be sufficient. If his commands interfere with an existing bot's, please let me know.

The article is based (in part) on an earlier paper I wrote on Versa. I reworked it to focus more on the usage patterns it has in common with other existing query languages (SPARQL primarily) to make the point that RDF querying is truly not in its infancy any more. I also wanted to use it as a springboard to suggest some possible enhancements to an already (IMHO) expressive syntax (mostly borrowed from N3).

My hope is to spark some conversation across the opposing ends as well as get people familiar with the language for the betterment of RDF and RDF querying.

See an exchange between Dave Beckett and myself on the #swig scratchpad.

[Uche Ogbuji]

via Copia

Identifying BNodes via RDF Query

Sorry, I couldn't help but commence round 3 (I believe it is) of Versa vs SPARQL. In all honesty, however, this has more to do with RDF itself than with either query language. It is primarily motivated by a very informative and insightful take (by Benjamin Nowack) on the problems of identifying BNodes uniquely in a query. His final conclusion (as I understood it) is that although the idea of identifying BNodes directly by URI seems counter-intuitive to the very nature of BNodes (anonymous resources), it is a practical necessity (one that I have had to rely on more often than not with Versa, and one that caused him to venture outside the boundaries of the SPARQL specification for a solution). This is especially the case when you don't have much identifying metadata associated with the BNode in question (if you did, you could rely on inferencing - explicit or otherwise).

Well, ironically, the reason why this issue never occurred to me is that in Versa, you refer to resources (for identification purposes) by URI regardless of whether they are blank nodes or not. I guess I would interpret this functionality as leaving it up to the author of the query to understand the exact nature of BNode URIs (that they are transient, possibly inconsistent, etc.).
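To make the trade-off concrete, here is a minimal sketch in plain Python (not Versa, SPARQL or 4RDF). It assumes a toy store held as a list of tuples and blank nodes carrying store-assigned labels such as `_:b1`; those labels, the sample data and the helper names `describe` and `find_by` are all hypothetical and only serve the illustration.

```python
# Toy store: (subject, predicate, object) tuples. The "_:b1" / "_:b2" labels
# are made-up, store-local blank node identifiers (an assumption for this sketch).
triples = [
    ("_:b1", "foaf:name", "Alice"),
    ("_:b1", "foaf:mbox", "mailto:alice@example.org"),
    ("_:b2", "foaf:name", "Alice"),
]

def describe(node, triples):
    """Look a node up directly by its identifier, blank or not."""
    return [(p, o) for (s, p, o) in triples if s == node]

def find_by(predicate, value, triples):
    """Look nodes up by describing them with a single property value."""
    return sorted({s for (s, p, o) in triples if p == predicate and o == value})

# Direct identification: unambiguous, but only meaningful while the label lasts.
by_label = describe("_:b1", triples)

# Identification by description: only as good as the metadata available.
by_name = find_by("foaf:name", "Alice", triples)                     # two candidates
by_mbox = find_by("foaf:mbox", "mailto:alice@example.org", triples)  # one candidate

print(by_label)
print(by_name)
print(by_mbox)
```

With only a name to go on, the description matches both blank nodes; the mailbox narrows it to one. Addressing the node by its label sidesteps the ambiguity, at the cost of depending on an identifier the store is free to change.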

Chimezie Ogbuji

via Copia

Pythonic SPARQL API over rdflib

I've recently been investigating the possibility of adapting an existing SPARQL parser/query engine on top of 4RDF - mostly for the eventual purpose of implementing a sparql-eval Versa extension function - and was pleased to see there has already been some similar work done:

Although this isn't exactly what I had in mind (the more robust option would be to write an adaptor for Redland's model API and execute SPARQL queries via rasqal), it provides an interesting pythonic analog to querying RDF.

Chimezie Ogbuji

via Copia

Versa by Deconstruction

I was recently compelled to write an introductory companion to the Versa specification. The emphasis for this document (located here) is on readers with little to no experience with formal language specifications and/or with the RDF data model. It is inspired by its predecessors (which make good follow-up material):

I initially started using Open Office Writer to compose an Open Office Document and export it to an HTML document. But I eventually decided to write it in MarkDown and use pymarkdown to render it to an HTML document stored on Copia.

The original MarkDown source is here.

-- Chimezie

[Uche Ogbuji]

via Copia

Rewriting Source Content Descriptions as Versa Queries

I recently read Morten Frederiksen's blog entry about implementing Source Content Descriptions as SPARQL queries in Redland and was quite interested, especially by the consideration that such queries could be automatically generated and that the set of queries you would want to ask is small and straightforward. Even more interesting was Morten's step-by-step walk-through of how such queries would be translated to SQL queries on a Redland triple store sitting on top of MySQL (my favorite RDBMS deployment for 4RDF as well).

However, I couldn't help but wonder how such a set of queries would be expressed in Versa (in my opinion, a language more aligned with the data model it queries than its SQL-RDQL counterparts). So below is my attempt to port the queries into Versa:

Classes used in the store

SPARQL
SELECT DISTINCT ?Class
WHERE { ?R rdf:type ?Class }
Versa
set(all() - rdf:type -> *)
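For readers who know neither syntax, the result both queries are after can be sketched in plain Python. This is only an illustration of the intended answer, not how Redland or 4RDF evaluate anything; the sample triples and the helper name `classes_in_store` are made up for the sketch.

```python
RDF_TYPE = "rdf:type"

# Toy store: (subject, predicate, object) tuples with illustrative data.
triples = [
    ("ex:alice", RDF_TYPE, "foaf:Person"),
    ("ex:bob", RDF_TYPE, "foaf:Person"),
    ("ex:doc1", RDF_TYPE, "foaf:Document"),
    ("ex:alice", "foaf:name", "Alice"),
]

def classes_in_store(triples):
    """Distinct objects of rdf:type statements, i.e. the classes in use."""
    return {o for (s, p, o) in triples if p == RDF_TYPE}

classes = classes_in_store(triples)
print(sorted(classes))
```

The Python set plays the same role as DISTINCT in the SPARQL query and set() in the Versa one: foaf:Person is reported once even though two resources have that type.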

Predicates that are used with instances of each class

SPARQL
SELECT DISTINCT ?Class, ?Property
  WHERE { ?R rdf:type ?Class .
        OPTIONAL { ?R ?Property ?Object .
                   FILTER ?Property != rdf:type } }
Versa
difference(
  properties(set(all() - rdf:type -> *)),
  set(rdf:type)
)
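As a cross-check on what this question asks for, here is a plain Python sketch that follows the shape of the SPARQL form: one entry per class, holding the predicates (other than rdf:type) used on its instances. The data and the helper name `predicates_by_class` are hypothetical, and the grouping per class is my reading of the SPARQL query rather than a statement about how either engine returns results.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"

# Toy store: (subject, predicate, object) tuples with illustrative data.
triples = [
    ("ex:alice", RDF_TYPE, "foaf:Person"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", RDF_TYPE, "foaf:Person"),
    ("ex:bob", "foaf:name", "Bob"),
    ("ex:doc1", RDF_TYPE, "foaf:Document"),
]

def predicates_by_class(triples):
    """Map each class to the non-rdf:type predicates used on its instances."""
    # First pass: record which classes each subject is an instance of.
    types = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            types[s].add(o)
    # Every class gets an entry, even with no other predicates (like OPTIONAL).
    result = {cls: set() for classes in types.values() for cls in classes}
    # Second pass: credit each remaining predicate to the subject's classes.
    for s, p, o in triples:
        if p != RDF_TYPE:
            for cls in types.get(s, ()):
                result[cls].add(p)
    return result

by_class = predicates_by_class(triples)
for cls in sorted(by_class):
    print(cls, sorted(by_class[cls]))
```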

Do all instances of each class have a statement with each predicate?

It wasn't clear to me if the intent was to check whether all classes have a statement with each predicate as specified by an ontology or just to count how many properties each class instance has. The latter interpretation is the one I went with (it's also simpler). This particular query will return a list of lists, each inner list consisting of two values: the URI of a distinct class instance and the number of distinct properties used in statements about it (excluding rdf:type).

Versa
distribute(
  set(all() |- rdf:type -> *),
  '.',
  'length(
    difference(
      properties(.),
      set(rdf:type)
    )
  )'
)
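The list-of-lists result described above can be sketched in plain Python as follows. This mirrors the interpretation taken here (count distinct properties per typed resource, ignoring rdf:type) and nothing more; the sample triples and the helper name `property_counts` are invented for the illustration.

```python
RDF_TYPE = "rdf:type"

# Toy store: (subject, predicate, object) tuples with illustrative data.
triples = [
    ("ex:alice", RDF_TYPE, "foaf:Person"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:knows", "ex:carol"),  # same property twice counts once
    ("ex:bob", RDF_TYPE, "foaf:Person"),
    ("ex:bob", "foaf:name", "Bob"),
    ("ex:doc1", RDF_TYPE, "foaf:Document"),
]

def property_counts(triples):
    """Return [instance, number of distinct non-rdf:type properties] pairs."""
    # Any subject of an rdf:type statement is treated as a class instance.
    instances = {s for (s, p, o) in triples if p == RDF_TYPE}
    counts = []
    for inst in sorted(instances):
        props = {p for (s, p, o) in triples if s == inst and p != RDF_TYPE}
        counts.append([inst, len(props)])
    return counts

counts = property_counts(triples)
print(counts)
```

Sorting the instances is only there to make the output stable; the query itself makes no promise about ordering.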

Is the type of object in a statement with each class/predicate combination always the same?

I wasn't clear on the intent of this query, either. I wasn't sure if he meant to ask this of the combination with all predicates defined in an ontology or all predicates on class instances in the graph being queried.

But there you have it.

NOTE: The use of the set function was in order to guarantee that only distinct values were returned and may have been used redundantly with functions and expressions that already account for duplication.

[Uche Ogbuji]

via Copia

SPARQL versus Versa

Booyakasha! In a few simple examples, Chime illustrates just why I was so annoyed when I read the SPARQL spec drafts. Eric also has some good words on the matter. Sure, I'm biased as one of the inventors of Versa, but my reaction has more to do with SPARQL than Versa. Frankly, SPARQL bends my brain and twists my gut. Before I continue with my rant, I should say that I'm not blameless in this matter. I have a huge respect for the people working on SPARQL, and a lot of them (Dan Brickley, Libby Miller and Kendall Clark come to mind) were very polite in trying to get me more directly engaged in the standardization process. I just never had the time for more than the informal discussions I had with these folks, and apparently those who prefer SQLish syntax ended up dominating the important discussion or decisions.

It has never been Versa or the highway for me, but I was never going to swallow an RDF query language that used SQLish syntax. I always wanted a path-like language, preferably with a very "composable" syntax (which is why I went with such a functional language flavor in Versa). I'm far from alone in this. There have been many other respectable "pathy" RDF query proposals, and the feedback on Versa has been almost universally positive.

Apparently some people are very tied to their "SELECT"s. Isn't there room for those of us who just find it way too much of a conceptual mismatch from SQL conventions to RDF graphs? I have no choice but to make my own room. I'll continue working on Versa: it's time to start gathering my Versa 2.0 thoughts together. I'll implement Versa 2.0 for 4Suite, and help anyone who wants to implement it for any other tool (I hope that encourages Eric a bit). I may work on a Versa to SPARQL converter, but honestly, that's as much as I expect to ever have to do with SPARQL. No offense to any of the fine people involved. It just doesn't come close to fitting my head.

Chimezie Ogbuji

via Copia