Copia

Why Web Architecture Shouldn't Dictate Meaning

This is a very brief demonstration motivated by some principled arguments I've been making over the last week or so regarding Web Architecture dictates which are ill-conceived and may do more damage to the Semantic Web than good. A more fully articulated argument is sketched out in "HTTP URIs are not Without Expense" and "Semiotics of RDF Signs", in particular the argument that most of the httpRange-14 dialog confuses dereference with denotation. I've touched on some of this before.

Anywho, the URI I've minted for myself is

http://metacognition.info/profile/webwho.xrdf#chime

When you 'dereference' it, the server responds with:

chimezie@otherland:~/workspace/Ontologies$ curl -I http://metacognition.info/profile/webwho.xrdf#chime
HTTP/1.1 200 OK
Date: Thu, 30 Aug 2007 06:29:12 GMT
Server: Apache/2.2.3 (Debian) DAV/2 SVN/1.4.2 mod_python/3.2.10 Python/2.4.4 PHP/4.4.4-8+etch1 proxy_html/2.5 mod_ssl/2.2.3 OpenSSL/0.9.8c mod_perl/2.0.2 Perl/v5.8.8
Last-Modified: Mon, 23 Apr 2007 03:09:22 GMT
Content-Length: 6342
Via: 1.1 www.metacognition.info
Expires: Thu, 30 Aug 2007 07:28:26 GMT
Age: 47
Content-Type: application/rdf+xml

According to TAG dictate, a 'web' agent can assume it refers to a document (yes, apparently an RDF document composed this blog you are reading).

Update: Bijan points out that my example is mistaken. The TAG dictate only allows the assumption to be made of the URI which goes across the wire (the URI with the fragment stripped off). The RDF (FOAF) doesn't make any assertions about this (stripped) URI being a foaf:Person. This is technically correct, however, the concern I was highlighting still holds (albeit it is more likely to confuse folks who are already confused about dereference and denotation). The assumption still gets in the way of 'proper' interpretation. Consider if I had used the FOAF graph URL as the URL for me. Under which mantra would this be taboo? Furthermore, if I wanted to avoid confusing unintelligent agents such as this one above, which URI scheme would I be likely to use? Hmmm...
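To make Bijan's point concrete, here is a minimal sketch (using only the Python standard library) of how a client splits the URI I minted into the part that goes across the wire and the fragment that never leaves the client. It only illustrates the split; it says nothing about what either URI denotes.

```python
from urllib.parse import urldefrag

# The URI minted in this post, fragment included.
minted = "http://metacognition.info/profile/webwho.xrdf#chime"

# urldefrag separates the fragment from the rest of the URI. A client keeps
# the fragment to itself; only the remaining URL is sent in the HTTP request.
on_the_wire, fragment = urldefrag(minted)

print(on_the_wire)  # http://metacognition.info/profile/webwho.xrdf
print(fragment)     # chime
```

The TAG assumption applies to the first value only; the FOAF graph makes its statements about the full URI, fragment and all.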

Okay, a more sophisticated semantic web agent parses the RDF and understands (via the referential mechanics of model theory) that the URI denotes a foaf:Person (much more reasonable). This agent is also much better equipped to glean 'meaning' from the model-theoretic statements made about me instead of jumping to binary conclusions.

So I ask you, which agent is hampered by a dictate that has all to do with misplaced pragmatics and nothing to do with semantics? Until we understand that the 'Semantic Web' is not 'Web-based Semantics', Jim Hendler's question about where all the agents are (Where are all the agents?) will continue to go unanswered and Tim Bray's challenge will never be fulfilled.

A little tongue-in-cheek, but I hope you get the point

Chimezie Ogbuji

via Copia

Linked Data and Overselling the HTTP URI Scheme

So, I'm going to do something which may not be well-received: I'm going to push back (slightly) on the Linked Data movement, because, frankly, I think it is a bit draconian with respect to the way it oversells the HTTP URI scheme (points 2 and 3):

2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information.

There is some interesting overlap as well between this overselling and a recent W3C TAG finding which takes a close look at motivations for 'inventing' URI schemes instead of re-using HTTP. The word 'inventing' seems to suggest that the URI specification discourages the use of URI schemes beyond the most popular one. Does this really only boil down to an argument of popularity?

So, here is an anecdotal story that is based part in fiction and part in fact. A vocabulary author within an enterprise has (at the very beginning) a small domain in mind that she wants to build some consensus around by developing an RDF vocabulary. She doesn't have any authority with regards to web space within (or outside) the enterprise. Does she really have to stop developing her vocabulary until she has selected a base URI from which she can guarantee that something useful can be dereferenced from the URIs she mints for her terms? Is it really the case that her vocabulary has no 'semantic web' value until she does so? Why can't she use the tag scheme (for instance) to identify her terms first and then worry later about the location of the vocabulary definition? After all, those who push the HTTP URI scheme as a panacea must be aware that URIs are about identification first and location second (and this latter characteristic is optional).
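As a rough illustration of how little the vocabulary author needs in order to get started, here is a sketch that mints tag URIs following the general shape described in RFC 4151 (tag:authority,date:specific). The email address, date, and term names are hypothetical, and mint_tag_uri is a helper made up for this example, not part of any library.

```python
from datetime import date


def mint_tag_uri(authority: str, minted_on: date, specific: str) -> str:
    """Build a tag URI of the form tag:<authority>,<YYYY-MM-DD>:<specific>."""
    return f"tag:{authority},{minted_on.isoformat()}:{specific}"


# Hypothetical authority (an email address the author controlled on that date)
# and a hypothetical term from her vocabulary.
term = mint_tag_uri("author@example.com", date(2007, 7, 1), "vocab/Patient")

print(term)  # tag:author@example.com,2007-07-01:vocab/Patient
```

No web space is required: the only thing she asserts is that she held that address on that date, which is enough to identify the term.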

Over the years, I've developed an instinct to immediately question arguments that suggest a monopoly on a particular approach. This seems to be the case here. Proponents of an HTTP URI scheme monopoly for follow-your-nose mechanics (or auto-discovery of useful RDF data) seem to suggest (quite strongly) that using anything besides the HTTP URI scheme is bad practice, without actually saying so. So, if this is not the case, my original question remains: is it just a URI scheme popularity contest? If the argument is to make it easy for clients to build web closure, then I've argued before that there are better ways to do this without stressing the protocol with brute force and unintelligent term 'sniffing'.

It seems to be a much better approach to be unambiguous about the trail left for software agents by using an explicit term (within a collection of RDF statements) to point to where additional useful information can be retrieved for said collection of RDF statements. There is already decent precedent in terms such as rdfs:seeAlso and rdfs:isDefinedBy. However, these terms are very poorly defined and woefully abused (the latter term especially).

Interestingly, I was introduced to this "meme" during a thread on the W3C HCLS IG mailing list about the value of the LSID URI scheme and whether it is redundant with respect to HTTP. I believe this disconnect was part of the motivation behind the recent TAG finding: URNs, Namespaces and Registries. Proponents of an HTTP URI scheme monopoly should educate themselves (as I did) on the real problems faced by those who found it necessary to 'invent' a URI scheme to meet needs they felt were not properly addressed by the mechanics of the HTTP protocol. They reserve that right, as the URI specification does not endorse any monopolies on schemes. See: LSID Pros & Cons

Frankly, I think fixing what is broken with rdfs:isDefinedBy (and pervasive use of rdfs:seeAlso - FOAF networks do this) is sufficient for solving the problem that the Linked Data theme is trying to address, but much less heavy-handedly. What we want is a way to say:

this collection of RDF statements is 'defined' (ontologically) by these other collections of RDF statements.

Or we want to say (via rdfs:seeAlso):

with respect to this current collection of RDF statements you might want to look at this other collection
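Here is a small sketch of what following only the explicit trail might look like. Triples are plain Python tuples rather than objects from a real RDF library, and the subject, name, and example.com URLs are hypothetical; only the rdfs: and foaf: namespace URIs are real.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
FOAF = "http://xmlns.com/foaf/0.1/"

# The only predicates this agent treats as "more information lives here".
LINK_PREDICATES = {RDFS + "seeAlso", RDFS + "isDefinedBy"}


def explicit_links(triples):
    """Return objects of rdfs:seeAlso / rdfs:isDefinedBy, in order, no duplicates."""
    links = []
    for _subject, predicate, obj in triples:
        if predicate in LINK_PREDICATES and obj not in links:
            links.append(obj)
    return links


# Hypothetical graph: the subject is a tag URI, so nothing about the
# identifier itself implies a location.
me = "tag:author@example.com,2007-07-01:me"
graph = [
    (me, FOAF + "name", "A. Author"),
    (me, RDFS + "seeAlso", "http://example.com/more.rdf"),
    (me, FOAF + "homepage", "http://example.com/"),
]

links = explicit_links(graph)
print(links)  # ['http://example.com/more.rdf']
```

The homepage and the vocabulary terms are never fetched; the agent retrieves only what the graph explicitly says is worth retrieving.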

It is also worth noting the FOAF namespace URI issues which recently 'broke' Protege. It appears some OWL tools (Protege, at the time) were making the assumption that the FOAF OWL RDF graph would always be resolvable from the base namespace URI of the vocabulary: http://xmlns.com/foaf/0.1/ . At some point recently, the namespace URI stopped serving up the OWL RDF/XML and instead served up the specification. Nowhere in the human-readable specification (which, during that period, was what was being served up from that URI) is there a declaration that the OWL RDF/XML is served up from that URI. The only explicit link is to: http://xmlns.com/foaf/spec/20070114.rdf

However, how did Protege come to assume that it could always get the FOAF OWL RDF/XML from the base URI? I'm not sure, but the short of it was that any vocabulary which referred to FOAF (at that point) could not be read by Protege (including my foundational ontology for Computerized Patient Records - which has since moved away from using FOAF for reasons that included this break in Protege).

The problem here is that Protege should not have been making that assumption but should have (instead) only attempted to assume an OWL RDF/XML graph could be dereferenced from a URI if that URI is the object of an owl:imports statement. I.e.,

http://example.com/ont owl:imports http://xmlns.com/foaf/spec/20070114.rdf

This is unambiguous, as owl:imports is very explicit about what the URI at the other end points to. If you set up semantic web clients to assume they will always get something useful from the URI used within an RDF statement, or that HTTP-schemed URIs in an RDF statement are always resolvable, then you set them up for failure or at least a lot of unnecessary web crawling in random directions.
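To put a number on the "unnecessary web crawling", here is a sketch contrasting a client that tries every HTTP URI it sees with one that only follows owl:imports. Triples are plain tuples, the ont#Clinician class is hypothetical, and both functions are illustrations rather than a description of how Protege or any other tool actually behaves.

```python
from urllib.parse import urldefrag

OWL_IMPORTS = "http://www.w3.org/2002/07/owl#imports"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

graph = [
    ("http://example.com/ont", OWL_IMPORTS,
     "http://xmlns.com/foaf/spec/20070114.rdf"),
    # Hypothetical class that merely refers to a FOAF term.
    ("http://example.com/ont#Clinician", RDFS_SUBCLASS,
     "http://xmlns.com/foaf/0.1/Person"),
]


def sniffed_uris(triples):
    """Every distinct HTTP URL (fragment removed) found in any triple position."""
    found = set()
    for triple in triples:
        for term in triple:
            if term.startswith("http://"):
                found.add(urldefrag(term).url)
    return found


def imported_uris(triples):
    """Only the objects of owl:imports statements."""
    return {obj for _subject, predicate, obj in triples if predicate == OWL_IMPORTS}


print(len(sniffed_uris(graph)))  # 5
print(imported_uris(graph))      # {'http://xmlns.com/foaf/spec/20070114.rdf'}
```

Even for two statements, the sniffing client makes five requests, four of which the graph never asked for.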

My $0.02

Chimezie Ogbuji

via Copia

Programmatic Access to Repository (SOAP/FtRPC)

This is meant to be a follow-up to my last entry, covering the programmatic remote protocols supported by the 4Suite repository.

FtRPC

The 4Suite repository supports an internal RPC protocol with a Python implementation that provides programmatic access to the repository. The automated 4Suite build process has recently changed significantly, so you can browse the Python API documentation (courtesy of John L Clark's build). Each repository can serve its own instance (the default port is 8803).

SOAP

The 4Suite repository supports a SOAP mapping that essentially attempts to serve as a translation mechanism between SOAP and the internal repository API. A repository instance can manage a SOAP server instance (the default port is 8090).

SOAP Service Namespace

The namespace URI associated with the SOAP service is:

http://xmlns.4suite.org/reserved#services

Authentication

Each SOAP message can have authentication information in the SOAP Header. The format is:

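<SOAP-ENV:Header>
  <ftsoap:authenticationHeader>
    <ftsoap:sessionId>..</ftsoap:sessionId>
    <ftsoap:sessionKey>..</ftsoap:sessionKey>
    <ftsoap:authenticatingUser>.. user .. </ftsoap:authenticatingUser>
    <ftsoap:authenticatingPassword></ftsoap:authenticatingPassword>
  </ftsoap:authenticationHeader>
</SOAP-ENV:Header>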

where SOAP is bound to:

http://schemas.xmlsoap.org/soap/envelope/Header

Session authentication is supported with the first two header entries. The other two are for simple / basic authentication (very similar to HTTP scenario).
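As a sketch of the simple / basic case, the header could be assembled with Python's standard ElementTree module as shown here. Two assumptions: the ftsoap prefix is bound to the SOAP service namespace given above, and SOAP-ENV is bound to the standard SOAP 1.1 envelope namespace. The user name and password are made up, and basic_auth_header is a helper written for this example, not part of 4Suite.

```python
import xml.etree.ElementTree as ET

# Standard SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Assumption: the ftsoap prefix is bound to the SOAP service namespace.
FTSOAP = "http://xmlns.4suite.org/reserved#services"

# Ask ElementTree to serialize with the same prefixes used in this post.
ET.register_namespace("SOAP-ENV", SOAP_ENV)
ET.register_namespace("ftsoap", FTSOAP)


def basic_auth_header(user, password):
    """Build a SOAP Header carrying only the two basic-authentication entries."""
    header = ET.Element(f"{{{SOAP_ENV}}}Header")
    auth = ET.SubElement(header, f"{{{FTSOAP}}}authenticationHeader")
    ET.SubElement(auth, f"{{{FTSOAP}}}authenticatingUser").text = user
    ET.SubElement(auth, f"{{{FTSOAP}}}authenticatingPassword").text = password
    return header


# Hypothetical credentials, for illustration only.
header = basic_auth_header("alice", "secret")
print(ET.tostring(header, encoding="unicode"))
```

Session authentication would follow the same pattern with the sessionId and sessionKey entries instead.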

Message-to-Repo API Mapping

SOAP messages are invoked against repository resources. The local name of the SOAP-ENV:Body child element (in the ftsoap namespace) is mapped to the name of the method to invoke. The child elements are mapped to parameters to the methods. There are certain special parameters:

scrpath (the repository path of the resource to execute the method against)
base64 (a boolean value indicating whether or not the content is Base 64 encoded – false by default)
src (the content – transmitted as pure text or Base 64 encoded)
updateSrc (used by the xUpdate method as the XUpdate document – transmitted as pure text or Base 64 encoded)
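The mapping described above can be sketched roughly as follows: take the first child of the SOAP Body, use its local name as the method name, and turn its child elements into a dictionary of parameters. This is an illustration of the idea, not the 4Suite implementation; the sample message, its repository path, and the choice to leave the parameter elements unqualified are all assumptions.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' part that ElementTree puts in front of a tag."""
    return tag.rsplit("}", 1)[-1]


def map_message(body_xml):
    """Return (method name, parameter dict) for the first child of a SOAP Body."""
    body = ET.fromstring(body_xml)
    call = body[0]
    params = {local_name(child.tag): (child.text or "") for child in call}
    return local_name(call.tag), params


# Hypothetical message: the path and the placeholder XUpdate text are made up.
message = """\
<SOAP-ENV:Body xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
               xmlns:ftsoap="http://xmlns.4suite.org/reserved#services">
  <ftsoap:xUpdate>
    <scrpath>/docs/example.xml</scrpath>
    <updateSrc>(XUpdate document goes here)</updateSrc>
  </ftsoap:xUpdate>
</SOAP-ENV:Body>
"""

method, params = map_message(message)
print(method)             # xUpdate
print(params["scrpath"])  # /docs/example.xml
```

A real server would then look up the resource at scrpath and invoke the named method on it with the remaining parameters.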

The method is invoked with one of the following as the response:

ftsoap:successReponse (if there is nothing returned)
ftsoap:valueResponse (the value returned – its string representation)
ftsoap:Resource (if a resource itself is returned)
SOAP-ENV:Fault (includes Base 64 encoded traceback string)

ftsoap:Resource diagram

[Uche Ogbuji]

via Copia
