Recently, Dan Brickley expressed an interest in the extent to which Bioinformatic research efforts are leveraging RDF for temporal reasoning (and patient healthcare record integration - in general). The thread on the value of modeling temporal relations explicitly versus relying on them being built into core RDF semantics left me feeling like a concrete example was in order.
We have a large (3500+ assertions) OWL Full ontology describing all the data we collect about Cardiothoracic procedures (the primary purpose of our database as currently constituted – in a relational model). There are several high-level classes we use to model concepts that, though core to our model, can be thought of as general enough for a common upper ontology for patient data.
One of the classes is ptrec:TemporalData (from here on out, I'll be using the ptrec prefix to describe vocabulary terms in our ontology) which is the ancestor of all classes that are expressed on an axis of time. We achieve a level of precision in modeling data on a temporal axis that enhances the kind of statistical analysis we perform on a daily basis.
In particular we use three variables:
- ptrec:startDT (xs:dateTime)
- ptrec:stopDT (xs:dateTime)
- ptrec:instantDT (xs:dateTime)
The first two are used to describe an explicit (and 'proper') interval for an event in a patient record. This is often the case where the event in question only had a date associated with it. The third variable is used when the event is instantaneous and the associated date / time is known.
The biggest challenge isn't simply the importance of time in asking questions of our data, but the temporal factors that are keyed off specific, moving points of reference. For example, consider a case study on the effects of administering a medication within X days of a specific procedure. The qualifying procedure is key to the observations we wish to make and behaves as a temporal anchor. Another case study interested in the effects of administering the same medication, but with respect to a different procedure, should be expected to rely on the same temporal logic, only keyed off a different point in time. However, by being explicit about how we place temporal data on a time axis (as instants or intervals) we can outline a logic for general temporal reasoning that can be used by either case study.
Linking into an OWL time ontology, we can set up some simple Notation 3 rules for inferring interval relationships to aid such questions:

    # Inferring before and after temporal relationships (between instants and intervals alike)
    {?a a ptrec:TemporalData; ptrec:instantDT ?timeA.
     ?b a ptrec:TemporalData; ptrec:instantDT ?timeB.
     ?timeA str:greaterThan ?timeB}
      => {?a time:intAfter ?b. ?b time:intBefore ?a}.

    {?a a ptrec:TemporalData; ptrec:startDT ?startTimeA; ptrec:stopDT ?stopTimeA.
     ?b a ptrec:TemporalData; ptrec:startDT ?startTimeB; ptrec:stopDT ?stopTimeB.
     ?startTimeA str:greaterThan ?stopTimeB}
      => {?a time:intAfter ?b. ?b time:intBefore ?a}.

    # Inferring during and contains temporal relationships (between proper intervals)
    # Since there is no str:greaterThanOrEqual CWM function, the various permutations
    # are spelled out explicitly
    {?a a ptrec:TemporalData; ptrec:startDT ?startTimeA; ptrec:stopDT ?stopTimeA.
     ?b a ptrec:TemporalData; ptrec:startDT ?startTimeB; ptrec:stopDT ?stopTimeB.
     ?startTimeA str:lessThan ?startTimeB.
     ?stopTimeA str:greaterThan ?stopTimeB}
      => {?a time:intContains ?b. ?b time:intDuring ?a}.

    {?a a ptrec:TemporalData; ptrec:startDT ?startTimeA; ptrec:stopDT ?stopTimeA.
     ?b a ptrec:TemporalData; ptrec:startDT ?startTimeB; ptrec:stopDT ?stopTimeB.
     ?startTimeA str:equalIgnoringCase ?startTimeB.
     ?stopTimeA str:greaterThan ?stopTimeB}
      => {?a time:intContains ?b. ?b time:intDuring ?a}.

    {?a a ptrec:TemporalData; ptrec:startDT ?startTimeA; ptrec:stopDT ?stopTimeA.
     ?b a ptrec:TemporalData; ptrec:startDT ?startTimeB; ptrec:stopDT ?stopTimeB.
     ?startTimeA str:lessThan ?startTimeB.
     ?stopTimeA str:equalIgnoringCase ?stopTimeB}
      => {?a time:intContains ?b. ?b time:intDuring ?a}.
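Notice the value in xs:dateTime values being ordered temporally and lexicographically (as Unicode strings) simultaneously, provided they are all written in the same time zone and at the same precision. This allows us to rely on str:lessThan and str:greaterThan for determining interval intersection and overlap.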
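To make the string-ordering point concrete, here is a minimal Python sketch that mirrors the rules above using nothing but plain string comparison. The `TemporalData` class, the `relations` function, and the sample timestamps are all hypothetical illustrations rather than part of our ontology or of CWM, and the sketch assumes every timestamp uses the same format with no time zone offset.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class TemporalData:
    """Hypothetical stand-in for a ptrec:TemporalData resource."""
    name: str
    start_dt: Optional[str] = None    # ptrec:startDT
    stop_dt: Optional[str] = None     # ptrec:stopDT
    instant_dt: Optional[str] = None  # ptrec:instantDT


def _contains(a: TemporalData, b: TemporalData) -> bool:
    # Same three permutations as the N3 rules: (<, >), (=, >), (<, =).
    return (a.start_dt <= b.start_dt and a.stop_dt >= b.stop_dt
            and (a.start_dt, a.stop_dt) != (b.start_dt, b.stop_dt))


def relations(a: TemporalData, b: TemporalData) -> set:
    """Return the interval relations that hold from a to b."""
    found = set()
    if a.instant_dt and b.instant_dt:
        if a.instant_dt > b.instant_dt:
            found.add("intAfter")
        elif a.instant_dt < b.instant_dt:
            found.add("intBefore")
    if a.start_dt and a.stop_dt and b.start_dt and b.stop_dt:
        if a.start_dt > b.stop_dt:
            found.add("intAfter")
        if b.start_dt > a.stop_dt:
            found.add("intBefore")
        if _contains(a, b):
            found.add("intContains")
        if _contains(b, a):
            found.add("intDuring")
    return found


def string_order_matches_time_order(stamps: list) -> bool:
    """Check that sorting as strings agrees with sorting as datetimes."""
    return sorted(stamps) == sorted(stamps, key=datetime.fromisoformat)


# Made-up sample records
stay = TemporalData("stay", "2006-03-01T08:00:00", "2006-03-10T12:00:00")
op = TemporalData("op", "2006-03-03T09:00:00", "2006-03-03T13:30:00")
med = TemporalData("med", "2006-02-20T00:00:00", "2006-02-20T23:59:59")
```

With these records, `relations(stay, op)` yields the containment relation and `relations(med, op)` yields the before relation, using string comparison only.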
Terms such as 'preoperative' (which refer to events that occurred before a specific procedure / operation) and 'postoperative' (events that occurred after a specific procedure / operation), which are core to general medical research nomenclature, can be tied directly into this logic:
    {?a a ptrec:TemporalData.
     ?b a ptrec:Operation.
     ?a time:intBefore ?b}
      => {?a ptrec:preOperativeWRT ?b}.

    {?a a ptrec:TemporalData.
     ?b a ptrec:Operation.
     ?a time:intAfter ?b}
      => {?a ptrec:postOperativeWRT ?b}.
Here we introduce two terms (ptrec:preOperativeWRT and ptrec:postOperativeWRT) which relate temporal data with an operation in the same patient record. Using interval relationships as a foundation, you can link domain-specific temporal vocabulary into your temporal reasoning model and rely on a reasoner to set up a framework for temporal reasoning.
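As a rough illustration of what a reasoner does with these two rules, the following Python sketch applies them in a single forward-chaining pass over a set of triples. The function name and the resources `:aspirin`, `:echo` and `:cabg` are made up for the example; a real rule engine such as CWM handles this generically rather than with hand-written code.

```python
def apply_operative_rules(triples: set) -> set:
    """Derive ptrec:preOperativeWRT / ptrec:postOperativeWRT triples.

    Each triple is a (subject, predicate, object) tuple of strings.
    """
    temporal = {s for s, p, o in triples
                if p == "rdf:type" and o == "ptrec:TemporalData"}
    operations = {s for s, p, o in triples
                  if p == "rdf:type" and o == "ptrec:Operation"}
    # Interval relation -> domain-specific term, as in the two rules above
    mapping = {
        "time:intBefore": "ptrec:preOperativeWRT",
        "time:intAfter": "ptrec:postOperativeWRT",
    }
    derived = set()
    for s, p, o in triples:
        if p in mapping and s in temporal and o in operations:
            derived.add((s, mapping[p], o))
    return derived


# Hypothetical facts for one patient record
facts = {
    (":aspirin", "rdf:type", "ptrec:TemporalData"),
    (":echo", "rdf:type", "ptrec:TemporalData"),
    (":cabg", "rdf:type", "ptrec:Operation"),
    (":aspirin", "time:intBefore", ":cabg"),
    (":echo", "time:intAfter", ":cabg"),
}
derived = apply_operative_rules(facts)
```

Here `derived` contains one preoperative and one postoperative statement, both keyed off `:cabg`.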
Imagine the value in using a backward-chaining prover (such as Euler) to logically demonstrate exactly why a specific medication (associated with the date when it was administered) is considered to be preoperative with respect to a qualifying procedure. This would complement the statistical analysis of a case study quite nicely with formal logical proof.
Now, it's worth noting that such a framework (as it currently stands) doesn't allow precision of interval relationships beyond simple intersection and overlap. For instance, in most cases you would be interested primarily in medication administered within a specific length of time. This doesn't really impact the above framework, since it is no more than a functional requirement to be able to perform calendar math. Imagine if the built-in properties of CWM were expanded to include functions for performing date math. For instance:
- time:addDT (adds an xs:duration to a date time)
With such a function we can expand our logical framework to include more explicit temporal relationships.
For example, if we only wanted medications administered within the 30 days prior to an operation to be considered 'preoperative':

    # time:addDT is the proposed (not yet existing) date math built-in
    {?a a ptrec:TemporalData; ptrec:startDT ?startTimeA; ptrec:stopDT ?stopTimeA.
     ?b a ptrec:Operation; ptrec:startDT ?opStartTime; ptrec:stopDT ?opStopTime.
     ?a time:intBefore ?b.
     (?opStartTime "-P30D") time:addDT ?preOpMin.
     ?stopTimeA str:greaterThan ?preOpMin}
      => {?a ptrec:preOperativeWRT ?b}.
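The calendar math that such a built-in would have to perform can be sketched in Python with the standard datetime module. This is only an illustration under assumptions: it reads the requirement as "the event ended before the operation started, and no more than 30 days before it", and the `is_preoperative` function and the sample dates are hypothetical.

```python
from datetime import datetime, timedelta


def is_preoperative(event_stop: str, op_start: str, window_days: int = 30) -> bool:
    """True when the event ended before the operation began, within the window.

    Both arguments are xs:dateTime-style strings without a time zone offset.
    """
    stop = datetime.fromisoformat(event_stop)
    anchor = datetime.fromisoformat(op_start)
    # Plays the role of ?preOpMin: the anchor minus a 30 day duration
    earliest = anchor - timedelta(days=window_days)
    return earliest < stop < anchor


# The same medication event evaluated against two different anchors
first_op = "2006-03-03T09:00:00"
second_op = "2006-02-01T09:00:00"
recent_med = "2006-02-20T10:00:00"
older_med = "2006-01-15T10:00:00"
```

The same function serves both case studies: `older_med` falls outside the window for `first_op` but inside it for `second_op`, because only the temporal anchor changes.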
It's worth noting that such an addition (to facilitate calendar math) would be quite useful as a general extension for RDF processors.
For the most part, I think a majority of the requirements needed for temporal reasoning (in any domain) can be accommodated by explicit modeling, because first-order predicate logic (FOPL, the foundation upon which RDF is built) was designed to be expressive enough to represent all human concepts.