Ticket #1050 (closed defect: fixed)
Garbage collection and other maintenance in Ambra's mulgara database after 0.9.1 deployment
| Reported by: | pradeep | Owned by: | russ |
|---|---|---|---|
| Priority: | critical | Milestone: | 0.9.2 |
| Component: | ambra | Version: | 0.9.1-SNAPSHOT |
| Keywords: | | Cc: | |
Description
Previous Ambra versions may have left orphaned statements in the database. These are:
- Replies that are 'inReplyTo' nonexistent 'Replies' or 'Annotations'.
- Replies that have a nonexistent 'root'
- Permission grants for nonexistent resources
- Permission revokes for nonexistent resources.
- Propagated permissions for nonexistent resources (both primary and secondary)
In addition, the following needs to be added:
- Missing rdf:type for replies as mentioned in r6604
Relevant Data Model Information
- Replies and what they are 'inReplyTo' are in the 'ri' named-graph.
- Permission grants are in the 'grants' named-graph.
- Permission revokes are in the 'revokes' named-graph.
- Permission propagation information is in the 'pp' named-graph.
- Statements in grants and revokes are in the form of: <resource> <permission> <principal>. So it is the <resource> that needs to be checked for existence.
- Statements in the propagated permissions are in the form of:
$primary <topaz:propagate-permissions-to> $secondary
Primer on 'exists' checking in iTQL
In iTQL, the way to search for the nonexistence of something is a graph difference (the minus operator):
e.g. select $s from <...> where $s $p $o minus $s <dc:title> $dontcare;
This returns every subject URI that has no dc:title. $s can then be constrained further to narrow the result set:
e.g. select $s from <...> where ($s $p $o minus $s <dc:title> $dontcare)
and $s <rdf:type> <topaz:Article>;
This returns the articles that have no dc:title.
Caution:
Always think in terms of tuples rather than RDF triples, even though the patterns themselves are triple patterns. For example, if you read ($s $p $o minus $s <dc:title> $dontcare) in terms of RDF statements, you would expect to see every statement that is not a <dc:title> statement, including the other statements of a subject that does have a dc:title. That would be wrong. If you think in terms of tuples, you read the query as 'all $s that satisfy the first set, minus all $s that satisfy the second set'.
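The tuple-based reading can be illustrated with a small Python model. This is only a sketch of the semantics described in the caution above, not Mulgara code; the triples and the variable names (`triples`, `titled`, `result`) are made up for illustration.

```python
# Toy model of the tuple-based reading of `minus` (not Mulgara code).
# Each triple is (subject, predicate, object); the data is invented.
triples = {
    ("article:1", "dc:title", "First"),
    ("article:1", "rdf:type", "topaz:Article"),
    ("article:2", "rdf:type", "topaz:Article"),
    ("article:2", "dc:creator", "someone"),
}

# Left operand `$s $p $o`: every row binds ($s, $p, $o).
left = set(triples)

# Right operand `$s <dc:title> $dontcare`: only the shared variable $s matters.
titled = {s for (s, p, o) in triples if p == "dc:title"}

# `minus` drops every left row whose $s also appears on the right,
# so ALL rows of article:1 go away, not just its dc:title row.
result = {row for row in left if row[0] not in titled}

subjects = sorted({s for (s, _, _) in result})
print(subjects)
```

In this model the rdf:type row of article:1 is removed along with its dc:title row, which is the behaviour the caution describes.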
Grants for nonexistent resources
select $s $permission $principal from <local:///topazproject#grants> where
$s $permission $principal and
($s $anyPermission $anyPrincipal in <local:///topazproject#grants>
minus (
$s $p $o in <local:///topazproject#ri> or
$s $p $o in <local:///topazproject#users> or
$s $p $o in <local:///topazproject#preferences> or
$s $p $o in <local:///topazproject#profiles> or
$s $p $o in <local:///topazproject#ratings> or
$s $p $o in <local:///topazproject#alerts> or
$s $p $o in <local:///topazproject#criteria>
)
);
Combining that with a deletion:
delete (select ...) from <local:///topazproject#grants>;
Revokes for nonexistent resources
Same as grants, except that the named-graph is <local:///topazproject#revokes>.
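Since the grants and revokes queries differ only in the named-graph, the query text can be generated rather than copied. The following is a minimal sketch; `orphan_select`, `orphan_delete`, `BASE` and `RESOURCE_GRAPHS` are hypothetical names, and the delete wrapper simply mirrors the `delete (select ...) from <graph>;` form given in this ticket without having been checked against a running Mulgara instance.

```python
# Hypothetical helpers that build the orphan-check query text for one
# permission graph. They only produce strings; nothing is sent to Mulgara.
BASE = "local:///topazproject#"

# Graphs this ticket treats as holding "existing" resources.
RESOURCE_GRAPHS = ["ri", "users", "preferences", "profiles",
                   "ratings", "alerts", "criteria"]


def orphan_select(graph: str) -> str:
    """Return the select (without trailing ';') for 'grants' or 'revokes'."""
    target = f"<{BASE}{graph}>"
    exists = " or\n".join(f"    $s $p $o in <{BASE}{g}>" for g in RESOURCE_GRAPHS)
    return (
        f"select $s $permission $principal from {target} where\n"
        f"  $s $permission $principal and\n"
        f"  ($s $anyPermission $anyPrincipal in {target}\n"
        f"  minus (\n{exists}\n  )\n"
        f"  )"
    )


def orphan_delete(graph: str) -> str:
    """Wrap the select in the ticket's 'delete (select ...) from <graph>;' form."""
    return f"delete ({orphan_select(graph)}) from <{BASE}{graph}>;"


print(orphan_select("revokes") + ";")
```

Running the select on its own and reviewing the result before issuing the delete is probably the safer order.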
Permission propagations to and from nonexistent resources
Similar to the grants, except that two queries are needed: one for nonexistent primary resources and one for nonexistent secondary resources:
select $s <topaz:propagate-permissions-to> $secondary
from <local:///topazproject#pp> where
$s <topaz:propagate-permissions-to> $secondary and
($s <topaz:propagate-permissions-to> $dontcare
in <local:///topazproject#pp>
minus (
$s $p $o in <local:///topazproject#ri> or
$s $p $o in <local:///topazproject#users> or
$s $p $o in <local:///topazproject#preferences> or
$s $p $o in <local:///topazproject#profiles> or
$s $p $o in <local:///topazproject#ratings> or
$s $p $o in <local:///topazproject#alerts> or
$s $p $o in <local:///topazproject#criteria>
)
);
select $primary <topaz:propagate-permissions-to> $s
from <local:///topazproject#pp> where
$primary <topaz:propagate-permissions-to> $s and
($dontcare <topaz:propagate-permissions-to> $s
in <local:///topazproject#pp>
minus (
$s $p $o in <local:///topazproject#ri> or
$s $p $o in <local:///topazproject#users> or
$s $p $o in <local:///topazproject#preferences> or
$s $p $o in <local:///topazproject#profiles> or
$s $p $o in <local:///topazproject#ratings> or
$s $p $o in <local:///topazproject#alerts> or
$s $p $o in <local:///topazproject#criteria>
)
);
Nonexistent inReplyTo and root in replies
select $reply $prop $val from <local:///topazproject#ri> where
$reply $prop $val and
($reply <http://www.w3.org/2001/03/thread#inReplyTo> $s minus $s $p $o);
select $reply $prop $val from <local:///topazproject#ri> where
$reply $prop $val and
($reply <http://www.w3.org/2001/03/thread#root> $s minus $s $p $o);
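What the two reply queries are expected to select can be sketched with a toy Python model. This is not Mulgara code; the sample URIs and the `dangling` helper are invented for illustration, and "exists" is modelled, as in the queries, as "appears as a subject in the same graph".

```python
# Toy model of the reply checks (not Mulgara code); sample URIs are made up.
IN_REPLY_TO = "http://www.w3.org/2001/03/thread#inReplyTo"
ROOT = "http://www.w3.org/2001/03/thread#root"

ri = {
    ("annotation:1", "rdf:type", "Annotation"),
    ("reply:1", IN_REPLY_TO, "annotation:1"),
    ("reply:1", ROOT, "annotation:1"),
    ("reply:2", IN_REPLY_TO, "annotation:gone"),
    ("reply:2", ROOT, "annotation:gone"),
}


def dangling(triples, predicate):
    """Return all statements of replies whose `predicate` target has no statements."""
    existing = {s for (s, _, _) in triples}
    orphans = {s for (s, p, o) in triples
               if p == predicate and o not in existing}
    # Like `select $reply $prop $val`, return every statement of each orphan.
    return {t for t in triples if t[0] in orphans}


print(sorted(dangling(ri, IN_REPLY_TO)))
```

Here only the statements of reply:2 are selected, because annotation:gone never appears as a subject.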
Adding missing reply-type (from r6604)
insert select $s <rdf:type> <http://www.w3.org/2001/12/replyType#Comment>
from <local:///topazproject#ri>
where $s <rdf:type> <http://www.w3.org/2001/03/thread#Reply>
into <local:///topazproject#ri>;
