Ticket #1050 (closed defect: fixed)

Opened 23 months ago

Last modified 19 months ago

Garbage collection and other maintenance in Ambra's mulgara database after 0.9.1 deployment

Reported by: pradeep
Owned by: russ
Priority: critical
Milestone: 0.9.2
Component: ambra
Version: 0.9.1-SNAPSHOT
Keywords:
Cc:

Description

Previous Ambra versions may have left orphaned statements in the database. These are:

  • Replies that are 'inReplyTo' nonexistent 'Replies' or 'Annotations'
  • Replies that have a nonexistent 'root'
  • Permission grants for nonexistent resources
  • Permission revokes for nonexistent resources
  • Propagated permissions for nonexistent resources (both primary and secondary)

In addition, the following needs to be added:

  • Missing rdf:type for replies as mentioned in r6604

Relevant Data Model Information

  • Replies and what they are 'inReplyTo' are in the 'ri' named-graph.
  • Permission grants are in the 'grants' named-graph.
  • Permission revokes are in the 'revokes' named-graph.
  • Permission propagation information is in the 'pp' named-graph.
  • Statements in grants and revokes have the form <resource> <permission> <principal>, so it is the <resource> that needs to be checked for existence.
  • Statements in the propagated permissions have the form:

$primary <topaz:propagate-permissions-to> $secondary

Primer on 'exists' checking in iTQL

In iTQL, the way to search for the nonexistence of something is to take a graph difference (the minus operator):

e.g. select $s from <...> where $s $p $o minus $s <dc:title> $dontcare;

This returns every subject URI that has no dc:title. $s can then be constrained further to narrow the result set:

e.g. select $s from <...> where ($s $p $o minus $s <dc:title> $dontcare)
         and $s <rdf:type> <topaz:Article>;

This returns the list of articles with a missing dc:title.

Caution:

Always think in terms of tuples rather than RDF triples, even though the patterns themselves are triple patterns. For example, if you read ($s $p $o minus $s <dc:title> $dontcare) in terms of RDF statements, you would expect to get all statements that are not <dc:title> statements, including the other statements about a subject that does have a dc:title. That would be wrong. Read in terms of tuples, the query means 'all $s that satisfy the first set, minus all $s that satisfy the second set'.
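
The difference between the two readings can be shown with a small Python sketch. It is only a toy model of the semantics described above, not how Mulgara evaluates the query; the triples and URIs are invented for illustration, and the left-hand side is projected onto $s for brevity.

```python
# Toy model of the tuple semantics of 'minus'; all data below is made up.
triples = {
    ("art:1", "rdf:type", "topaz:Article"),
    ("art:1", "dc:title", "First article"),
    ("art:2", "rdf:type", "topaz:Article"),
    ("art:2", "dc:creator", "someone"),
}

# Left side: every $s bound by the pattern $s $p $o.
left = {s for (s, p, o) in triples}
# Right side: every $s bound by the pattern $s <dc:title> $dontcare.
right = {s for (s, p, o) in triples if p == "dc:title"}

# Tuple reading (correct): subtract on the shared variable $s.
missing_title = left - right

# Triple reading (wrong): only the dc:title statements are dropped,
# so art:1 survives through its rdf:type statement.
not_title_statements = {t for t in triples if t[1] != "dc:title"}
wrong_subjects = {s for (s, p, o) in not_title_statements}

print(sorted(missing_title))   # ['art:2']
print(sorted(wrong_subjects))  # ['art:1', 'art:2']
```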

Grants for nonexistent resources

   select $s $permission $principal from <local:///topazproject#grants> where
     $s $permission $principal and
     ($s $anyPermission $anyPrincipal in <local:///topazproject#grants>
       minus (
         $s $p $o in <local:///topazproject#ri> or
         $s $p $o in <local:///topazproject#users> or
         $s $p $o in <local:///topazproject#preferences> or
         $s $p $o in <local:///topazproject#profiles> or
         $s $p $o in <local:///topazproject#ratings> or
         $s $p $o in <local:///topazproject#alerts> or
         $s $p $o in <local:///topazproject#criteria>
       )
     );

Combining that with a deletion:

  delete (select ...) from <local:///topazproject#grants>;

Revokes for nonexistent resources

Same as grants, except that the named-graph is <local:///topazproject#revokes>.
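
Since the grants and revokes queries differ only in the graph name, one way to avoid copy-and-paste mistakes is to generate both from a template. The Python helper below is a hypothetical convenience, not part of Ambra or Mulgara; the function name is made up, and the list of resource graphs is copied from the grants query above.

```python
# Hypothetical helper (not part of Ambra): builds the orphan-statement
# query text for a permission graph such as 'grants' or 'revokes'.
BASE = "local:///topazproject#"
RESOURCE_GRAPHS = ["ri", "users", "preferences", "profiles",
                   "ratings", "alerts", "criteria"]

def orphan_permission_query(graph_name):
    graph = "<%s%s>" % (BASE, graph_name)
    # One existence pattern per resource graph, joined with 'or'.
    exists = " or\n".join(
        "         $s $p $o in <%s%s>" % (BASE, g) for g in RESOURCE_GRAPHS
    )
    return (
        "select $s $permission $principal from %s where\n"
        "  $s $permission $principal and\n"
        "  ($s $anyPermission $anyPrincipal in %s\n"
        "    minus (\n%s\n    )\n  );" % (graph, graph, exists)
    )

grants_query = orphan_permission_query("grants")
revokes_query = orphan_permission_query("revokes")
print(revokes_query)
```

The generated text should still be reviewed before it is run against a live database.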

Permission propagations to and from nonexistent resources

Similar to the grants, except that you need to perform two queries, one for the primary and one for the secondary:

   select $s <topaz:propagate-permissions-to> $secondary 
      from <local:///topazproject#pp> where
     $s <topaz:propagate-permissions-to> $secondary and
     ($s <topaz:propagate-permissions-to> $dontcare 
         in <local:///topazproject#pp>
       minus (
         $s $p $o in <local:///topazproject#ri> or
         $s $p $o in <local:///topazproject#users> or
         $s $p $o in <local:///topazproject#preferences> or
         $s $p $o in <local:///topazproject#profiles> or
         $s $p $o in <local:///topazproject#ratings> or
         $s $p $o in <local:///topazproject#alerts> or
         $s $p $o in <local:///topazproject#criteria>
       )
     );

   select $primary <topaz:propagate-permissions-to> $s 
          from <local:///topazproject#pp> where
     $primary <topaz:propagate-permissions-to> $s and
     ($dontcare <topaz:propagate-permissions-to> $s 
           in <local:///topazproject#pp>
       minus (
         $s $p $o in <local:///topazproject#ri> or
         $s $p $o in <local:///topazproject#users> or
         $s $p $o in <local:///topazproject#preferences> or
         $s $p $o in <local:///topazproject#profiles> or
         $s $p $o in <local:///topazproject#ratings> or
         $s $p $o in <local:///topazproject#alerts> or
         $s $p $o in <local:///topazproject#criteria>
       )
     );
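
Why two queries are needed can be seen in a toy in-memory model: a propagation statement is orphaned if either its primary (subject) or its secondary (object) has no statements in any of the resource graphs. The Python sketch below uses invented graph contents and URIs and only mirrors the logic of the two queries; it does not talk to Mulgara.

```python
# Toy in-memory model; graph contents and URIs are invented for illustration.
PROP = "topaz:propagate-permissions-to"
graphs = {
    "ri": {("art:1", "rdf:type", "topaz:Article")},
    "users": {("user:7", "rdf:type", "foaf:OnlineAccount")},
    "pp": {
        ("art:1", PROP, "user:7"),     # both ends exist
        ("art:gone", PROP, "user:7"),  # orphaned primary
        ("art:1", PROP, "obj:gone"),   # orphaned secondary
    },
}
RESOURCE_GRAPHS = ["ri", "users", "preferences", "profiles",
                   "ratings", "alerts", "criteria"]

# A resource "exists" if it is the subject of any statement in any resource
# graph, which corresponds to the '$s $p $o in <graph>' patterns.
existing = {s for name in RESOURCE_GRAPHS
            for (s, p, o) in graphs.get(name, set())}

# First query: propagations whose primary no longer exists.
orphaned_primary = {t for t in graphs["pp"] if t[0] not in existing}
# Second query: propagations whose secondary no longer exists.
orphaned_secondary = {t for t in graphs["pp"] if t[2] not in existing}

print(sorted(orphaned_primary))
print(sorted(orphaned_secondary))
```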

Nonexistent inReplyTo and root in replies

  select $reply $prop $val from <local:///topazproject#ri> where
    $reply $prop $val and
    ($reply <http://www.w3.org/2001/03/thread#inReplyTo> $s minus $s $p $o);

  select $reply $prop $val from <local:///topazproject#ri> where
    $reply $prop $val and
    ($reply <http://www.w3.org/2001/03/thread#root> $s minus $s $p $o);

Adding missing reply-type (from r6604)

insert select $s <rdf:type> <http://www.w3.org/2001/12/replyType#Comment>
            from <local:///topazproject#ri> 
       where $s <rdf:type> <http://www.w3.org/2001/03/thread#Reply>
into <local:///topazproject#ri>;

Change History

Changed 23 months ago by amit

  • priority changed from high to critical
  • milestone set to 0.9.1

Russ, assigning to you so you are aware and can take appropriate steps.

Changed 20 months ago by russ

pradeep notes that the two reply-related tasks should be performed after the upgrade to 0.9.1.

Changed 19 months ago by amit

  • milestone changed from 0.9.1 to 0.9.2

Moving it out so I can close the milestone. Data related anyway.

Changed 19 months ago by russ

  • status changed from new to closed
  • resolution set to fixed

completed.
