Project

General

Profile

Actions

The current indexing uses the following items:

  • Subject >> S
  • Predicate >> P
  • Object >> O
  • Meta >> M

There are 6 ordered indexes. Together these indexes make one of the possible sets of indexes which allows any group of statements to be found as a contiguous group within one of the constituent indexes. There are 24 such sets (see discussion on Paul's blog on August 4 and August 5, 2004). The set of indexes chosen for Mulgara is:

  • SPOM
  • POSM
  • OSPM
  • MSPO
  • MPOS
  • MOSP
These indexes have the following properties:
  • All elements are treated symmetrically, with no bias reflecting usage patterns.
  • Any specification for S, P, O and M can be found as a contiguous group within one index. The specified elements are first grouped, and then the corresponding index whose first elements match that group is used.
  • They are all tree indexes, with ordering according to the letters labelling them.
  • Contiguous groups can always be found with a pair of O(log(N)) searches, where N is the size of the data set.
  • Any discovered group of statements will be in only one of the 24 possible orderings, where that ordering is specified by the given index.
  • Counting the size of a group may have O(log(N)) complexity (current implementation is linear).
e.g. SPOM is ordered first by subject, then by predicate, object and finally meta.
In addition to these properties, we have 2 additional properties based on the choice of AVL trees for the tree structure:
  • Additions to a known point in the tree takes place in constant time.
  • Removals from a known point in the tree takes place in O(log(N)) time.

Updated by Paula Gearon about 16 years ago ยท 2 revisions