Resolver Database Class

The resolver database class is highly configurable, allowing you to set up Mulgara to suit your usage requirements. By passing different classes to the constructor of the database class you can set it up as a:

  • Heavyweight store that uses disk input and output as its primary storage, making it persistent across executions of the server
  • Lightweight, memory-based store that is faster but is limited by available memory and has no persistence

There are five configurable parts for the database with respect to its operation:
1. Persistent Node Pool
2. Persistent String Pool
3. Temporary Node Pool
4. Temporary String Pool
5. System Resolver Factory

The Persistent String and Node Pools maintain the mappings from node IDs to string representations for all current models, both system and external.

The Temporary Node and Temporary String Pools are used for storing temporary nodes during a query.

To ensure proper functionality, the System Resolver Factory determines which resolver manages the system model information. It is recommended that you use internal resolvers for the System Resolver Factory because of their stability, but external resolvers can also be used. By default, the Persistent Node and Persistent String Pools use disk-based storage, while the Temporary Node and Temporary String Pools use memory-based storage.
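The five configurable parts can be pictured as a small configuration object. The following is a minimal sketch only: DatabaseConfigSketch, its Storage enum and the factory name passed in are hypothetical names made up for illustration and are not Mulgara's actual database constructor or classes.

```java
import java.util.Objects;

/** Hypothetical sketch of the five configurable parts; not Mulgara's real API. */
final class DatabaseConfigSketch {
    /** Where a pool keeps its data. */
    enum Storage { DISK, MEMORY }

    final Storage persistentNodePool;
    final Storage persistentStringPool;
    final Storage temporaryNodePool;
    final Storage temporaryStringPool;
    final String systemResolverFactory; // hypothetical factory class name

    DatabaseConfigSketch(Storage persistentNodePool, Storage persistentStringPool,
                         Storage temporaryNodePool, Storage temporaryStringPool,
                         String systemResolverFactory) {
        this.persistentNodePool = Objects.requireNonNull(persistentNodePool);
        this.persistentStringPool = Objects.requireNonNull(persistentStringPool);
        this.temporaryNodePool = Objects.requireNonNull(temporaryNodePool);
        this.temporaryStringPool = Objects.requireNonNull(temporaryStringPool);
        this.systemResolverFactory = Objects.requireNonNull(systemResolverFactory);
    }

    /** Defaults described in the text: persistent pools on disk, temporary pools in memory. */
    static DatabaseConfigSketch heavyweight(String systemResolverFactory) {
        return new DatabaseConfigSketch(Storage.DISK, Storage.DISK,
                Storage.MEMORY, Storage.MEMORY, systemResolverFactory);
    }

    /** Lightweight variant: every pool in memory, so nothing survives a restart. */
    static DatabaseConfigSketch lightweight(String systemResolverFactory) {
        return new DatabaseConfigSketch(Storage.MEMORY, Storage.MEMORY,
                Storage.MEMORY, Storage.MEMORY, systemResolverFactory);
    }

    /** True when both persistent pools are backed by disk. */
    boolean isPersistent() {
        return persistentNodePool == Storage.DISK && persistentStringPool == Storage.DISK;
    }
}
```

In this sketch, DatabaseConfigSketch.heavyweight(...) corresponds to the heavyweight store and DatabaseConfigSketch.lightweight(...) to the memory-based store described at the top of this page.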

Node Pools

Node Pools and Data Pools work closely together to manage the data contained in the system model. The Node Pool is the more fundamental of the two: it contains the numerical representations of nodes within the graph structure of the models, and the Data Pool depends upon it. Each node in a model's graph is assigned a unique numerical representation that is stored and handled within the Node Pool. All nodes are local to the system model; that is, they only have scope within the system model and are meaningless in any other server.

Data Pools

Working as a complement to the Node Pool, the Data Pool maps node numbers to their data counterparts, such as URIs, strings, or other literal types. Since all data is global (that is, it holds the same meaning wherever it is used) while node numbers are local, localization is required before the node's value can be retrieved. The Data Pool was previously called the String Pool, and a lot of documentation still refers to it by this name.

The most recent version of the transaction store (XA 1.1) has now merged the Node and Data Pools into the same class. This class is accessed using the StringPool and NodePool interfaces, depending on the required usage.
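The relationship between the two pools, and the idea of one class serving both roles, can be sketched as follows. This is an illustration under stated assumptions: NodePoolSketch, DataPoolSketch, MergedPoolSketch and their method names are hypothetical, in-memory stand-ins, and Mulgara's real StringPool and NodePool interfaces have different signatures.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified node pool: hands out local node numbers. */
interface NodePoolSketch {
    long newNode();
}

/** Hypothetical, simplified data pool: maps between global values and local nodes. */
interface DataPoolSketch {
    long localize(String globalValue); // global value -> local node number
    String globalize(long node);       // local node number -> global value, or null
}

/** One class that serves both roles, accessed through either interface. */
final class MergedPoolSketch implements NodePoolSketch, DataPoolSketch {
    private final Map<String, Long> valueToNode = new HashMap<>();
    private final Map<Long, String> nodeToValue = new HashMap<>();
    private long nextNode = 1;

    @Override
    public long newNode() {
        return nextNode++;
    }

    @Override
    public long localize(String globalValue) {
        Long existing = valueToNode.get(globalValue);
        if (existing != null) {
            return existing; // the same value always maps to the same node
        }
        long node = newNode();
        valueToNode.put(globalValue, node);
        nodeToValue.put(node, globalValue);
        return node;
    }

    @Override
    public String globalize(long node) {
        return nodeToValue.get(node); // null when the node has no data attached
    }
}
```

The node numbers returned here only make sense inside this one pool instance, which mirrors the point that nodes are meaningless on any other server.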

Temporary Pools

Temporary Pools combine a Node Pool and a Data Pool to maintain nodes that are used during queries but are not part of the store. When a query is executed, the Persistent Pools are consulted and, if the node cannot be found, the Temporary Pool creates a new entry. This prevents the creation of nodes in the store that are not part of the graph. Temporary nodes are given IDs that do not overlap with the existing store node IDs, and the Temporary Pools are consulted before the Persistent Pools. After a query is executed, all temporary nodes are deleted.
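The lookup order described above can be sketched in a few lines. This is a hypothetical illustration, not Mulgara code: QueryPoolsSketch is an invented name, plain maps stand in for the pools, and the use of negative numbers for temporary IDs is an assumption chosen only to show one way of keeping the two ID ranges from overlapping.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a query's view over persistent and temporary pools. */
final class QueryPoolsSketch {
    private final Map<String, Long> persistent; // store nodes, positive IDs (assumption)
    private final Map<String, Long> temporary = new HashMap<>();
    private long nextTemporary = -1; // negative IDs cannot collide with store IDs (assumption)

    QueryPoolsSketch(Map<String, Long> persistent) {
        this.persistent = persistent;
    }

    /** Finds the node for a value, creating a temporary node if the store has none. */
    long localize(String value) {
        Long node = temporary.get(value); // temporary pool is consulted first
        if (node != null) {
            return node;
        }
        node = persistent.get(value);     // then the persistent pool
        if (node != null) {
            return node;
        }
        long created = nextTemporary--;   // unknown value: temporary entry only
        temporary.put(value, created);
        return created;
    }

    /** Discards every temporary node once the query has finished. */
    void endQuery() {
        temporary.clear();
        nextTemporary = -1;
    }

    int temporaryCount() {
        return temporary.size();
    }
}
```

Note that localize never writes to the persistent map, so a query that mentions an unknown value leaves the store untouched.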

Persistent Pools

All data store graph nodes are stored in the Persistent Pools, allowing quick and easy processing of queries. Unlike Temporary Pools, they are permanent and remain after the query is executed. Generally, only the nodes handled by models in the internal resolvers are stored in the Persistent Pools, with the external resolvers' nodes being handled by the resolver itself.

System Resolver Factory

The System Resolver Factory is a title rather than an actual functioning part of any database instance. It defines the resolver factory used to create a resolver that manages the system model.

When deciding which resolver to use, note that the resolver is in charge of adding, removing and modifying models and must support this functionality. The other factor to consider is that the resolver's store should reflect the usage of Mulgara. That is, for a persistent system model, use a persistent resolver, and for a temporary system model, use a temporary resolver.

Internal Resolvers

An Internal Resolver operates on a data store internal to a Mulgara server and therefore has its own model within the system model. Usually, internal resolvers are used alongside data stores with RDF-ready information, which requires little or no conversion. It is possible, however, to set up an internal relational database resolver or similar.

Internally resolved models are the most stable as they are controlled by Mulgara. The data is guaranteed at all times and is always accessible for the life of the server. Internal resolvers use the Persistent Node and Data Pools and therefore require no translation of the nodes in the data store's graphs.

External Resolvers

An External Resolver has its model outside the scope of the system model and outside the control of Mulgara. External resolvers are useful for data stores that are not in an RDF-ready format and require some processing before results can be returned. They are most often used for resolving files of various formats, but can also be useful for connecting to a relational database and converting results to RDF on the fly, or for reading from an unknown source's stream.

Using external resolvers carries a risk because the data being queried is not controlled by Mulgara and there is no guarantee that the model is present. External factors, such as other users, servers, or security protocols, may alter or remove the files or stores being resolved, contaminating the results or causing errors in them.

Since external resolvers operate outside the scope of the Mulgara server, they are responsible for managing their own Node and String Pools as well as translating them across to the Mulgara pools during a query. This also applies to blank nodes, whose values should be maintained across resolutions of the same file; otherwise the results might become unpredictable.
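One way an external resolver might keep blank nodes stable across resolutions of the same file is to remember, per file, which node number each blank node label was given. The sketch below illustrates that idea only: BlankNodeMapSketch and its allocator parameter are hypothetical and are not taken from any Mulgara resolver.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical per-file memory of blank node assignments for an external resolver. */
final class BlankNodeMapSketch {
    // file URL -> (blank node label within that file -> allocated node number)
    private final Map<String, Map<String, Long>> byFile = new HashMap<>();
    private final LongSupplier allocator; // stands in for the server's node allocation

    BlankNodeMapSketch(LongSupplier allocator) {
        this.allocator = allocator;
    }

    /** Returns the same node for the same label in the same file on every call. */
    long nodeFor(String fileUrl, String blankLabel) {
        return byFile
                .computeIfAbsent(fileUrl, f -> new HashMap<>())
                .computeIfAbsent(blankLabel, l -> allocator.getAsLong());
    }
}
```

Keying the inner map by file keeps a label such as _:b0 in one file distinct from the same label in another file, since blank node labels only have meaning within the document that contains them.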

Once a query's model is determined to be external to the system model, the protocol is checked to determine how the resolver should connect to the resource. After setting up a connection, the actual URL's type is determined and the appropriate resolver is selected to resolve the constraints.
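The two-step selection described above (protocol first, then the type of the resource) can be sketched as below. This is an illustration under assumptions: ResolverSelectionSketch and the resolver names in its table are hypothetical, and using the file extension to decide the type is a simplification chosen for the example rather than a description of how Mulgara detects content types.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of choosing a resolver for an external model URL. */
final class ResolverSelectionSketch {
    // Hypothetical resolver names keyed by file extension (assumption).
    private static final Map<String, String> BY_EXTENSION = Map.of(
            "rdf", "RdfXmlResolverSketch",
            "mbox", "MboxResolverSketch");

    /** Step 1: the protocol decides how to connect to the resource. */
    static String protocolOf(URI model) {
        String scheme = model.getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException("model URI has no protocol: " + model);
        }
        return scheme.toLowerCase(Locale.ROOT);
    }

    /** Step 2: the type of the resource decides which resolver handles it. */
    static Optional<String> resolverFor(URI model) {
        String path = model.getPath(); // null for opaque URIs such as mailto:
        if (path == null) {
            return Optional.empty();
        }
        int dot = path.lastIndexOf('.');
        if (dot < 0 || dot == path.length() - 1) {
            return Optional.empty();
        }
        String extension = path.substring(dot + 1).toLowerCase(Locale.ROOT);
        return Optional.ofNullable(BY_EXTENSION.get(extension));
    }
}
```

An empty result from resolverFor stands for the case where no registered resolver can handle the resource, which the caller would need to report as an error.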

Updated by Paula Gearon about 15 years ago · 2 revisions