Transaction Architecture

Discussion

Existing Architecture

Mulgara's transactions are currently handled on 3 levels:

  • User demarcation
    setAutocommit() - allows user to take/release the write-lock. Currently integrated with...
  • Session demarcation
    Manages access to the current transaction. See begin/endTransactionalBlock(), suspend/resumeTransactionalBlock(), start/finishTransactionalOperation() in DatabaseSession.
  • Store phase management
    Manages the store's phase-tree. See refresh(), prepare(), commit(), rollback() and release() on StringPoolSession and XAStringPoolImpl, XANodePoolImpl, and XAStatementStoreImpl.

The current transaction is maintained in the variable DatabaseSession::transaction. If transaction is null, then there is no existing transaction. Consequently, when an operation requires starting a new transaction (i.e. another query) while one already exists, the previous transaction is ended. Any associated Answers are supposed to be closed, but the call to close() in endPreviousQueryTransaction() has been commented out to allow server-side JRDF to continue to work; see http://mulgara.org/jira/browse/MGR-19. [Update to new Trac item]

New Architecture

To fix this we need to extend the architecture to 4 levels:

  • User demarcation
  • Per-operation demarcation
    Every new operation should result in a new transactional context. The only times this should fail would be
    1. Multiple write operations with timeout.
    2. On failed write transaction.
  • Per-answer demarcation
    Every interaction with Mulgara is via either an Answer or a Session. The latter is an operation level concern. The former should have an existing transaction associated with it. Every call to an Answer in Mulgara must first ensure the transaction is resumed if necessary, and suspended before it returns to the user.
  • Store phase management
    Key things to watch out for here:
    • SS/SP/NP phase synchronisation
    • Micro-phases
    • Interaction between rollbacks and reads from the write-phase.

Design

Goals

  • DatabaseSession should contain no transactional logic. Just trivial calls to the DatabaseTransactionManager.
  • SubqueryAnswer should have no transactional logic at all.
  • DatabaseOperationContext should delegate resolver enlisting to the DatabaseTransactionContext.

Proposed Classes

DatabaseTransactionManager

Singleton, therefore attached to Database.

Responsibilities
  • Maintains association between Answers and TransactionContexts.
  • Manages tracking the ownership of the write-lock.
  • Maintains the write-queue and any timeout algorithm desired.
  • Provides new/existing TransactionContexts to DatabaseSession on request.
    Note: Returns new context unless Session is currently in a User Demarcated Transaction.

DatabaseTransactionContext

Responsible for the javax.transaction.Transaction object.

Responsibilities
  • Ensuring every begin or resume is followed by either a suspend or an end.
  • Ensuring every suspend or end is preceded by either a begin or a resume.
  • In conjunction with TransactionalAnswer ensuring that
    • all calls to operations on SubqueryAnswer are preceded by a successful resume.
    • all calls to operations on SubqueryAnswer conclude with a suspend as the last call prior to returning to the user.
  • Collaborates with DatabaseTransactionManager to determine when to end the transaction.
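
The pairing rules above amount to a small state machine. The following is a minimal sketch of that idea only; the class ContextStateSketch and its State enum are hypothetical names, and the real javax.transaction.Transaction object is left out entirely.

```java
// Hypothetical sketch: not Mulgara's actual DatabaseTransactionContext.
final class ContextStateSketch {
    enum State { NEW, ACTIVE, SUSPENDED, ENDED }

    private State state = State.NEW;

    State state() { return state; }

    // begin and resume are the only ways to become ACTIVE;
    // suspend and end are only legal while ACTIVE.
    void begin()   { transition(State.NEW, State.ACTIVE, "begin"); }
    void suspend() { transition(State.ACTIVE, State.SUSPENDED, "suspend"); }
    void resume()  { transition(State.SUSPENDED, State.ACTIVE, "resume"); }
    void end()     { transition(State.ACTIVE, State.ENDED, "end"); }

    private void transition(State required, State next, String op) {
        if (state != required) {
            throw new IllegalStateException(
                op + " called in state " + state + "; expected " + required);
        }
        state = next;
    }
}
```

Rejecting an out-of-order call with an exception is one possible choice; logging the violation would serve equally well during development.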

TransactionalAnswer

A decorator implementing Answer, wrapping SubqueryAnswer. Effectively an aspect on SubqueryAnswer.

  • Ensures every call from the user is wrapped by calls to the correct DatabaseTransactionContext for resume and suspend/end.
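
As an illustration of the decorator, the sketch below wraps each user call in resume and suspend, and ends the transaction on close. AnswerSketch and TxContextSketch are hypothetical, heavily reduced stand-ins for Answer and DatabaseTransactionContext, and the sketch assumes the simple case of one Answer per transaction (not user demarcated).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; the real Answer interface has many more methods.
interface AnswerSketch {
    boolean next();
    void close();
}

interface TxContextSketch {
    void resume();
    void suspend();
    void end();
}

// Decorator: every user call is bracketed by resume and suspend (or end on close).
final class TransactionalAnswerSketch implements AnswerSketch {
    private final AnswerSketch delegate;
    private final TxContextSketch context;

    TransactionalAnswerSketch(AnswerSketch delegate, TxContextSketch context) {
        this.delegate = delegate;
        this.context = context;
    }

    @Override
    public boolean next() {
        context.resume();
        try {
            return delegate.next();
        } finally {
            context.suspend();   // always the last call before returning to the user
        }
    }

    @Override
    public void close() {
        context.resume();
        try {
            delegate.close();
        } finally {
            context.end();       // assumption: this Answer is the transaction's only user
        }
    }
}

// Records the order of calls so the bracketing can be observed.
final class RecordingContext implements TxContextSketch {
    final List<String> log = new ArrayList<>();
    @Override public void resume()  { log.add("resume"); }
    @Override public void suspend() { log.add("suspend"); }
    @Override public void end()     { log.add("end"); }
}
```

The try/finally blocks keep the suspend or end paired with the resume even when the wrapped call throws.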

Relationships

Arities
  • Each Thread is associated with exactly ZERO or ONE Transaction. This is a JTA requirement and non-negotiable.
  • Each Answer is associated with a single Transaction.
  • Each Transaction is associated with a single Answer unless it is user demarcated.
  • Each Session is either associated with a single user-demarcated transaction, or it is not associated with a transaction at all.
  • Each Answer is associated with a single Session.
  • Each Session is associated with multiple Answers.
  • In the case of Read-Only transactions
    • Each Answer always has a valid transaction available unless there has been a system-error.
    • Each Transaction is only closed when the last Answer associated with it is closed.
  • In the case of Read-Write transactions
    • The Transaction is always closed upon a call to commit() or rollback(). This immediately invalidates all Answers associated with it. To change this will require changes to the store-layer to allow phase-promotion.
    • The Transaction is only closed as a result of a commit() or a rollback(). This can be explicit, in the form of a user demarcated transaction; or implicit, in the form of an insert/delete/load/etc operation while autocommit=true.
  • At most one DatabaseSession can be associated with a transaction between calls - this implies holding the writelock.

Behaviours

  • On entry to the Session, DatabaseSession needs to make a (re-entrant) call to obtain a transaction.
    • If the Session holds the write lock, return the current Write-Transaction.
    • If the Session does not hold the write lock and requests a read-only transaction, create a new ro-transaction object and return it.
    • If the Session does not hold the write lock and requests a read-write transaction, obtain the write-lock, create a new transaction object and return it.
  • Any result from this transaction needs to be associated with the transaction.
  • When that result is closed, the transaction needs to be informed.
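
The three cases for obtaining a transaction can be sketched as follows. TxManagerSketch, Tx and obtain() are hypothetical names used for illustration. The sketch throws when another session holds the write lock, whereas the design calls for a write-queue with a timeout, so treat that part as a placeholder.

```java
// Hypothetical sketch of the three cases; not Mulgara's actual manager.
final class TxManagerSketch {
    static final class Tx {
        final boolean writable;
        Tx(boolean writable) { this.writable = writable; }
    }

    private Object writeLockOwner;   // session currently holding the write lock, or null
    private Tx writeTransaction;     // the single write transaction, if any

    synchronized Tx obtain(Object session, boolean write) {
        if (writeLockOwner == session) {
            return writeTransaction;            // case 1: session holds the write lock
        }
        if (!write) {
            return new Tx(false);               // case 2: fresh read-only transaction
        }
        if (writeLockOwner != null) {
            // Placeholder: a real manager would queue the writer instead.
            throw new IllegalStateException("write lock held by another session");
        }
        writeLockOwner = session;               // case 3: take the lock, start a write transaction
        writeTransaction = new Tx(true);
        return writeTransaction;
    }

    synchronized void releaseWriteLock(Object session) {
        if (writeLockOwner == session) {
            writeLockOwner = null;
            writeTransaction = null;
        }
    }
}
```
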
Subqueries

Watch out for subqueries. Each subquery must be performed within the same Transactional Context as the outer query. It is an ACID violation to use a different phase to the outer query to evaluate the inner query. DatabaseSession should therefore assume within any internal method that the current thread is correctly associated with a transaction.

Autocommit

Behaviour of transactions with respect to setAutoCommit needs a disciplined case analysis to understand.

  • The write lock must guarantee that at most one Session has autocommit off at any point in time.

Therefore, when a request is made to set autocommit, there are 5 cases of interest:

==================  ==============  ==================  ======================================================================
Session.autocommit  autocommit req  transaction status  Result
==================  ==============  ==================  ======================================================================
ON                  ON              --                  We ignore this case
ON                  OFF             --                  Obtain writelock, and attach transaction to session
OFF                 ON              ACTIVE              Attempt to commit transaction - finalise
OFF                 ON              FAILED              Transaction is already finalised. Just throw the failure-cause
OFF                 OFF             --                  This is most likely a programmer error, current behaviour is to ignore
==================  ==============  ==================  ======================================================================
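
The case analysis can also be written as a pure decision function, which makes each row easy to check in isolation. This is a sketch only: AutocommitCases, TxStatus and Action are hypothetical names, and the exception thrown when autocommit is off but no transaction is attached is an assumption, since the table does not cover that state.

```java
// Hypothetical sketch of the five autocommit cases.
final class AutocommitCases {
    enum TxStatus { NONE, ACTIVE, FAILED }
    enum Action { IGNORE, OBTAIN_WRITE_LOCK, COMMIT, THROW_FAILURE_CAUSE }

    static Action decide(boolean sessionAutocommit, boolean requested, TxStatus status) {
        if (sessionAutocommit) {
            // ON -> ON is ignored; ON -> OFF takes the write lock and attaches a transaction.
            return requested ? Action.IGNORE : Action.OBTAIN_WRITE_LOCK;
        }
        if (!requested) {
            return Action.IGNORE;   // OFF -> OFF: probable programmer error, currently ignored
        }
        switch (status) {
            case ACTIVE:
                return Action.COMMIT;               // OFF -> ON with a live transaction
            case FAILED:
                return Action.THROW_FAILURE_CAUSE;  // already finalised; surface the cause
            default:
                // Assumption: not covered by the table.
                throw new IllegalStateException("autocommit is off but no transaction is attached");
        }
    }
}
```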

Bootstrapping

The database routinely needs to insert data into the system resolver on startup - this is prior to any client interaction. This complicates the transaction architecture as this exceptional case occurs outside the standard flow of control. There needs to be a separate 'bootstrapping' interface to the transaction control logic to allow the database to do this.

OperationContext

There are various other resources that share a lifetime with the transaction (caches and the like). These need to be released at the end of the transaction. They are associated with an 'OperationContext', which should register an XAResource (in a similar manner to the various resolvers that will be enlisted in the javax.transaction.Transaction) to receive notification of the transaction finalisation.

Finalize

Transactional objects should implement finalize(). Naturally this cannot be relied upon to ensure semantics - but it should check that any transactional object collected by the gc has in fact completed its transaction and is properly cleaned up. This is an excellent way to keep an eye out for semantics violations.

Obtaining a transaction

  • A User Initiated Operation on a Session - obtains a Transaction by asking the manager for the appropriate transaction object, and activating it.
  • An existing TransactionalAnswer obtains a Transaction by activating the transaction object it obtained on creation.
  • No-one else can activate a transaction.
  • Activating a transaction either begins a new JTA transaction; or resumes an existing one.
  • As a result, ACTIVATION ASSOCIATES A TRANSACTION OBJECT WITH THE CURRENT THREAD. Note: Activation is the only way to associate a transaction with the current thread.
  • Subquery Answer, and non-user initiated operations on DatabaseSession must obtain the transaction object by requesting the ALREADY ACTIVATED transaction associated with the current thread.
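
The distinction between activating a transaction and merely looking up the already activated one can be sketched with a ThreadLocal standing in for the JTA thread association. ActivationSketch and its methods are hypothetical names, and the sketch does not model re-entrant activation: it simply rejects a second activate() on the same thread.

```java
// Hypothetical sketch: a ThreadLocal stands in for the JTA thread association.
final class ActivationSketch {
    private static final ThreadLocal<Object> CURRENT = new ThreadLocal<>();

    private ActivationSketch() { }

    // Only user-initiated Session operations and TransactionalAnswer would call this.
    static void activate(Object transaction) {
        if (CURRENT.get() != null) {
            throw new IllegalStateException("thread already has an activated transaction");
        }
        CURRENT.set(transaction);
    }

    // Counterpart of suspend/end: removes the thread association.
    static void deactivate() {
        CURRENT.remove();
    }

    // SubqueryAnswer and internal DatabaseSession operations would use this instead.
    static Object current() {
        Object tx = CURRENT.get();
        if (tx == null) {
            throw new IllegalStateException("no activated transaction on this thread");
        }
        return tx;
    }
}
```

Failing fast in current() makes a missing activation visible at the point of the mistake rather than as a later phase inconsistency.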

Transaction Lifecycle

Standard RO Transaction - with subquery.

  • Call to DatabaseSession.
    • DatabaseSession requests Transaction - Manager creates Transaction (Using=1, Inuse=1) BEGIN
      • Possible recursive calls to DatabaseSession, each obtains the existing Transaction(Using=1, Inuse=1+N)
        • TransactionalAnswer object is created, associated with Transaction on current-thread. - Transaction(Using=2, Inuse=1+N)
      • Recursive calls return - Transaction(Using=2, Inuse=1)
  • Call returns object - Transaction(Using=1, Inuse=0) SUSPEND
  • Call to Answer.beforeFirst
    • TransactionalAnswer activates Transaction - Transaction(Using=1, Inuse=1) RESUME
    • Call to beforeFirst processed.
  • Call returns - Transaction(Using=1, Inuse=0) SUSPEND
  • Call to next - RESUME / next() / SUSPEND
  • Call to getObject(subquery-column)
    • TransactionalAnswer activates Transaction - Transaction(Using=1, Inuse=1) RESUME
      • Call to DatabaseSession.innerQuery() creates Answer (associated with current-transaction) - Transaction(Using=2, Inuse=1)
  • Call returns from getObject - Transaction(Using=2, Inuse=0) SUSPEND
  • repeat....
  • Call to TransactionalAnswer.close() on subquery - Transaction(Using=1, Inuse=0) RESUME/close()/*SUSPEND*
  • Call to TransactionalAnswer.close() on outer-query - Transaction(Using=0, Inuse=0) RESUME/close()/*END*
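
The Using/Inuse bookkeeping in the trace above can be reproduced with two counters. In this sketch, which uses hypothetical names throughout, Using counts references held by in-flight operations and open Answers, Inuse counts the activation depth on the current thread, and BEGIN, RESUME, SUSPEND and END are recorded whenever Inuse crosses zero.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the Using/Inuse counters; not Mulgara's actual classes.
final class LifecycleCounters {
    private int using;        // references held by in-flight operations and open Answers
    private int inuse;        // activation depth on the current thread
    private boolean begun;    // false until the first activation
    final List<String> events = new ArrayList<>();

    void addReference()  { using++; }
    void dropReference() { using--; }

    void activate() {
        if (inuse == 0) {
            events.add(begun ? "RESUME" : "BEGIN");
            begun = true;
        }
        inuse++;
    }

    void deactivate() {
        inuse--;
        if (inuse == 0) {
            // Nothing references the transaction any more: end it instead of suspending.
            events.add(using == 0 ? "END" : "SUSPEND");
        }
    }
}
```

A session call would invoke addReference() and activate() on entry and dropReference() and deactivate() on exit, while creating an Answer adds a reference that is dropped again in close().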

Question: Should we ignore the inner-queries, and just let the transaction extent be determined solely by the lifetime of the outermost query?
Answer: List consensus is that we should ignore inner-queries and only track the outer query. When it is closed, we release the transaction.

Limitations

  • Support for concurrent writes will require changes to the phase-tree implementation at the store level.
  • A commit on a write-phase will promote the write-phase to the current-phase. This will invalidate existing micro-commits, and consequently any Answers based on them. Phase promotion of the answers is not possible without changes at the store layer.
  • Read-transactions are not envisaged in this current design, but it can be extended relatively easily to support them.

original page by Andrae Muys