Updated section 8 (mostly wording and shortening)
parent 330d1dc4d1
commit cdcdba1099
1 changed files with 61 additions and 164 deletions
|
@@ -1418,9 +1418,6 @@ algorithm's outperforms the naive traversal.
|
||||||
\section{Related Work}
|
\section{Related Work}
|
||||||
\label{related-work}
|
\label{related-work}
|
||||||
|
|
||||||
|
|
||||||
\eab{moved text here from section 2 to make it smaller and less technical}
|
|
||||||
|
|
||||||
\subsection{Database Variations}
|
\subsection{Database Variations}
|
||||||
\label{sec:otherDBs}
|
\label{sec:otherDBs}
|
||||||
|
|
||||||
|
@@ -1428,14 +1425,14 @@ This section discusses transaction systems with goals
|
||||||
similar to ours. Although these projects were successful in many
|
similar to ours. Although these projects were successful in many
|
||||||
respects, they fundamentally aimed to extend the range of their
|
respects, they fundamentally aimed to extend the range of their
|
||||||
abstract data model, which in the end still has limited overall range.
|
abstract data model, which in the end still has limited overall range.
|
||||||
In contrast, \yad follows a bottom-up approach that enables can
|
In contrast, \yad follows a bottom-up approach that can support (in
|
||||||
implement (in theory) any of these abstract models and their extensions.
|
theory) any of these abstract models and their extensions.
|
||||||
|
|
||||||
\subsubsection{Extensible databases}
|
\subsubsection{Extensible databases}
|
||||||
|
|
||||||
Genesis~\cite{genesis}, an early database toolkit was explicitly
|
Genesis is an early database toolkit that was explicitly
|
||||||
structured in terms of the physical data models and conceptual
|
structured in terms of the physical data models and conceptual
|
||||||
mappings described above.
|
mappings described above~\cite{genesis}.
|
||||||
It is designed to allow database implementors to easily swap out
|
It is designed to allow database implementors to easily swap out
|
||||||
implementations of the various components defined by its framework.
|
implementations of the various components defined by its framework.
|
||||||
Like subsequent systems (including \yad), it allows its users to
|
Like subsequent systems (including \yad), it allows its users to
|
||||||
|
@@ -1461,9 +1458,8 @@ a database toolkit, new types are defined when the database server is
|
||||||
compiled. In today's object-relational database systems, new types
|
compiled. In today's object-relational database systems, new types
|
||||||
are defined at runtime. Each approach has its advantages. However,
|
are defined at runtime. Each approach has its advantages. However,
|
||||||
both types of systems aim to extend a high-level data model with new
|
both types of systems aim to extend a high-level data model with new
|
||||||
abstract data types, and are thus limited in the range of new
|
abstract data types. This is of limited use to applications that are
|
||||||
applications they support, which remain essentially queries over sets.
|
not naturally structured in terms of queries over sets.
|
||||||
|
|
||||||
|
|
||||||
\subsubsection{Modular databases}
|
\subsubsection{Modular databases}
|
||||||
|
|
||||||
|
@@ -1522,17 +1518,13 @@ Special-purpose languages for transaction processing allow programmers
|
||||||
to express transactional operations naturally. However, programs
|
to express transactional operations naturally. However, programs
|
||||||
written in these languages are generally limited to a particular
|
written in these languages are generally limited to a particular
|
||||||
concurrency model and transactional storage system. Therefore, these
|
concurrency model and transactional storage system. Therefore, these
|
||||||
systems are complementary to \yad; they provide a specialized
|
systems are complementary to our work; \yad provides a substrate that makes
|
||||||
high-level interface that hard-codes a particular programming model
|
|
||||||
and specialized storage infrastructure. In contrast, \yad is a
|
|
||||||
general-purpose storage infrastructure that avoids hardcoding
|
|
||||||
programming model assumptions. \yad provides a substrate that makes
|
|
||||||
it easier to implement transactional programming models.
|
it easier to implement transactional programming models.
|
||||||
|
|
||||||
\subsubsection{Nested Transactions}
|
\subsubsection{Nested Transactions}
|
||||||
|
|
||||||
{\em Nested transactions} form trees of transactions, where children
|
{\em Nested transactions} form trees of transactions, where children
|
||||||
were spawned by their parents. They can be used to increase
|
are spawned by their parents. They can be used to increase
|
||||||
concurrency, provide partial rollback, and improve fault tolerance.
|
concurrency, provide partial rollback, and improve fault tolerance.
|
||||||
{\em Linear} nesting occurs when transactions are nested to arbitrary
|
{\em Linear} nesting occurs when transactions are nested to arbitrary
|
||||||
depths, but have at most one child. In {\em closed} nesting, child
|
depths, but have at most one child. In {\em closed} nesting, child
|
||||||
|
@@ -1543,15 +1535,18 @@ transactions are not rolled back if the parent aborts.
|
||||||
Closed nesting aids in intra-transaction concurrency and fault
|
Closed nesting aids in intra-transaction concurrency and fault
|
||||||
tolerance. Increased fault tolerance is achieved by isolating each
|
tolerance. Increased fault tolerance is achieved by isolating each
|
||||||
child transaction from the others, and automatically retrying failed
|
child transaction from the others, and automatically retrying failed
|
||||||
transactions. This technique is similar to the one used by MapReduce,
|
transactions. This technique is similar to the one used by MapReduce
|
||||||
which isolates subtasks by restricting the data that each unit of work
|
to provide exactly-once execution on very large computing
|
||||||
may read and write, and which provides atomicity by ensuring
|
clusters~\cite{mapReduce}.
|
||||||
exactly-once execution of each unit of work~\cite{mapReduce}.
|
|
||||||
|
|
||||||
\yads nested top actions, and support for custom lock managers also
|
%which isolates subtasks by restricting the data that each unit of work
|
||||||
|
%may read and write, and which provides atomicity by ensuring
|
||||||
|
%exactly-once execution of each unit of work~\cite{mapReduce}.
|
||||||
|
|
||||||
|
\yads nested top actions, and support for custom lock managers
|
||||||
allow for inter-transaction concurrency. In some respect, nested top
|
allow for inter-transaction concurrency. In some respect, nested top
|
||||||
actions implement a form of open, linear nesting. Actions performed
|
actions implement a form of open, linear nesting. Actions performed
|
||||||
inside the nested top are not rolled back when the parent aborts.
|
inside the nested top action are not rolled back when the parent aborts.
|
||||||
However, the logical undo gives the programmer the option to
|
However, the logical undo gives the programmer the option to
|
||||||
compensate for the nested top action in aborted transactions. We expect
|
compensate for the nested top action in aborted transactions. We expect
|
||||||
that nested transactions
|
that nested transactions
|
||||||
|
@@ -1559,18 +1554,6 @@ could be implemented as a layer on top of \yad.
|
||||||
|
|
||||||
\subsubsection{Distributed Programming Models}
|
\subsubsection{Distributed Programming Models}
|
||||||
|
|
||||||
%\rcs{ I think Argus makes use of shadow copies for durability, and for
|
|
||||||
%in-memory transactions~\cite{argusImplementation}. A tree of shadow
|
|
||||||
%copies exists, and is handled as follows (I think): All transaction
|
|
||||||
%locks are commit duration, per object. There are read locks and write
|
|
||||||
%locks, and it uses strict 2PL. Each transaction is a tree of
|
|
||||||
%``subactions'' that can get R/W locks according to the 2PL rules. Two
|
|
||||||
%subactions in the same action cannot get a write lock on the same
|
|
||||||
%object because each one gets its own copy of the object to write to.
|
|
||||||
%If a subaction or transaction abort their local copy is simply
|
|
||||||
%discarded. At commit, the local copy replaces the global copy.}
|
|
||||||
|
|
||||||
|
|
||||||
%System R was one of the first relational database implementations, and
|
%System R was one of the first relational database implementations, and
|
||||||
%defined a clean separation between its query processor and its storage
|
%defined a clean separation between its query processor and its storage
|
||||||
%subsystem. In fact, it supported a simple navigational interface to
|
%subsystem. In fact, it supported a simple navigational interface to
|
||||||
|
@@ -1587,161 +1570,75 @@ rolled back and retried due to node failure.
|
||||||
|
|
||||||
Argus is a language for reliable distributed applications. An Argus
|
Argus is a language for reliable distributed applications. An Argus
|
||||||
program consists of guardians, which are essentially objects that
|
program consists of guardians, which are essentially objects that
|
||||||
encapsulate persistent and atomic data. Persistent data allows
|
encapsulate persistent and atomic data. Accesses to atomic data are
|
||||||
concurrent operations to be implemented, while accesses to atomic data
|
serializable; persistent data is not protected by the lock manager,
|
||||||
are serializable~\cite{argus}. Typically, the data structure that is being
|
and is used to implement concurrent data structures~\cite{argus}.
|
||||||
implemented is stored in persistent storage, but is augmented with
|
Typically, the data structure is stored in persistent storage, but is augmented with
|
||||||
extra information in atomic storage. This extra data tracks the
|
extra information in atomic storage. This extra data tracks the
|
||||||
status of each item stored in the structure. Conceptually, in a hash
|
status of each item stored in the structure. Conceptually, atomic
|
||||||
table, atomic storage would contain the values ``Not present'',
|
storage used by a hashtable would contain the values ``Not present'',
|
||||||
``Committed'' or ``Aborted; Old Value = x'' for each key in (or
|
``Committed'' or ``Aborted; Old Value = x'' for each key in (or
|
||||||
missing from) the hash. Before accessing the hash, the operation
|
missing from) the hash. Before accessing the hash, the operation
|
||||||
implementation would consult the appropriate piece of atomic data, and
|
implementation would consult the appropriate piece of atomic data, and
|
||||||
update the persistent storage if necessary. Because the atomic data is
|
update the persistent storage if necessary. Because the atomic data is
|
||||||
protected by a lock manager, attempts to update the hashtable are serializable.
|
protected by a lock manager, attempts to update the hashtable are serializable.
|
||||||
Therefore, clever use of atomic storage can be used to provide logical locking~\rcs{Double check this}
|
Therefore, clever use of atomic storage can be used to provide logical locking.
|
||||||
|
|
||||||
Note that implementation of efficient data structures using this
|
Note that operations that implement concurrent data structures using
|
||||||
method forces each operation implementation to track a great deal of
|
this method must track a great deal of extra state. Efficiently
|
||||||
extra state (they suggest implementing a log structure to support a
|
tracking such state is not straightforward. For example, the Argus
|
||||||
concurrent hash table), and to set policies regarding the granularity
|
hashtable implementation made use of its own log structure to
|
||||||
with which the data structures should be written to
|
efficiently track the status of each key that had been touched by an
|
||||||
disk~\cite{argusImplementation}. \yad avoids these problems by
|
active transaction. Also, the hashtable is responsible for setting
|
||||||
forcing operation implementors to provide logical undos, and by
|
policies regarding when, and with what granularity it would be written
|
||||||
leaving lock management to higher-level code. We argue that logical
|
back to disk~\cite{argusImplementation}. \yad operations avoid this
|
||||||
undos are easily provided in most circumstances, while higher-level
|
complexity by providing logical undos, and by leaving lock management
|
||||||
lock management decouples data structure implementations from
|
to higher-level code. This also separates write-back and concurrency
|
||||||
application concurrency models.
|
control policies from data structure implementations.
|
||||||
|
|
||||||
%The Argus designers assumed that only a few core concurrent
|
%The Argus designers assumed that only a few core concurrent
|
||||||
%transactional data structures would be implemented, and that higher
|
%transactional data structures would be implemented, and that higher
|
||||||
%level code would make use of these structures. Also, Argus assumed
|
%level code would make use of these structures. Also, Argus assumed
|
||||||
%that transactions should be serializable.
|
%that transactions should be serializable.
|
||||||
|
|
||||||
Camelot, a successor to Argus made a number of important
|
Camelot made a number of important
|
||||||
contributions, both in system design, and in algorithms for
|
contributions, both in system design, and in algorithms for
|
||||||
distributed transactions~\cite{camelot}. It left locking to application level code,
|
distributed transactions~\cite{camelot}. It leaves locking to application level code,
|
||||||
and updated data in place. (Argus used shadow copies to provide
|
and updates data in place. (Argus uses shadow copies to provide
|
||||||
atomic updates.) Camelot provided two logging modes: Redo only
|
atomic updates.) Camelot provides two logging modes: Redo only
|
||||||
(no-Steal,no-Force) and Undo/Redo (Steal, no-Force). It was
|
(no-Steal,no-Force) and Undo/Redo (Steal, no-Force). It uses
|
||||||
implemented using Mach, and provided recoverable virtual memory. It
|
facilities of Mach to provide recoverable virtual memory. It
|
||||||
was decoupled from Avalon, which used Camelot to provide a
|
is decoupled from Avalon, which uses Camelot to provide a
|
||||||
higher-level (C++) programming model. Camelot provided a lower-level
|
higher-level (C++) programming model. Camelot provides a lower-level
|
||||||
C interface that allowed other programming models to be
|
C interface that allows other programming models to be
|
||||||
implemented. It provided a limited form of closed nested transactions
|
implemented. It provides a limited form of closed nested transactions
|
||||||
where parents are suspended while children are active. Camelot also
|
where parents are suspended while children are active. Camelot also
|
||||||
provided mechanisms for distributed transactions and transactional
|
provides mechanisms for distributed transactions and transactional
|
||||||
RPC. However, concurrent operations in Camelot were similar to those
|
RPC. While Camelot does allow applications to provide their own lock
|
||||||
in Argus since Camelot did not provide logical undo. Camelot's focus
|
managers, implementation strategies for concurrent operations
|
||||||
was upon support for distributed transactions, therefore, it hardcoded
|
in Camelot are similar to those
|
||||||
|
in Argus since Camelot does not provide logical undo. Camelot focuses
|
||||||
|
on distributed transactions, and hardcodes
|
||||||
assumptions regarding the structure of nested transactions, consensus
|
assumptions regarding the structure of nested transactions, consensus
|
||||||
algorithms, communication mechanisms, and so on. In contrast, \yads
|
algorithms, communication mechanisms, and so on. In contrast, \yads
|
||||||
goal is to efficiently support a wide range of such mechanisms.
|
goal is to efficiently support a wide range of such mechanisms without
|
||||||
|
providing any built in support for distributed transactions.
|
||||||
|
|
||||||
More recent transactional programming schemes allow for more multiple
|
More recent transactional programming schemes allow for multiple
|
||||||
transaction implementations to cooperate as part of the same
|
transaction implementations to cooperate as part of the same
|
||||||
distributed transaction. For example, X/Open DTP provides a standard
|
distributed transaction. For example, X/Open DTP provides a standard
|
||||||
networking protocol that allows multiple transactional systems to be
|
networking protocol that allows multiple transactional systems to be
|
||||||
controlled by a single transaction manager~\cite{something}.
|
controlled by a single transaction manager~\cite{something}.
|
||||||
Enterprise Java Beans is a standard for developing transactional
|
Enterprise Java Beans is a standard for developing transactional
|
||||||
middleware that may make use of heterogeneous storage. Its
|
middleware on top of heterogeneous storage. Its
|
||||||
transactions may not be nested~\cite{something}. This simplifies its
|
transactions may not be nested~\cite{something}. This simplifies its
|
||||||
semantics somewhat, and leads to many, short transactions, which
|
semantics somewhat, and leads to many, short transactions,
|
||||||
improves concurrency. However, it is somewhat rigid, and may lead to
|
improving concurrency. However, flat transactions are somewhat rigid, and lead to
|
||||||
situations where committed transactions have to be manually rolled
|
situations where committed transactions have to be manually rolled
|
||||||
back by other transactions after the fact~\cite{ejbCritique}. Open
|
back by other transactions after the fact~\cite{ejbCritique}. The Open
|
||||||
Multithreaded Transactions provide a model for nested transactions
|
Multithreaded Transactions model is based on nested transactions,
|
||||||
that incorporates exception handling, and allows parents to execute
|
incorporates exception handling, and allows parents to execute
|
||||||
concurrently with their children.
|
concurrently with their children~\cite{omtt}.
|
||||||
|
|
||||||
%Argus transactions use shadow copies to provide atomic updates.
|
|
||||||
%Instead of making use of logical undo, concurrent guardians make use
|
|
||||||
%of two types of persistant state. One type behaves transactionally,
|
|
||||||
%and will be rolled back at abort, while the other type can be
|
|
||||||
%atomically written to disk, but is not automatically modified at
|
|
||||||
%commit or abort. The transactional portions of the state can be
|
|
||||||
%provided by built-in atomic types, or by another guardian.
|
|
||||||
|
|
||||||
%A transactional Argus hashtable could consist of a simple,
|
|
||||||
%non-transactional, hashtable that is written back to disk atomically
|
|
||||||
%each time it is updated and a set of transactional flags that are
|
|
||||||
%automatically updated each time a transaction accesses the table,
|
|
||||||
%commits or aborts. During a lookup, the hashtable would consult these
|
|
||||||
%flags to determine the status of the key in question. To minimize the
|
|
||||||
%amount of data written to disk, one could use a log to emulate
|
|
||||||
%explicit per-key flags, and partition the hashtable and logfile into
|
|
||||||
%multiple atomically updated regions~\cite{argusImplementation}.
|
|
||||||
|
|
||||||
%While this approach does allow the layout and implementation of the
|
|
||||||
%data structure to be completely independent from the mechanisms used
|
|
||||||
%for transactional updates, it forces the operation implementor to
|
|
||||||
%provide a module that explicitly tracks the relationship between
|
|
||||||
%object states and transactions. Some of this information is required
|
|
||||||
%for locking, making it easier to provide a logical lock mananger.
|
|
||||||
%However, taking that approach couples the data structure
|
|
||||||
%implementation to the application's concurrency model.
|
|
||||||
|
|
||||||
%The Argus also work provides high-level models for atomicity,
|
|
||||||
%reconfiguration, and other issues faced by developers of transactional
|
|
||||||
%systems. These models do not depend on the low-level Argus
|
|
||||||
%implementation, and may be useful to applications built on top of
|
|
||||||
%\yad.~\rcs{citations here?}
|
|
||||||
|
|
||||||
%Camelot is a distributed transaction processing system. It provides
|
|
||||||
%two physical logging modes; redo only (no-Steal, no-Force), and
|
|
||||||
%redo-undo (Steal, no-Force), but does not contain provisions for
|
|
||||||
%logical logging or compensations. It supports nested transactions,
|
|
||||||
%which makes it possible to implement concurrent data structures in a
|
|
||||||
%style similar to concurrent guardians in Argus.
|
|
||||||
|
|
||||||
%Therefore, commit duration locks are required to protect data
|
|
||||||
%structures from concurrent transactions, \rcs{This sentence is
|
|
||||||
%problematic for two reasons: (1) Camelot allowed hybrid atomicity and
|
|
||||||
%other schemes in addition to 2PL. (2) According to \cite{camelot}, pg
|
|
||||||
%433 ``Logical locks, implemented within servers, and support for
|
|
||||||
%hybrid atomicity provide the possibilty of high concurrency.'' I
|
|
||||||
%think this is a mistake in their paper; logical locking isn't very
|
|
||||||
%helpful when ``This [Camelot's Nested Transaction] model states that
|
|
||||||
%if one transaction modifies a region, the region cannot be modified by
|
|
||||||
%another transacion unless that transaction is an active descendant of
|
|
||||||
%original transaction or the original transaction compeletes... If
|
|
||||||
%comodification does occur, no guarantees concerning data integrity are
|
|
||||||
%given'' (Camelot + Avalon book, pg 117)'' I think the same mistake is
|
|
||||||
%repeated in the RVM paper, when they discuss multi-threaded code.
|
|
||||||
%Also, see the discussion on Argus; you could do concurrency that way
|
|
||||||
%on Camelot...} limiting the applicability of Camelot to
|
|
||||||
%high-concurrency applications or its scalability to multi-processor
|
|
||||||
%systems.
|
|
||||||
|
|
||||||
%Camelot makes use of a nested transaction model that allows
|
|
||||||
%concurrency within a single transaction. In Camelot, nested
|
|
||||||
%transactions can run in parallel and make use of locks acquired by the
|
|
||||||
%transaction that spawned them. Parent transactions are suspended
|
|
||||||
%until children transactions complete, and children are protected from
|
|
||||||
%each other using locks, or other similar methods. We beleive that
|
|
||||||
%\yads support for logical undo would allow it to support such
|
|
||||||
%transactions with more concurrency than Camelot allowed. Camelot is
|
|
||||||
%an early example of a C library that provides transactional semantics
|
|
||||||
%over custom data types. Also, it introduced a number of features,
|
|
||||||
%such as distributed logging and commit semantics, and transactional
|
|
||||||
%RPC that we plan to integrate into \yad as we add support for
|
|
||||||
%multi-node transactions. Avalon, which was built on top of Camelot is
|
|
||||||
%a persistent version of C++ that introduced the idea of persistent
|
|
||||||
%programming language types.
|
|
||||||
|
|
||||||
%Both Argus and Camelot make use of {\em closed} nested transactions.
|
|
||||||
%In this context, ``closed'' means that subtransactions must abort if
|
|
||||||
%their parents abort. In contrast, \yads nested transactions provide a
|
|
||||||
%limited form of {\em open} nested transactions, in that they are able
|
|
||||||
%to commit even if their parents abort. Currently, \yad limits each
|
|
||||||
%transaction (or nested top action) to have a single child (although
|
|
||||||
%these may be nested to arbitrary depths). This limitation is sometimes
|
|
||||||
%called {\em linear nesting}. Schemes to naturally integrate linear
|
|
||||||
%and open nesting of transactions with modern languages such as Java
|
|
||||||
%have recently been been proposed~\cite{nestedTransactionPoster}.
|
|
||||||
|
|
||||||
%\rcs{More information on nested transactions is available in this book
|
|
||||||
%(which I haven't looked at yet)\cite{nestedTransactionBook}.}
|
|
||||||
|
|
||||||
\subsection{Berkeley DB}
|
\subsection{Berkeley DB}
|
||||||
|
|
||||||
|
|