Cleaned up intro a bit.

Sears Russell 2005-03-22 02:14:14 +00:00
parent 379b50d8c7
commit ef29b13f51


@ -150,76 +150,126 @@ meet and form the {\em raison d'\^{e}tre} for \yad: the framework delivers
these properties in a way that is reusable, thus providing an easy
way for systems to provide complete transactions.
\eab{list of contributions}
With these trends in mind, we have implemented a modular version of
ARIES that makes as few assumptions as possible about application data
structures or workload. Where such assumptions are inevitable, we have
produced narrow APIs that allow the application developer to plug in
alternative implementations of the modules that comprise our ARIES
implementation. Rather than hiding the underlying complexity of the
library from developers, we have produced narrow, simple APIs and a
set of invariants that must be maintained in order to ensure
transactional consistency, allowing application developers to produce
high-performance extensions with only a little effort.
Specifically, there are a number of features that \yad provides that, when combined,
provide applications with control over:
\begin{itemize}
\item extensible logging
\item extensible 2pc, locking via APIs (others?)
\item flexible transactional data structures
\item high-concurrency reusable open-source implementation with good
performance
\item efficient persistent object layer
\item others?
\item On-disk representations
\item Access method implementations
\item Granularity of concurrency
\item The exact semantics of Atomicity, Consistency, Isolation and Durability
\item Workload specific assumptions
\item Choice of synchronization primitives (deadlock detection, avoidance, etc.).
\item Request scheduling policies.
\end{itemize}
These features include:
%\eab{list of contributions}
\begin{description}
\item[Flexible page formats] provide low level control over
transactional data representations.
\item[Extensible log formats] provide high-level control over
transaction data structures.
\item [High and low level control over the log] such as calls to ``log this
operation'' or ``write a compensation record''
\item [In memory logical logging] provides a data store independent
record of application requests, allowing ``in flight'' log
reordering, manipulation and durability primitives to be
developed.
\item[Custom durability operations] such as two phase commit's
prepare call, and savepoints.
\item[Extensible locking API] provides registration of custom lock managers
and a generic lock manager implementation.
\end{description}
We have produced a high-concurrency, high performance and reusable
open-source implementation of these concepts. Portions of our
implementation's API are still changing, but the interfaces to low
level primitives and the implementations of basic functionality have
stabilized.
To validate these claims, we developed a number of applications such
as an efficient persistent object layer, {\em @todo locality preserving
graph traversal algorithm}, and a cluster hash table based upon
on-disk durability and two phase commit. We also provide benchmarking
results for some of \yad's primitives and the systems that it
supports.
%\item An efficient persistent object layer
%\item A cluster hash table based upon two-phase commit.
%\item others?
%\end{itemize}
\eab{We also need to list the APIs that are easy to change and maybe explain the interaction between locking and commit/abort.}
{\em I think this is more or less covered now, but we might want to be a little more concrete with the APIs.}
\eab{need to incorporate paragraph 5 from below; I think the other 4 are covered.}
%\eab{need to incorporate paragraph 5 from below; I think the other 4 are covered.}
\begin{enumerate}
%\begin{enumerate}
% rcs: The original intro is left intact in the other file; it would be too hard to merge right now.
% This paragraph is too narrow; the original was too vague
\item {\bf Current transactional systems handle conventional workloads
well, but object persistence mechanisms are a mess, as are
{}``version oriented'' data stores requiring large, efficient atomic
updates.}
\item {\bf {}``Impedance mismatch'' is a term that refers to a mismatch
between the data model provided by the data store and the data model
required by the application. A significant percentage of software
development effort is related to dealing with this problem. Related
problems that have had less treatment in the literature involve
mismatches between other performance-critical and labor intensive
programming primitives such as concurrency models, error handling
techniques and application development patterns.}
% rcs: see ##1## in other file for more examples
\item {\bf Past trends in the Database community have been driven by
demand for tools that allow extremely specialized (but commercially
important!) types of software to be developed quickly and
inexpensively. {[}System R, OODBMS, benchmarks, streaming databases,
etc{]} This has led to the development of large, monolithic database
servers that perform well under many circumstances, but that are not
nearly as flexible as modern programming languages or typical
in-memory data structure libraries {[}Java Collections,
STL{]}. Historically, programming language and software library
development has focused upon the production of a wide array of
composable general purpose tools, allowing the application developer
to pick algorithms and data structures that are most appropriate for
the problem at hand.}
\item {\bf In the past, modular database and transactional storage
implementations have hidden the complexities of page layout,
synchronization, locking, and data structure design under relatively
narrow interfaces, since transactional storage algorithms'
interdependencies and requirements are notoriously complicated.}
%Not implementing ARIES any more!
\item {\bf With these trends in mind, we have implemented a modular
version of ARIES that makes as few assumptions as possible about
application data structures or workload. Where such assumptions are
inevitable, we have produced narrow APIs that allow the application
developer to plug in alternative implementations of the modules that
comprise our ARIES implementation. Rather than hiding the underlying
complexity of the library from developers, we have produced narrow,
simple APIs and a set of invariants that must be maintained in
order to ensure transactional consistency, allowing application
developers to produce high-performance extensions with only a little
effort.}
\end{enumerate}
% \item {\bf Current transactional systems handle conventional workloads
% well, but object persistence mechanisms are a mess, as are
% {}``version oriented'' data stores requiring large, efficient atomic
% updates.}
%
% \item {\bf {}``Impedance mismatch'' is a term that refers to a mismatch
% between the data model provided by the data store and the data model
% required by the application. A significant percentage of software
% development effort is related to dealing with this problem. Related
% problems that have had less treatment in the literature involve
% mismatches between other performance-critical and labor intensive
% programming primitives such as concurrency models, error handling
% techniques and application development patterns.}
%% rcs: see ##1## in other file for more examples
% \item {\bf Past trends in the Database community have been driven by
% demand for tools that allow extremely specialized (but commercially
% important!) types of software to be developed quickly and
% inexpensively. {[}System R, OODBMS, benchmarks, streaming databases,
% etc{]} This has led to the development of large, monolithic database
% servers that perform well under many circumstances, but that are not
% nearly as flexible as modern programming languages or typical
% in-memory data structure libraries {[}Java Collections,
% STL{]}. Historically, programming language and software library
% development has focused upon the production of a wide array of
% composable general purpose tools, allowing the application developer
% to pick algorithms and data structures that are most appropriate for
% the problem at hand.}
%
% \item {\bf In the past, modular database and transactional storage
% implementations have hidden the complexities of page layout,
% synchronization, locking, and data structure design under relatively
% narrow interfaces, since transactional storage algorithms'
% interdependencies and requirements are notoriously complicated.}
%
%%Not implementing ARIES any more!
%
%
% \item {\bf With these trends in mind, we have implemented a modular
% version of ARIES that makes as few assumptions as possible about
% application data structures or workload. Where such assumptions are
% inevitable, we have produced narrow APIs that allow the application
% developer to plug in alternative implementations of the modules that
% comprise our ARIES implementation. Rather than hiding the underlying
% complexity of the library from developers, we have produced narrow,
% simple APIs and a set of invariants that must be maintained in
% order to ensure transactional consistency, allowing application
% developers to produce high-performance extensions with only a little
% effort.}
%
%\end{enumerate}
\section{Prior work}
\begin{enumerate}