edits
parent
bba27699c3
commit
13927883c7
2 changed files with 39 additions and 44 deletions
Binary file not shown.
|
@ -32,26 +32,24 @@
|
|||
\title{\Large \bf LLADD: Extensible Transactional Storage FIXME}
|
||||
|
||||
\author{
|
||||
Your N.\ Here \\
|
||||
%{\em Your Department} \\
|
||||
{\em Your Institution}\\
|
||||
%{\em Your City, State, ZIP}\\
|
||||
Russell Sears and Eric Brewer\\
|
||||
{\em UC Berkeley}\\
|
||||
% is there a standard format for email/URLs??
|
||||
% remember that ~ doesn't do what you expect, use \~{}.
|
||||
{\normalsize yourname@host.site.dom, http://host.site.dom/yoururl} \\
|
||||
{\normalsize \{sears,brewer\}@cs.berkeley.edu, http://lladd.sourceforge.net} \\
|
||||
%
|
||||
% copy the following lines to add more authors
|
||||
\smallskip
|
||||
Name Two Here \\
|
||||
{\em Two's Institution}\\
|
||||
% \smallskip
|
||||
% Name Two Here \\
|
||||
%{\em Two's Institution}\\
|
||||
%% is there a standard format for email/URLs??
|
||||
{\normalsize two@host.site.dom, http://host.site.dom/twourl}
|
||||
%{\normalsize two@host.site.dom, http://host.site.dom/twourl}
|
||||
%
|
||||
} % end author
|
||||
|
||||
\maketitle
|
||||
|
||||
\thispagestyle{empty}
|
||||
\thispagestyle{plain}
|
||||
|
||||
\subsection*{Abstract}
|
||||
|
||||
|
@ -88,23 +86,22 @@ more direct management of data, LLADD offers a layered architecture
|
|||
that enables simple but robust data management.\footnote{A large class
|
||||
of such applications are deemed ``navigational'' in the database
|
||||
vocabulary, as they directly navigate data structures rather than
|
||||
perform set operations. We also believe that LLADD is applicable in
|
||||
the context of new, special purpose database systems (XML databases,
|
||||
streaming databases, database/semantic file systems, etc), which is a
|
||||
fruitful area of current work both within the database research
|
||||
community and in industry.}
|
||||
perform set operations.}
|
||||
We also believe that LLADD is applicable in
|
||||
the context of new, special-purpose database systems such as XML databases,
|
||||
streaming databases, and database/semantic file systems. These form a
|
||||
fruitful area of current research, but existing monolithic database systems tend to be a poor fit for these new areas.
|
||||
|
||||
The basic approach of LLADD, taken from ARIES [xx], is to build
|
||||
The basic approach of LLADD, taken from ARIES~\cite{aries}, is to build
|
||||
\emph{transactional pages}, which enables recovery on a page-by-page
|
||||
basis, despite support for high concurrency and the minimization of
|
||||
disk seeks during commit (by using a log). We show how to build a variety
|
||||
of useful data managers on top of this layer, including persistent
|
||||
hash tables, lightweight recoverable virtual memory, and simple
|
||||
hash tables, lightweight recoverable virtual memory~\cite{lrvm}, and simple
|
||||
databases. We also cover the details of crash recovery,
|
||||
application-level support for transaction abort and commit, and basic
|
||||
latching for multithreaded applications.
|
||||
|
||||
We also discuss the shortcomings of common applications , and explain
|
||||
Finally, we discuss the shortcomings of common applications, and explain
|
||||
why LLADD provides an appropriate solution to these problems.
|
||||
|
||||
%[more coverage of kinds of apps? imap, lrvm, cht, file system, database]
|
||||
|
@ -113,19 +110,19 @@ Many implementations of transactional pages exist in industry and
|
|||
in the literature. Unfortunately, these algorithms tend either to
|
||||
be straightforward and unsuitable for real-world deployment, or are
|
||||
robust and scalable, but achieve these properties by relying upon
|
||||
intricate sets of internal (and often implicit) interactions. The
|
||||
ARIES algorithm falls into the second category, has been extremely
|
||||
intricate sets of internal and often implicit interactions. The
|
||||
ARIES algorithm falls into the second category, and has been extremely
|
||||
successful as part of the IBM DB2 database system.
|
||||
It provides performance and reliability that are comparable to those of current
|
||||
commercial and open-source products. Unfortunately, while the algorithm
|
||||
is conceptually simple, many subtleties arise in its implementation.
|
||||
We chose ARIES as the basis of LLADD, and have made a significant
|
||||
effort to document these interactions. Although a complete discussion
|
||||
of the AIRES algorithm is beyond the scope of this paper, we will
|
||||
of the ARIES algorithm is beyond the scope of this paper, we will
|
||||
provide a brief overview, and explain the details that are relevant
|
||||
to developers that wish to extend LLADD.
|
||||
|
||||
By documenting the interface between AIRES and higher-level primitives
|
||||
By documenting the interface between ARIES and higher-level primitives
|
||||
such as data structures, and by structuring LLADD to make this
|
||||
interface explicit in both the library and its extensions, we hope to
|
||||
make it easy to produce correct and efficient durable data
|
||||
|
@ -141,7 +138,7 @@ modules that ``do one thing and do it well'', we believe that
|
|||
LLADD can provide competitive performance while making future improvements
|
||||
to its core implementation significantly easier. In order to achieve
|
||||
this goal, LLADD has been split into a number of modules forming a
|
||||
'core library', and a number of extensions called 'operations' that
|
||||
{\em core library}, and a number of extensions called {\em operations} that
|
||||
build upon the core library. Since each of these modules exports a
|
||||
stable interface, they can be independently improved.
|
||||
|
||||
|
@ -151,8 +148,7 @@ stable interface, they can be independently improved.
|
|||
An extensive amount of prior work covers the algorithms presented in
|
||||
this paper. Most fundamentally, systems that provide transactional
|
||||
consistency to their users generally include a number of common
|
||||
modules. A high-level overview of a typical system is given in Figure
|
||||
\ref{cap:DB-Architecture}.
|
||||
modules. Figure~\ref{cap:DB-Architecture} presents a high-level overview of a typical system.
|
||||
|
||||
\begin{figure}
|
||||
\includegraphics[%
|
||||
|
@ -414,17 +410,18 @@ reacquired during recovery, the redo phase of the recovery process
|
|||
is single threaded. Since latches acquired by the wrapper function
|
||||
are held while the log entry and page are updated, the ordering of
|
||||
the log entries and page updates associated with a particular latch
|
||||
must be consistent. Because undo occurs during normal operation,
|
||||
will be consistent. Because undo occurs during normal operation,
|
||||
some care must be taken to ensure that undo operations obtain the
|
||||
proper latches.
|
||||
|
||||
|
||||
\subsection{Recovery}
|
||||
|
||||
In this section, we present the details of crash recovery, user-defined logging, and atomic actions that commit even if their enclosing transaction aborts.
|
||||
|
||||
\subsubsection{ANALYSIS / REDO / UNDO}
|
||||
|
||||
Recovery in AIRES consists of three stages, analysis, redo and undo.
|
||||
Recovery in ARIES consists of three stages: analysis, redo, and undo.
|
||||
The first, analysis, is
|
||||
implemented by LLADD, but will not be discussed in this
|
||||
paper. The second, redo, ensures that each redo entry in the log
|
||||
|
@ -467,20 +464,19 @@ exactly as they would be during normal operation.
|
|||
\subsubsection{Physical, Logical and Physiological Logging.}
|
||||
|
||||
The above discussion avoided the use of some terminology that is common
|
||||
in the database literature and which should be presented here. {}``Physical
|
||||
loggging'' is the practice of logging physical (byte level) upates
|
||||
and the physical (page number) addresses that they are applied to.
|
||||
in the database literature and which should be presented here. ``Physical
|
||||
logging'' is the practice of logging physical (byte-level) updates
|
||||
and the physical (page number) addresses to which they are applied.
|
||||
|
||||
It is subtly different than {}``physiological logging,'' which is
|
||||
It is subtly different from ``physiological logging,'' which is
|
||||
what LLADD recommends for its redo records. In physiological logging,
|
||||
the physical (page number) address is stored, but the byte offset
|
||||
the physical address (page number) is stored, but the byte offset
|
||||
and the actual difference are stored implicitly in the parameters
|
||||
of some function. When the parameters are applied to the function,
|
||||
it will update the page in a way that preserves application semantics.
|
||||
This allows for some convenient optimizations. For example, data within
|
||||
The common use for this is {\em slotted pages}, which use a level of indirection to allow records to be rearranged on the page; redo operations use the index as the parameter rather than the page offset. For example, data within
|
||||
a single page can be re-arranged at runtime to produce contiguous
|
||||
regions of free space, or the parameters passed to the function may
|
||||
be significantly smaller than the physical change made to the page.
|
||||
regions of free space. LLADD generalizes this model; for example, the parameters passed to the function may be significantly smaller than the physical change made to the page.
|
||||
|
||||
``Logical logging'' can only be used for undo entries in LLADD,
|
||||
and is identical to physiological logging, except that it stores a
|
||||
|
@ -565,7 +561,7 @@ such a tool could easily be applied to existing LLADD operations.
|
|||
|
||||
Note that the ARIES algorithm is extremely complex, and we have left
|
||||
out most of the details needed to understand how ARIES works, or to
|
||||
implement it correctly.\footnote{The original ARIES paper was around 70 pages, and the ARIES/IM paper, which covered index implementation is roughly the same length.} Yet, we believe we have covered everything that a programmer needs
|
||||
implement it correctly.\footnote{The original ARIES paper is around 70 pages, and the ARIES/IM paper~\cite{aries-IM}, which covers index implementation, is roughly the same length.} Yet, we believe we have covered everything that a programmer needs
|
||||
to know in order to implement new data structures using the
|
||||
functionality that ARIES provides. This was possible due to the encapsulation
|
||||
of the ARIES algorithm inside of LLADD, which is the feature that
|
||||
|
@ -612,12 +608,11 @@ is able to efficiently support interesting data structures.
|
|||
Despite the complexity of the interactions between its modules, the
|
||||
basic ARIES algorithm itself is quite simple. Therefore, in order to
|
||||
keep LLADD simple, we started with a set of modules, and iteratively
|
||||
refined the boundaries between these modules. Figure
|
||||
\ref{cap:LLADD-Architecture} presents the resulting architecture. The
|
||||
refined the boundaries between these modules. Figure~\ref{cap:LLADD-Architecture} presents the resulting architecture. The
|
||||
core of the LLADD library is quite small at 2218 lines of code, 2155
|
||||
lines of implementations of operations and other extensions, and 408
|
||||
lines of installable header files.\footnote{generated using David
|
||||
A. Wheeler's ``SLOCCount''} The code has been documented extensively,
|
||||
lines of installable header files.\footnote{These counts were generated using David
|
||||
A. Wheeler's {\tt SLOCCount}.} The code has been documented extensively,
|
||||
and we hope that we have exposed most of the subtle interactions
|
||||
between internal modules in the online documentation.
|
||||
|
||||
|
@ -644,9 +639,9 @@ we would like to support transactional access to resources beyond
|
|||
simple page files. Some examples include transactional updates of
|
||||
multiple files on disk, transactional groups of program executions
|
||||
or network requests, or even leveraging some of the advances being
|
||||
made in the Linux and other modern operating system kernels. For example,
|
||||
ReiserFS recently added support for atomic file system operations.
|
||||
This could be used to provide variable sized pages
|
||||
made in the Linux and other modern OS kernels. For example,
|
||||
ReiserFS recently added support for atomic file-system operations.
|
||||
This could be used to provide variable-sized pages
|
||||
to LLADD. Combining these ideas should make it easy to
|
||||
implement some interesting applications, and to improve existing
|
||||
systems such as CVS, IMAP, and a host of ``simple'' desktop applications.
|
||||
|
|