edits

2004-10-22 21:09:45 +00:00 · 13927883c7
commit 13927883c7
parent bba27699c3
2 changed files with 39 additions and 44 deletions
--- a/doc/paper/LLADD-Freenix.pdf
+++ b/doc/paper/LLADD-Freenix.pdf
--- a/doc/paper/LLADD-Freenix.tex
+++ b/doc/paper/LLADD-Freenix.tex
@@ -32,26 +32,24 @@
 \title{\Large \bf LLADD: Extensible Transactional Storage FIXME}

 \author{
-Your N.\ Here \\
-%{\em Your Department} \\
-{\em Your Institution}\\
-%{\em Your City, State, ZIP}\\
+Russell Sears  and  Eric Brewer\\
+{\em UC Berkeley}\\
 % is there a standard format for email/URLs??
 % remember that ~ doesn't do what you expect, use \~{}.
-{\normalsize yourname@host.site.dom, http://host.site.dom/yoururl} \\
+{\normalsize \{sears,brewer\}@cs.berkeley.edu, http://lladd.sourceforge.net} \\
 %
 % copy the following lines to add more authors
- \smallskip
- Name Two Here \\
-{\em Two's Institution}\\
+% \smallskip
+% Name Two Here \\
+%{\em Two's Institution}\\
 %% is there a standard format for email/URLs??
-{\normalsize two@host.site.dom, http://host.site.dom/twourl}
+%{\normalsize two@host.site.dom, http://host.site.dom/twourl}
 %
 } % end author

 \maketitle

-\thispagestyle{empty}
+\thispagestyle{plain}

 \subsection*{Abstract}

@@ -88,23 +86,22 @@ more direct management of data, LLADD offers a layered architecture
 that enables simple but robust data management.\footnote{A large class
 of such applications are deemed ``navigational'' in the database
 vocabulary, as they directly navigate data structures rather than
-perform set operations.  We also believe that LLADD is applicable in
-the context of new, special purpose database systems (XML databases,
-streaming databases, database/semantic file systems, etc), which is a
-fruitful area of current work both within the database research
-community and in industry.}
+perform set operations.}
+We also believe that LLADD is applicable in
+the context of new, special-purpose database systems such as XML databases,
+streaming databases, and database/semantic file systems.  These form a 
+fruitful area of current research, but existing monolithic database systems tend to be a poor fit for these new areas.

-The basic approach of LLADD, taken from ARIES [xx], is to build
+The basic approach of LLADD, taken from ARIES~\cite{aries}, is to build
 \emph{transactional pages}, which enables recovery on a page-by-page
 basis, despite support for high concurrency and the minimization of
 disk seeks during commit (by using a log).  We show how to build a variety
 of useful data managers on top of this layer, including persistent
-hash tables, lightweight recoverable virtual memory, and simple
+hash tables, lightweight recoverable virtual memory~\cite{lrvm}, and simple
 databases.  We also cover the details of crash recovery,
 application-level support for transaction abort and commit, and basic
 latching for multithreaded applications.
-
-We also discuss the shortcomings of common applications , and explain
+Finally, we also discuss the shortcomings of common applications, and explain
 why LLADD provides an appropriate solution to these problems.

 %[more coverage of kinds of apps?  imap, lrvm, cht, file system, database]
@@ -113,19 +110,19 @@ Many implementations of transactional pages exist in industry and
 in the literature. Unfortunately, these algorithms tend either to
 be straightforward and unsuitable for real-world deployment, or are
 robust and scalable, but achieve these properties by relying upon
-intricate sets of internal (and often implicit) interactions. The
-ARIES algorithm falls into the second category, has been extremely
+intricate sets of internal and often implicit interactions. The
+ARIES algorithm falls into the second category, and has been extremely
 successful as part of the IBM DB2 database system.
 It provides performance and reliability that is comparable to that of current
 commercial and open-source products. Unfortunately, while the algorithm
 is conceptually simple, many subtleties arise in its implementation.
 We chose ARIES as the basis of LLADD, and have made a significant
 effort to document these interactions. Although  a complete discussion
-of the AIRES algorithm is beyond the scope of this paper, we will
+of the ARIES algorithm is beyond the scope of this paper, we will
 provide a brief overview, and explain the details that are relevant
 to developers that wish to extend LLADD. 

-By documenting the interface between AIRES and higher-level primitives
+By documenting the interface between ARIES and higher-level primitives
 such as data structures, and by structuring LLADD to make this
 interface explicit in both the library and its extensions, we hope to
 make it easy to produce correct and efficient durable data
@@ -141,7 +138,7 @@ modules that ``do one thing and do it well'', we believe that
 LLADD can provide competitive performance while making future improvements
 to its core implementation significantly easier. In order to achieve
 this goal, LLADD has been split into a number of modules forming a
-'core library', and a number of extensions called 'operations' that
+{\em core library}, and a number of extensions called {\em operations} that
 build upon the core library. Since each of these modules exports a
 stable interface, they can be independently improved.

@@ -151,8 +148,7 @@ stable interface, they can be independently improved.
 An extensive amount of prior work covers the algorithms presented in
 this paper.  Most fundamentally, systems that provide transactional
 consistency to their users generally include a number of common
-modules.  A high-level overview of a typical system is given in Figure
-\ref{cap:DB-Architecture}.  
+modules.  Figure~\ref{cap:DB-Architecture} presents a high-level overview of a typical system.

 \begin{figure}
 \includegraphics[%
@@ -414,17 +410,18 @@ reacquired during recovery, the redo phase of the recovery process
 is single threaded. Since latches acquired by the wrapper function
 are held while the log entry and page are updated, the ordering of
 the log entries and page updates associated with a particular latch
-must be consistent. Because undo occurs during normal operation, 
+will be consistent. Because undo occurs during normal operation, 
 some care must be taken to ensure that undo operations obtain the 
 proper latches.


 \subsection{Recovery}

+In this section, we present the details of crash recovery, user-defined logging, and atomic actions that commit even if their enclosing transaction aborts.

 \subsubsection{ANALYSIS / REDO / UNDO}

-Recovery in AIRES consists of three stages, analysis, redo and undo. 
+Recovery in ARIES consists of three stages, analysis, redo and undo. 
 The first, analysis, is
 implemented by LLADD, but will not be discussed in this
 paper. The second, redo, ensures that each redo entry in the log 
@@ -467,20 +464,19 @@ exactly as they would be during normal operation.
 \subsubsection{Physical, Logical and Physiological Logging.}

 The above discussion avoided the use of some terminology that is common
-in the database literature and which should be presented here. {}``Physical
-loggging'' is the practice of logging physical (byte level) upates
-and the physical (page number) addresses that they are applied to. 
+in the database literature and which should be presented here. ``Physical
+logging'' is the practice of logging physical (byte-level) updates
+and the physical (page number) addresses to which they are applied.

-It is subtly different than {}``physiological logging,'' which is
+It is subtly different than ``physiological logging,'' which is
 what LLADD recommends for its redo records. In physiological logging,
-the physical (page number) address is stored, but the byte offset
+the physical address (page number) is stored, but the byte offset
 and the actual difference are stored implicitly in the parameters
 of some function. When the parameters are applied to the function,
 it will update the page in a way that preserves application semantics.
-This allows for some convenient optimizations. For example, data within
+The common use for this is {\em slotted pages}, which use a level of indirection to allow records to be rearranged on the page; redo operations use the index as the parameter rather than the page offset. For example, data within
 a single page can be re-arranged at runtime to produce contiguous
-regions of free space, or the parameters passed to the function may
-be significantly smaller than the physical change made to the page. 
+regions of free space. LLADD generalizes this model; for example, the parameters passed to the function may be significantly smaller than the physical change made to the page. 

 {}``Logical logging'' can only be used for undo entries in LLADD,
 and is identical to physiological logging, except that it stores a
@@ -565,7 +561,7 @@ such a tool could easily be applied to existing LLADD operations.

 Note that the ARIES algorithm is extremely complex, and we have left
 out most of the details needed to understand how ARIES works, or to 
-implement it correctly.\footnote{The original ARIES paper was around 70 pages, and the ARIES/IM paper, which covered index implementation is roughly the same length.}  Yet, we believe we have covered everything that a programmer needs
+implement it correctly.\footnote{The original ARIES paper is around 70 pages, and the ARIES/IM paper~\cite{aries-IM}, which covers index implementation, is roughly the same length.}  Yet, we believe we have covered everything that a programmer needs
 to know in order to implement new data structures using the 
 functionality that ARIES provides. This was possible due to the encapsulation
 of the ARIES algorithm inside of LLADD, which is the feature that
@@ -612,12 +608,11 @@ is able to efficiently support interesting data structures.
 Despite the complexity of the interactions between its modules, the
 basic ARIES algorithm itself is quite simple. Therefore, in order to
 keep LLADD simple, we started with a set of modules, and iteratively
-refined the boundaries between these modules. Figure
-\ref{cap:LLADD-Architecture} presents the resulting architecture.  The
+refined the boundaries between these modules. Figure~\ref{cap:LLADD-Architecture} presents the resulting architecture.  The
 core of the LLADD library is quite small at 2218 lines of code, 2155
 lines of implementations of operations and other extensions, and 408
-lines of installable header files.\footnote{generated using David
-A. Wheeler's ``SLOCCount''} The code has been documented extensively,
+lines of installable header files.\footnote{These counts were generated using David
+A. Wheeler's {\tt SLOCCount}.} The code has been documented extensively,
 and we hope that we have exposed most of the subtle interactions
 between internal modules in the online documentation.

@@ -644,9 +639,9 @@ we would like to support transactional access to resources beyond
 simple page files. Some examples include transactional updates of
 multiple files on disk, transactional groups of program executions
 or network requests, or even leveraging some of the advances being
-made in the Linux and other modern operating system kernels. For example,
-ReiserFS recently added support for atomic file system operations.
-This could be used to provide variable sized pages
+made in the Linux and other modern OS kernels. For example,
+ReiserFS recently added support for atomic file-system operations.
+This could be used to provide variable-sized pages
 to LLADD.  Combining these ideas should make it easy to 
 implement some interesting applications, and to improve existing 
 systems such as CVS, IMAP, and a host of ``simple'' desktop applications.