diff --git a/doc/paper/LLADD-Freenix.pdf b/doc/paper/LLADD-Freenix.pdf index 410d738..ab51755 100644 Binary files a/doc/paper/LLADD-Freenix.pdf and b/doc/paper/LLADD-Freenix.pdf differ diff --git a/doc/paper/LLADD-Freenix.tex b/doc/paper/LLADD-Freenix.tex index dff1665..0c24465 100644 --- a/doc/paper/LLADD-Freenix.tex +++ b/doc/paper/LLADD-Freenix.tex @@ -32,26 +32,24 @@ \title{\Large \bf LLADD: Extensible Transactional Storage FIXME} \author{ -Your N.\ Here \\ -%{\em Your Department} \\ -{\em Your Institution}\\ -%{\em Your City, State, ZIP}\\ +Russell Sears and Eric Brewer\\ +{\em UC Berkeley}\\ % is there a standard format for email/URLs?? % remember that ~ doesn't do what you expect, use \~{}. -{\normalsize yourname@host.site.dom, http://host.site.dom/yoururl} \\ +{\normalsize \{sears,brewer\}@cs.berkeley.edu, http://lladd.sourceforge.net} \\ % % copy the following lines to add more authors - \smallskip - Name Two Here \\ -{\em Two's Institution}\\ +% \smallskip +% Name Two Here \\ +%{\em Two's Institution}\\ %% is there a standard format for email/URLs?? -{\normalsize two@host.site.dom, http://host.site.dom/twourl} +%{\normalsize two@host.site.dom, http://host.site.dom/twourl} % } % end author \maketitle -\thispagestyle{empty} +\thispagestyle{plain} \subsection*{Abstract} @@ -88,23 +86,22 @@ more direct management of data, LLADD offers a layered architecture that enables simple but robust data management.\footnote{A large class of such applications are deemed ``navigational'' in the database vocabulary, as they directly navigate data structures rather than -perform set operations. 
We also believe that LLADD is applicable in -the context of new, special purpose database systems (XML databases, -streaming databases, database/semantic file systems, etc), which is a -fruitful area of current work both within the database research -community and in industry.} +perform set operations.} +We also believe that LLADD is applicable in +the context of new, special-purpose database systems such as XML databases, +streaming databases, and database/semantic file systems. These form a +fruitful area of current research, but existing monolithic database systems tend to be a poor fit for these new areas. -The basic approach of LLADD, taken from ARIES [xx], is to build +The basic approach of LLADD, taken from ARIES~\cite{aries}, is to build \emph{transactional pages}, which enables recovery on a page-by-page basis, despite support for high concurrency and the minimization of dish seeks during commit (by using a log). We show how to build a variety of useful data managers on top of this layer, including persistent -hash tables, lightweight recoverable virtual memory, and simple +hash tables, lightweight recoverable virtual memory~\cite{lrvm}, and simple databases. We also cover the details of crash recovery, application-level support for transaction abort and commit, and basic latching for multithreaded applications. - -We also discuss the shortcomings of common applications , and explain +Finally, we also discuss the shortcomings of common applications, and explain why LLADD provides an appropriate solution to these problems. %[more coverage of kinds of apps? imap, lrvm, cht, file system, database] @@ -113,19 +110,19 @@ Many implementations of transactional pages exist in industry and in the literature. Unfortunately, these algorithms tend either to be straightforward and unsuitable for real-world deployment, or are robust and scalable, but achieve these properties by relying upon -intricate sets of internal (and often implicit) interactions. 
The -ARIES algorithm falls into the second category, has been extremely +intricate sets of internal and often implicit interactions. The +ARIES algorithm falls into the second category, and has been extremely sucessful as part of the IBM DB2 database system. It provides performance and reliability that is comparable to that of current commercial and open-source products. Unfortunately, while the algorithm is conceptually simple, many subtlties arise in its implementation. We chose ARIES as the basis of LLADD, and have made a significant effort to document these interactions. Although a complete discussion -of the AIRES algorithm is beyond the scope of this paper, we will +of the ARIES algorithm is beyond the scope of this paper, we will provide a breif overview, and explain the details that are relevant to developers that wish to extend LLADD. -By documenting the interface between AIRES and higher-level primitives +By documenting the interface between ARIES and higher-level primitives such as data structures, and by structuring LLADD to make this interface explicit in both the library and its extensions, we hope to make it easy to produce correct and efficient durable data @@ -141,7 +138,7 @@ modules that ``do one thing and do it well'', we believe that LLADD can provide competitive performance while making future improvements to its core implementation significantly easier. In order to achieve this goal, LLADD has been split into a number of modules forming a -'core library', and a number of extensions called 'operations' that +{\em core library}, and a number of extensions called {\em operations} that build upon the core library. Since each of these modules exports a stable interface, they can be independently improved. @@ -151,8 +148,7 @@ stable interface, they can be independently improved. An extensive amount of prior work covers the algorithms presented in this paper. 
Most fundamentally, systems that provide transactional consistency to their users generally include a number of common -modules. A high-level overview of a typical system is given in Figure -\ref{cap:DB-Architecture}. +modules. Figure~\ref{cap:DB-Architecture} presents a high-level overview of a typical system. \begin{figure} \includegraphics[% @@ -414,17 +410,18 @@ reacquired during recovery, the redo phase of the recovery process is single threaded. Since latches acquired by the wrapper function are held while the log entry and page are updated, the ordering of the log entries and page updates associated with a particular latch -must be consistent. Because undo occurs during normal operation, +will be consistent. Because undo occurs during normal operation, some care must be taken to ensure that undo operations obtain the proper latches. \subsection{Recovery} +In this section, we present the details of crash recovery, user-defined logging, and atomic actions that commit even if their enclosing transaction aborts. \subsubsection{ANALYSIS / REDO / UNDO} -Recovery in AIRES consists of three stages, analysis, redo and undo. +Recovery in ARIES consists of three stages, analysis, redo and undo. The first, analysis, is implemented by LLADD, but will not be discussed in this paper. The second, redo, ensures that each redo entry in the log @@ -467,20 +464,19 @@ exactly as they would be during normal operation. \subsubsection{Physical, Logical and Phisiological Logging.} The above discussion avoided the use of some terminology that is common -in the database literature and which should be presented here. {}``Physical -loggging'' is the practice of logging physical (byte level) upates -and the physical (page number) addresses that they are applied to. +in the database literature and which should be presented here. ``Physical +logging'' is the practice of logging physical (byte-level) updates +and the physical (page number) addresses to which they are applied. 
-It is subtly different than {}``physiological logging,'' which is +It is subtly different than ``physiological logging,'' which is what LLADD recommends for its redo records. In physiological logging, -the physical (page number) address is stored, but the byte offset +the physical address (page number) is stored, but the byte offset and the actual difference are stored implicitly in the parameters of some function. When the parameters are applied to the function, it will update the page in a way that preserves application semantics. -This allows for some convenient optimizations. For example, data within +The common use for this is {\em slotted pages}, which use a level of indirection to allow records to be rearranged on the page; redo operations use the index as the parameter rather than the page offset. For example, data within a single page can be re-arranged at runtime to produce contiguous -regions of free space, or the parameters passed to the function may -be significantly smaller than the physical change made to the page. +regions of free space. LLADD generalizes this model; for example, the parameters passed to the function may be significantly smaller than the physical change made to the page. {}``Logical logging'' can only be used for undo entries in LLADD, and is identical to physiological logging, except that it stores a @@ -565,7 +561,7 @@ such a tool could easily be applied to existing LLADD operations. 
Note that the ARIES algorithm is extremely complex, and we have left out most of the details needed to understand how ARIES works, or to -implement it correctly.\footnote{The original ARIES paper was around 70 pages, and the ARIES/IM paper, which covered index implementation is roughly the same length.} Yet, we believe we have covered everything that a programmer needs +implement it correctly.\footnote{The original ARIES paper is around 70 pages, and the ARIES/IM paper~\cite{aries-IM}, which covers index implementation, is roughly the same length.} Yet, we believe we have covered everything that a programmer needs to know in order to implement new data structures using the functionality that ARIES provides. This was possible due to the encapsulation of the ARIES algorithm inside of LLADD, which is the feature that @@ -612,12 +608,11 @@ is able to efficiently support interesting data structures. Despite the complexity of the interactions between its modules, the basic ARIES algorithm itself is quite simple. Therefore, in order to keep LLADD simple, we started with a set of modules, and iteratively -refined the boundaries between these modules. Figure -\ref{cap:LLADD-Architecture} presents the resulting architecture. The +refined the boundaries between these modules. Figure~\ref{cap:LLADD-Architecture} presents the resulting architecture. The core of the LLADD library is quite small at 2218 lines of code, 2155 lines of implementations of operations and other extensions, and 408 -lines of installable header files.\footnote{generated using David -A. Wheeler's ``SLOCCount''} The code has been documented extensively, +lines of installable header files.\footnote{These counts were generated using David +A. Wheeler's {\tt SLOCCount}.} The code has been documented extensively, and we hope that we have exposed most of the subtle interactions between internal modules in the online documentation. 
@@ -644,9 +639,9 @@ we would like to support transactional access to resources beyond simple page files. Some examples include transactional updates of multiple files on disk, transactional groups of program executions or network requests, or even leveraging some of the advances being -made in the Linux and other modern operating system kernels. For example, -ReiserFS recently added support for atomic file system operations. -This could be used to provide variable sized pages +made in the Linux and other modern OS kernels. For example, +ReiserFS recently added support for atomic file-system operations. +This could be used to provide variable-sized pages to LLADD. Combining these ideas should make it easy to implement some interesting applications, and to improve existing systems such as CVS, IMAP, and a host of ``simple'' desktop applications.