Merged conflicts, update to section 4.
parent
8cf5d11c21
commit
630112937b
1 changed files with 35 additions and 35 deletions
|
@ -379,10 +379,6 @@ the page metadata appropriately.
|
|||
|
||||
\subsubsection{Log entries and forward operation (the Tupdate() function)\label{sub:Tupdate}}
|
||||
|
||||
[TODO...need to make this clearer... I think we need to say that we define a function to do redo, and then we define an update that use
|
||||
it. Recovery uses the same function the same way.]
|
||||
|
||||
|
||||
In order to handle crashes correctly, and in order to undo the
|
||||
effects of aborted transactions, LLADD provides operation implementors
|
||||
with a mechanism to log undo and redo information for their actions.
|
||||
|
@ -451,33 +447,29 @@ behave correctly even if an arbitrary number of intervening operations
|
|||
are performed on the data structure.
|
||||
|
||||
[TODO...this next paragraph doesn't make sense; also maybe move this whole subsection to later, since it is complicated]
|
||||
The remaining log entries are redo-only, and may perform structural
|
||||
Next, the operation writes one or more redo-only log entries that may perform structural
|
||||
modifications to the data structure. They should not make any assumptions
|
||||
about the consistency of the current version of the database. Finally,
|
||||
any prefix of the sequence of the redo-only operations performed by
|
||||
this operation must leave the database in a consistent state. The
|
||||
$B^{LINK}$ tree {[}...{]} is an example of a B-Tree implementation
|
||||
that behaves in this way, as is the linear hash table implementation
|
||||
discussed in Section \ref{sub:Linear-Hash-Table}.
|
||||
|
||||
Some of the logging constraints introduced in this section may seem
|
||||
strange at this point, but are motivated by the recovery process.
|
||||
|
||||
[TODO...need to explain this...]
|
||||
that behaves in this way, while the linear hash table implementation
|
||||
discussed in Section \ref{sub:Linear-Hash-Table} is a scalable
|
||||
hash table that meets these constraints.
|
||||
|
||||
\subsection{Recovery}
|
||||
|
||||
|
||||
\subsubsection{ANALYSIS / REDO / UNDO}
|
||||
|
||||
Recovery in ARIES consists of three stages, analysis, redo and undo
|
||||
. The first, analysis, is
|
||||
Recovery in ARIES consists of three stages, analysis, redo and undo.
|
||||
The first, analysis, is
|
||||
implemented by LLADD, but will not be discussed in this
|
||||
paper. The second, redo, ensures that each redo entry in the log
|
||||
will have been applied to each page in the page file exactly once.
|
||||
The third phase, undo, rolls back any transactions that were active
|
||||
when the crash occurred, as though the application manually aborted
|
||||
them with the {}``abort()'' call.
|
||||
them with the {}``abort'' function call.
|
||||
|
||||
After the analysis phase, the on-disk version of the page file
|
||||
is in the same state it was in when LLADD crashed. This means that
|
||||
|
@ -496,7 +488,7 @@ page are in an inconsistent state. Therefore, as the redo phase re-applies
|
|||
Therefore, the redo information for each operation in the log
|
||||
must contain the physical address (page number) of the information
|
||||
that it modifies, and the portion of the operation executed by a single
|
||||
log entry must only rely upon the contents of the page that the log
|
||||
redo log entry must only rely upon the contents of the page that the log
|
||||
entry refers to. Since we assume that pages are propagated to disk
|
||||
atomically, the REDO phase may rely upon information contained within
|
||||
a single page.
|
||||
|
@ -506,7 +498,7 @@ complete entries for all committed transactions. Therefore, we know that the pa
|
|||
a physically consistent state, although it contains portions of the
|
||||
results of uncommitted transactions. The final stage of recovery is
|
||||
the undo phase, which simply aborts all uncommitted transactions. Since
|
||||
the page file is physically consistent, the transactions are aborted
|
||||
the page file is physically consistent, the transactions may be aborted
|
||||
exactly as they would be during normal operation.
|
||||
|
||||
|
||||
|
@ -556,7 +548,7 @@ and highly-concurrent data structure using LLADD:
|
|||
the page that the redo function sees, then the wrapper should latch
|
||||
the relevant data.
|
||||
\item Redo operations should address pages by their physical offset,
|
||||
while Undo operations should use a more permenant address (such as
|
||||
while Undo operations should use a more permanent address (such as
|
||||
index key) if the data may move between pages over time.
|
||||
\item An undo operation must correctly update a data structure if any
|
||||
prefix of its corresponding redo operations is applied to the
|
||||
|
@ -567,12 +559,13 @@ Because undo and redo operations during normal operation and recovery
|
|||
are similar, most bugs will be found with conventional testing
|
||||
strategies. It is difficult to verify the final property, although a
|
||||
number of tools could be written to simulate various crash scenarios,
|
||||
and check the behavior of operations under these scenarios.
|
||||
and check the behavior of operations under these scenarios. Of course,
|
||||
such a tool could easily be applied to existing LLADD operations.
|
||||
|
||||
Note that the ARIES algorithm is extremely complex, and we have left
|
||||
out most of the details needed to understand how ARIES works, or to
|
||||
implement it correctly.\footnote{The original ARIES paper was around 70 pages, and the ARIES/IM paper, which covered index implementation, is roughly the same length.} Yet, we believe we have covered everything that a programmer needs
|
||||
to know in order to implement new data structures using the basic
|
||||
to know in order to implement new data structures using the
|
||||
functionality that ARIES provides. This was possible due to the encapsulation
|
||||
of the ARIES algorithm inside of LLADD, which is the feature that
|
||||
most strongly differentiates LLADD from other, similar libraries.
|
||||
|
@ -596,7 +589,8 @@ it easy to improve and customize LLADD.}
|
|||
\end{figure}
|
||||
LLADD is a toolkit for building transaction managers.
|
||||
It provides user-defined redo and undo behavior, and has an extendible
|
||||
logging system with ... types of log entries so far. Most of these
|
||||
logging system with 19 types of log entries so far (not counting those
|
||||
internal to LLADD, such as ``begin'', ``abort'', and ``clr''). Most of these
|
||||
extensions deal with data layout or modification, but some deal with
|
||||
other aspects of LLADD, such as extensions to recovery semantics (Section
|
||||
\ref{sub:Two-Phase-Commit}). LLADD comes with some default page layout
|
||||
|
@ -609,25 +603,29 @@ Although it ships with basic operations that support variable length
|
|||
records, hash tables and other common data types, our goal is to
|
||||
decouple all decisions regarding data format from the implementation
|
||||
of the logging and recovery systems. Therefore, the preceding section
|
||||
is essentially documentation for potential users of the library, while
|
||||
is essentially documentation for users of the library, while
|
||||
the purpose of the performance numbers in our evaluation section is
|
||||
not to validate our hash table, but to show that the underlying architecture
|
||||
is able to efficiently support interesting data structures.
|
||||
|
||||
Despite the complexity of the interactions among its modules, the
|
||||
basic ARIES algorithm itself is quite simple. Therefore, in order to keep
|
||||
LLADD simple, we started with a set of modules, and iteratively refined
|
||||
the boundaries among these modules. Figure \ref{cap:LLADD-Architecture} presents the resulting architecture. The core of the LLADD library
|
||||
is quite small at ... lines of code, and has been documented extensively.
|
||||
We hope that we have exposed most of the subtle interactions between
|
||||
internal modules in the online documentation. {[}... doxygen ...{]}
|
||||
Despite the complexity of the interactions between its modules, the
|
||||
basic ARIES algorithm itself is quite simple. Therefore, in order to
|
||||
keep LLADD simple, we started with a set of modules, and iteratively
|
||||
refined the boundaries between these modules. Figure
|
||||
\ref{cap:LLADD-Architecture} presents the resulting architecture. The
|
||||
core of the LLADD library is quite small at 2218 lines of code, 2155
|
||||
lines of implementations of operations and other extensions, and 408
|
||||
lines of installable header files.\footnote{generated using David
|
||||
A. Wheeler's ``SLOCCount''} The code has been documented extensively,
|
||||
and we hope that we have exposed most of the subtle interactions
|
||||
between internal modules in the online documentation.
|
||||
|
||||
As LLADD has evolved, many of its sub-systems have been incrementally
|
||||
improved, and we believe that the current set of modules is amenable
|
||||
to the addition of new functionality. For instance, the logging module
|
||||
interface encapsulates all of the details regarding its on disk format,
|
||||
which would make it straightforward to implement more exotic logging
|
||||
techniques such as using log shipping to maintain a 'warm replica'
|
||||
techniques such as using log shipping to maintain a ``warm replica''
|
||||
for failover purposes, or the use of log replication to avoid physical
|
||||
disk access at commit time. Similarly, the interface encodes the dependencies
|
||||
between the logger and other subsystems, so, for instance, the requirements
|
||||
|
@ -647,9 +645,10 @@ multiple files on disk, transactional groups of program executions
|
|||
or network requests, or even leveraging some of the advances being
|
||||
made in the Linux and other modern operating system kernels. For example,
|
||||
ReiserFS recently added support for atomic file system operations.
|
||||
This could be used to provide atomic variable sized pages
|
||||
to LLADD. Combining some of these ideas should make it easy to
|
||||
implement some interesting applications.
|
||||
This could be used to provide variable sized pages
|
||||
to LLADD. Combining these ideas should make it easy to
|
||||
implement some interesting applications, and to improve existing
|
||||
systems such as CVS, IMAP, and a host of ``simple'' desktop applications.
|
||||
|
||||
From the testing point of view, the advantage of LLADD's division
|
||||
into subsystems with simple interfaces is obvious. We are able to
|
||||
|
@ -659,8 +658,9 @@ making it easy to add new tests and debug old ones. Furthermore, by
|
|||
adding a 'simulate crash' operation to a few of the key components,
|
||||
we can simulate application level crashes by clearing LLADD's internal
|
||||
state, re-initializing the library and verifying that recovery was
|
||||
successful. These tests currently cover approximately 90\% of the
|
||||
code. We have not yet developed a mechanism that will allow us to
|
||||
successful. These tests currently cover approximately
|
||||
90\%\footnote{generated using ``gcov'', which is part of gcc, and ``lcov,'' which interprets gcov's output.}
|
||||
of the code. We have not yet developed a mechanism that will allow us to
|
||||
accurately model hardware failures, which is an area where further
|
||||
work is needed. However, the basis for this work will be the development
|
||||
of test harnesses that verify operation behavior in exceptional circumstances.
|
||||
|
|