Fixed a few easy things based on reviewer feedback.

2006-07-17 23:48:30 +00:00
commit bf8b230bbd
parent bf98e32c73
2 changed files with 110 additions and 52 deletions
--- a/doc/paper3/LLADD.bib
+++ b/doc/paper3/LLADD.bib
@ -405,8 +405,8 @@
 }
@InProceedings{lfs,
-  author = 	 {The Design and Implementation of a Log-Structured File System},
+  title = 	 {The Design and Implementation of a Log-Structured File System},
-  title = 	 {Mendel Rosenblum and John K. Ousterhout},
+  author = 	 {Mendel Rosenblum and John K. Ousterhout},
  OPTcrossref =  {},
  OPTkey = 	 {},
  booktitle = {Proceedings of the 13th ACM Symposium on Operating Systems Principles},
--- a/doc/paper3/LLADD.tex
+++ b/doc/paper3/LLADD.tex
@ -30,8 +30,9 @@
 \newcommand{\yads}{Stasys'\xspace}
 \newcommand{\oasys}{Oasys\xspace}
-%\newcommand{\eab}[1]{\textcolor{red}{\bf EAB: #1}}
+\newcommand{\diff}[1]{\textcolor{blue}{\bf #1}}
-%\newcommand{\rcs}[1]{\textcolor{green}{\bf RCS: #1}}
+\newcommand{\eab}[1]{\textcolor{red}{\bf EAB: #1}}
 \newcommand{\rcs}[1]{\textcolor{green}{\bf RCS: #1}}
 %\newcommand{\mjd}[1]{\textcolor{blue}{\bf MJD: #1}}
 \newcommand{\eat}[1]{}
@ -261,10 +262,9 @@ routines into two broad modules: {\em conceptual
 mappings}~\cite{batoryConceptual} and {\em physical
 database models}~\cite{batoryPhysical}.
-A conceptual mapping might translate a relation into a set of keyed
+%A physical model would then translate a set of tuples into an
-tuples.  A physical model would then translate a set of tuples into an
+%on-disk B-Tree, and provide support for iterators and range-based query
-on-disk B-Tree, and provide support for iterators and range-based query
+%operations.
 operations.
 It is the responsibility of a database implementor to choose a set of
 conceptual mappings that implement the desired higher-level
@ -272,8 +272,19 @@ abstraction (such as the relational model).  The physical data model
 is chosen to efficiently support the set of mappings that are built on
 top of it.
 \diff{A conceptual mapping based on the relational model might
 translate a relation into a set of keyed tuples.  If the database were
 going to be used for short, write-intensive and high-concurrency
 transactions (OLTP), the physical model would probably translate sets
 of tuples into an on-disk B-Tree.  In contrast, if the database needed
 to support long-running, read-only aggregation queries (OLAP), a
 physical model tuned for such queries\rcs{be more concrete here} would
 be more appropriate.  While both OLTP and OLAP databases are based
 upon the relational model, they make use of different physical models
 in order to serve different classes of applications.}
 A key observation of this paper is that no known physical data model
-can support more than a small percentage of today's applications.
+can efficiently support more than a small percentage of today's applications.  
 Instead of attempting to create such a model after decades of database
 research has failed to produce one, we opt to provide a transactional
@ -515,7 +526,7 @@ redo the lost updates during recovery.
 For this to work, recovery must be able to decide which updates to
 re-apply.  This is solved by using a per-page sequence number called a
-{\em log sequence number}. Each log entry contains the sequence
+{\em log sequence number \diff{(LSN)}}. Each log entry contains the sequence
 number, and each page contains the sequence number of the last applied
 update.  Thus on recovery, we load a page, look at its sequence
 number, and re-apply all later updates.  Similarly, to restore a page
@ -712,24 +723,45 @@ commit even if their containing transaction aborts; thus follow-on
 transactions can use the data structure without fear of cascading
 aborts.  
-The key idea is to distinguish between the logical operations of a
+The key idea is to distinguish between the {\em logical operations} of a
-data structure, such as inserting a key, and the physical operations
+data structure, such as inserting a key, and the {\em physical operations}
 such as splitting tree nodes or rebalancing a tree. The physical
 operations do not need to be undone if the containing logical operation
-(insert) aborts.
+(insert) aborts.  \diff{We record such operations using {\em logical 
 logging} and {\em physical logging}, respectively.}
-Because nested top actions are easy to use and do not lead to 
+\diff{Each nested top action performs a single logical operation by applying
-deadlock, we wrote a simple \yad extension that
+a number of physical operations to the page file.  Physical REDO log
-implements nested top actions.  The extension may be used as follows:
+entries are stored in the log so that recovery can repair any
 temporary inconsistency that the nested top action introduces.
 Logical UNDO entries are recorded so that the nested top action can be
 rolled back even if concurrent transactions manipulate the data
 structure.  Finally, physical UNDO entries are recorded so that 
 the nested top action may be rolled back if the system crashes before 
 it completes.}
 \diff{When making use of nested top actions, we think of them as a
 special type of latch that hides temporary inconsistencies from the
 procedures executed during recovery.  Generally, such inconsistencies
 must be hidden from other transactions in a multithreaded environment;
 therefore we usually protect nested top actions with a mutex.}
 \diff{This observation leads to the following mechanical conversion of
 non-concurrent operations to thread-safe code that handles concurrent
 transactions correctly:}
 %Because nested top actions are easy to use and do not lead to 
 %deadlock, we wrote a simple \yad extension that
 %implements nested top actions.  The extension may be used as follows:
 \begin{enumerate}
 \item Wrap a mutex around each operation.  With care, it may be possible to use finer-grained locks, but it is rarely necessary.
 \item Define a {\em logical} UNDO for each operation (rather than just using
  a set of page-level UNDO's).  For example, this is easy for a
  hashtable: the UNDO for {\em insert} is {\em remove}.
-\item For mutating operations, (not read-only), add a ``begin nested
+\item Add a ``begin nested
  top action'' right after the mutex acquisition, and a ``commit
-  nested top action'' right before the mutex is released.
+  nested top action'' right before the mutex is released.  \diff{\yad provides a default nested top action implementation as an extension.}
 \end{enumerate}
 \noindent If the transaction that encloses the operation aborts, the logical
@ -755,30 +787,32 @@ then they would not be written atomically with their page, which
 defeats their purpose.
 LSNs were introduced to prevent recovery from applying updates more
-than once. However, by constraining itself to a special type of idempotent redo and undo
+than once.  \diff{However, \yad can eliminate the LSN on each page by
-entries,\endnote{Idempotency does not guarantee that $f(g(x)) =
+constraining itself to deterministic REDO log entries that do not read
-  f(g(f(g(x))))$.  Therefore, idempotency does not guarantee that it is safe
+the contents of the page they update.}
-  to assume that a page is older than it is.}
+
-\yad can eliminate the LSN on each page.
+%However, by constraining itself to a special type of idempotent redo and undo
 %entries,\endnote{Idempotency does not guarantee that $f(g(x)) =
 %  f(g(f(g(x))))$.  Therefore, idempotency does not guarantee that it is safe
 %  to assume that a page is older than it is.}
 %\yad can eliminate the LSN on each page.
 Consider purely physical logging operations that overwrite a fixed
 byte range on the page regardless of the page's initial state.  
 We say that such operations perform ``blind writes.''
 If all
 operations that modify a page have this property, then we can remove
-the LSN field, and have recovery conservatively assume that it is
+the LSN field, and have recovery \diff{use a conservative estimate 
-dealing with a version of the page that is at least as old as the one
+of the LSN of each page that it is dealing with.}
 on disk.  
-\eat{
+\diff{For example, it 
-This allows non-idempotent operations to be implemented.  For
+could use the LSN of the most recent truncation point in the log, 
-example, a log entry could simply tell recovery to increment a value
+or during normal operation, \yad could occasionally write the 
-on a page by some value, or to allocate a new record on the page.  
+LSN of the oldest dirty page to the log.}
-If the recovery algorithm did not know exactly which
+
-version of a page it is dealing with, the operation could
+% conservatively assume that it is
-inadvertently be applied more than once, incrementing the value twice,
+%dealing with a version of the page that is at least as old as the one
-or double allocating a record.
+%on disk.  
 }
 To understand why this works, note that the log entries
 update some subset of the bits on the page.  If the log entries do not
@ -803,14 +837,31 @@ log entry is thus a conservative but close estimate.
 Section~\ref{sec:zeroCopy} explains how LSN-free pages led us to new 
 approaches for recoverable virtual memory and for large object storage.  
 Section~\ref{sec:oasys} uses blind writes to efficiently update records 
-on pages that are manipulated using more general operations.
+on pages that are manipulated using more general operations.  \diff{We 
 have not yet implemented LSN-free pages, so our experimental setup mimics
 their behavior.}
 \diff{Also note that while LSN-free pages assume that only bits that
 are being updated will change, they do not assume that disk writes are
 atomic.  Most disks do not atomically update more than a single 512-byte
 sector at a time.  However, most database systems make use of pages
 that are larger than 512 bytes.  Recovery schemes that rely upon LSN
 fields in pages must detect and deal with torn pages
 directly~\cite{tornPageStuffMohan}.  Because LSN-free page recovery
 does not assume page writes are atomic, it handles torn pages with no
 extra effort.}
 \subsection{Media recovery}
-Like ARIES, \yad can recover lost pages in the page file by
+\diff{Hard drives may lose data due to hardware failures, or because a
-reinitializing the page to zero, and playing back the entire log.  In
+sector is being written when power is lost.  The drive hardware stores a
-practice, a system administrator would periodically back up the page file
+checksum with each sector, and will issue a read error if the checksum
-up, thus enabling log truncation and shortening recovery time.
+does not match~\cite{something}.}  Like ARIES, \yad can recover lost pages in the page
 file by reinitializing the page to zero, and playing back the entire
 log.  In practice, a system administrator would periodically back up
 the page file, thus enabling log truncation and shortening recovery
 time.
 \eat{  This is pretty redundant.
 \subsection{Modular operations semantics}
@ -917,8 +968,8 @@ appropriate.
 \yad allows application developers to easily add new operations to the
 system.  Many of the customizations described below can be implemented
 using custom log operations.  In this section, we describe how to implement an
-``ARIES style'' concurrent, steal/no force operation using 
+``ARIES style'' concurrent, steal/no-force operation using 
-full physiological logging and per-page LSN's.
+\diff{physical redo, logical undo} and per-page LSN's.
 Such operations are typical of high-performance commercial database
 engines.
@ -1283,10 +1334,14 @@ Database optimizers operate over relational algebra expressions that
 correspond to logical operations over streams of data.  \yad
 does not provide query languages, relational algebra, or other such query processing primitives.  
-However, it does include an extensible logging infrastructure.  Furthermore, many
+However, it does include an extensible logging infrastructure.
-operations that make use of physiological logging implicitly
+Furthermore, \diff{most operations that support concurrent transactions already
-implement UNDO (and often REDO) functions that interpret logical
+provide logical UNDO (and therefore logical REDO, if each operation has an
-requests.
+inverse).}
 %many
 %operations that make use of physiological logging implicitly
 %implement UNDO (and often REDO) functions that interpret logical
 %requests.
 Logical operations often have some nice properties that this section
 will exploit.  Because they can be invoked at arbitrary times in the
@ -1314,8 +1369,9 @@ in non-transactional memory.
 %entries.  Therefore, applications may need to implement custom
 %operations to make use of the ideas in this section.
-Although \yad has rudimentary support for a two-phase commit based
+%Although \yad has rudimentary support for a \diff{cluster hash table\cite{cht}} that uses 
-cluster hash table, we have not yet implemented networking primitives for logical logs.
+%two-phase commit to recover from node crashes}, we have not yet implemented networking primitives for logical logs.
 \rcs{Cut sentence about two-phase commit cluster hash table, networking primitives for logical logs.}
 Therefore, we implemented a single node log-reordering scheme that increases request locality
 during the traversal of a random graph.  The graph traversal system
 takes a sequence of (read) requests, and partitions them using some
@ -1364,12 +1420,14 @@ algorithm's outperforms the naive traversal.
 \subsection{LSN-Free pages}
 \label{sec:zeroCopy}
 In Section~\ref{sec:blindWrites}, we describe how operations can avoid recording
-LSN's on the pages they modify.  Essentially, operations that make use
+LSN's on the pages they modify.  Essentially, operations that update pages \diff{without examining their contents}
-of purely physical logging need not heed page boundaries, as
+% make use of purely physical logging 
-physiological operations must.  Recall that purely physical logging
+need not heed page boundaries.
 %, as physiological operations must.  
 Recall that purely physical logging
 interacts poorly with concurrent transactions that modify the same
 data structures or pages, so LSN-Free pages are not applicable in all
-situations.
+situations. \rcs{I think we can support physiological logging; once REDO is done, we know the LSN.  Why not do logical UNDO?}
 Consider the retrieval of a large (page spanning) object stored on
 pages that contain LSN's.  The object's data will not be contiguous.