Merged in some comments, added OLAP reference.

2006-08-02 19:34:01 +00:00 · 2006-08-02 19:34:01 +00:00 · 7e5825aa74
commit 7e5825aa74
parent b5ce838df0
2 changed files with 102 additions and 57 deletions
--- a/doc/paper3/LLADD.bib
+++ b/doc/paper3/LLADD.bib
@ -75,6 +75,28 @@
  OPTannote = 	 {}
 }
@InProceedings{molap,
  author = 	 {Yihong Zhao and Prasad M. Deshpande and Jeffrey F. Naughton},
  title = 	 {An Array-Based Algorithm for Simultaneous Multidimensional Aggregates},
  OPTcrossref =  {},
  OPTkey = 	 {},
  booktitle = {Proceedings of SIGMOD},
  pages = 	 {159-170},
  year = 	 {1997},
  OPTeditor = 	 {},
  OPTvolume = 	 {},
  OPTnumber = 	 {},
  OPTseries = 	 {},
  OPTaddress = 	 {},
  OPTmonth = 	 {},
  OPTorganization = {},
  OPTpublisher = {},
  OPTnote = 	 {},
  OPTannote = 	 {}
 }
@Misc{hibernate,
  key = 	 {hibernate},
  OPTauthor = 	 {},
--- a/doc/paper3/LLADD.tex
+++ b/doc/paper3/LLADD.tex
@ -276,9 +276,9 @@ translate a relation into a set of keyed tuples.  If the database were
 going to be used for short, write-intensive and high-concurrency
 transactions (OLTP), the physical model would probably translate sets
 of tuples into an on-disk B-Tree.  In contrast, if the database needed
-to support long-running, read only aggregation queries (OLAP), a
+to support long-running, read-only aggregation queries (OLAP) over
-physical model tuned for such queries\rcs{be more concrete here} would
+high-dimensional data, a physical model that stores the data in a sparse array format would
-be more appropriate.  While both OLTP and OLAP databases are based
+be more appropriate~\cite{molap}.  While both OLTP and OLAP databases are based
 upon the relational model they make use of different physical models
 in order to serve different classes of applications.}
@ -481,8 +481,14 @@ may reorder writes on sector boundaries, causing an arbitrary subset
 of a page's sectors to be updated during a crash.  
 {\em Torn page detection} can be used to detect this phenomenon.  Torn
-and corrupted pages may be recovered by restoring the page from
+and corrupted pages may be recovered by using {\em media recovery} to
-backup.  For simplicity, this section ignores mechanisms that detect
+restore the page from backup.  Media recovery works by reinitializing
 the page to zero, and playing back the REDO entries in the log that
 modify the page.  In practice, a system administrator would
 periodically back up the page file, thus enabling log truncation and
 shortening recovery time.
 For simplicity, this section ignores mechanisms that detect
 and restore torn pages, and assumes that page writes are atomic.
 While the techniques described in this section rely on the ability to
 atomically update disk pages, this restriction is relaxed by other 
@ -491,21 +497,47 @@ recovery mechanisms.
 \subsubsection{Extending \yad with new operations}
-Figure~\ref{fig:structure} shows how custom operations interact with
+Figure~\ref{fig:structure} shows how operations interact with \yad.  A
-\yad.  If an application does not need to make use of concurrent
+number of default operations come with \yad.  These include operations
 that allocate and manipulate records, operations that implement hash
 tables, and a number of methods that add functionality to recovery.
 If an operation does not need to be used by concurrent
 transactions, directly manipulating the page file is as simple as
-ensuring that each update to the page file occurs inside of an
+ensuring that each update to the page file occurs inside of the
 operation's implementation.  Operation implementations must be invoked
 by registering a callback with \yad at startup, and then calling {\em
-Tupdate()} to invoke the operation at runtime.  Each operation should
+Tupdate()} to invoke the operation at runtime.  
-be deterministic, provide an inverse, and acquire all of its arguments
+
-from a struct that is passed via Tupdate().  (Operations that affect
+Each operation should be deterministic, provide an inverse, and
-more than one page, and ones that do not provide inverses will be
+acquire all of its arguments from a struct that is passed via
-described later.)  The same callbacks are used during forward opertion
+Tupdate() and from the page it updates.  The callbacks that are used
-as during recovery.  Therefore operations provide a single redo
+during forward operation are also used during recovery.  Therefore
-function and a single undo function.  (There is no ``do''
+operations provide a single redo function and a single undo function.
-function.)  This reduces the amount of recovery-specific code in the
+(There is no ``do'' function.)  This reduces the amount of
-system.
+recovery-specific code in the system.  Tupdate() writes the struct
 that is passed to it to the log before invoking the operation's
 implementation.  Recovery simply reads the struct from disk and passes
 it into the operation implementation.
 In this portion of the discussion, operations are limited
 to a single page, and provide an undo function.  Operations that
 affect multiple pages and that do not provide inverses will be
 discussed later.
 Operations are limited to a single page because their results must be
 applied to the page file atomically.  Some operations use the data
 stored on the page to update the page.  If this data were corrupted by
 a non-atomic disk write, then such operations would fail during recovery.
 Note that we could implement a limited form of transactions by
 limiting each transaction to a single operation, and by forcing the
 page that each operation updates to disk in order.  This would not
 require any sort of logging, but is quite inefficient in practice.
 The rest of this section describes how recovery can be extended, first
 to efficiently support multiple operations per transaction, and then
 to allow more than one transaction to modify the same data before
 committing.
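
The single-page operation model described above can be sketched as a toy model. This is an illustrative Python sketch, not \yad's C API: the names ToyStore, Operation, register, update, and recover are hypothetical, and pages are modeled as short integer lists. It assumes the simplest possible recovery, which zeroes every page and replays the whole log, whereas the real Redo phase uses LSN's to skip updates that already reached disk.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

PAGE_SIZE = 8  # toy page: eight integer slots (an assumption for illustration)


@dataclass
class Operation:
    """A registered operation: one redo callback and one undo callback."""
    redo: Callable[[List[int], dict], None]
    undo: Callable[[List[int], dict], None]


@dataclass
class ToyStore:
    """Hypothetical stand-in for the page file, the log, and the operation table."""
    ops: Dict[str, Operation] = field(default_factory=dict)
    log: List[Tuple[str, int, dict]] = field(default_factory=list)
    pages: Dict[int, List[int]] = field(default_factory=dict)

    def register(self, name: str, op: Operation) -> None:
        # Callbacks are registered once, at startup.
        self.ops[name] = op

    def page(self, page_id: int) -> List[int]:
        # Pages that have never been written read as all zeroes.
        return self.pages.setdefault(page_id, [0] * PAGE_SIZE)

    def update(self, name: str, page_id: int, args: dict) -> None:
        # Plays the role of Tupdate(): log the argument struct first,
        # then invoke the same redo callback that recovery will use.
        self.log.append((name, page_id, dict(args)))
        self.ops[name].redo(self.page(page_id), args)

    def recover(self) -> None:
        # Simplified recovery: reinitialize pages to zero and replay the log.
        self.pages = {}
        for name, page_id, args in self.log:
            self.ops[name].redo(self.page(page_id), args)


# A deterministic operation with an inverse. It takes its arguments only
# from the logged struct and from the page it updates.
def add_redo(page: List[int], args: dict) -> None:
    page[args["slot"]] += args["delta"]


def add_undo(page: List[int], args: dict) -> None:
    page[args["slot"]] -= args["delta"]
```

Because update() and recover() call the same redo callback, there is no separate "do" function, and replaying the log reproduces the state reached during forward operation.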
 \subsubsection{\yads Recovery Algorithm}
@ -522,8 +554,8 @@ log forward in time, applying any updates that did not make it to disk
 before the system crashed.  ``Undo'' runs the log backwards in time,
 only applying portions that correspond to aborted transactions.  This
 section only considers physical undo.  Section~\ref{sec:nta} describes
-the distinction between physical and logical undo, and describes
+the distinction between physical and logical undo.
-logical undo.  A summary of the stages of recovery and the invariants
+A summary of the stages of recovery and the invariants
 they establish is presented in Figure~\ref{fig:conventional-recovery}.
 Redo is the only phase that makes use of LSN's stored on pages.
@ -575,7 +607,7 @@ committed.
 \subsection{Concurrent Transactions}
-\diff{Two factors make it more difficult to write operations that may be
+Two factors make it more difficult to write operations that may be
 used in concurrent transactions.  The first is familiar to anyone that
 has written multi-threaded code: Accesses to shared data structures
 must be protected by latches (mutexes).  The second problem stems from
@ -583,20 +615,7 @@ the fact that concurrent transactions prevent abort from simply
 rolling back the physical updates that a transaction made.
 Fortunately, it is straightforward to reduce this second,
 transaction-specific, problem to the familiar problem of writing
-multi-threaded software.}
+multi-threaded software.
 \rcs{This text needs to make the following two points: (1)Multi-page transactions break the
 atomicity assumption because their results are not applied to disk
 atomically.  (2) Concurrent transactions break the assumption that a
 series of physical undos is the inverse of a transaction.  Nested top
 actions restore these two broken invariants, but are orthogonal to the
 mechanisms that apply the atomic updates.}
 \rcs{Work this in too:  Nested top actions work by
 performing physical operations on a data structure, and then
 registering a CLR.  The CLR contains a logical undo entry for the
 operation.  When recovery and abort encounter a CLR they skip the
 physical undo entries, and instead apply the logical undo.}
 To understand the problems that arise with concurrent transactions,
 consider what would happen if one transaction, A, rearranged the
@ -631,15 +650,18 @@ operations do not need to be undone if the containing logical operation
 (insert) aborts.  \diff{We record such operations using {\em logical 
 logging} and {\em physical logging}, respectively.}
-\diff{Each nested top action performs a single logical operation by applying
+\diff{Each nested top action performs a single logical operation by
-a number of physical operations to the page file.  Physical REDO log
+applying a number of physical operations to the page file.  Physical
-entries are stored in the log so that recovery can repair any
+REDO and UNDO log entries are stored in the log so that recovery can
-temporary inconsistency that the nested top action introduces.
+repair any temporary inconsistency that the nested top action
-Logical UNDO entries are recorded so that the nested top action can be
+introduces.  Once the nested top action has completed, a logical UNDO
-rolled back even if concurrent transactions manipulate the data
+entry is recorded, and a CLR is used to tell recovery to ignore the
-structure.  Finally, physical UNDO entries are recorded so that 
+physical UNDO entries.  The logical UNDO can be safely applied even if
-the nested top action may be rolled back if the system crashes before 
+concurrent transactions manipulate the data structure, and physical
-it completes.}
+UNDO can safely roll back incomplete attempts to manipulate the data
 structure.  Therefore, as long as the physical updates are protected
 from other transactions, the nested top action can always be rolled
 back.}
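
The interaction between physical UNDO, logical UNDO, and the record that ends a nested top action can be illustrated with a small sketch. This is a toy model under stated assumptions, not \yad's log format: the page is a Python list, a physical UNDO entry is a before-image of that list, and a single hypothetical "nta_end" record plays both roles described above, carrying the logical UNDO (remove the key) and telling abort to skip the physical entries of the completed action.

```python
from typing import Any, List, Tuple

LogEntry = Tuple[str, str, Any]  # (kind, transaction id, payload)


def insert(log: List[LogEntry], txn: str, page: List[str], key: str,
           complete: bool = True) -> None:
    """Insert key as a nested top action (assumed to run under a mutex)."""
    begin = len(log)
    # Physical UNDO entry: a before-image of the page.
    log.append(("phys", txn, list(page)))
    page.append(key)
    if complete:
        # End of the nested top action: record where it began and the
        # logical UNDO, which for insert(key) is remove(key).
        log.append(("nta_end", txn, (begin, key)))


def abort(log: List[LogEntry], txn: str, page: List[str]) -> None:
    """Roll back txn by scanning the log backwards."""
    i = len(log) - 1
    while i >= 0:
        kind, owner, payload = log[i]
        if owner != txn:
            i -= 1
            continue
        if kind == "nta_end":
            begin, key = payload
            # Completed action: apply the logical UNDO, then skip the
            # physical UNDO entries that the action wrote.
            if key in page:
                page.remove(key)
            i = begin - 1
            continue
        if kind == "phys":
            # Incomplete action: restore the before-image.
            page[:] = payload
        i -= 1
```

For example, if transaction A inserts "x", transaction B then inserts "y", and A aborts, the logical UNDO removes only "x" and leaves B's "y" in place. Restoring A's before-image instead would have discarded B's update. If an insert is interrupted before its "nta_end" record is written, abort falls back to the before-image.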
 This leads to a mechanical approach that converts non-reentrant
 operations that do not support concurrent transactions into reentrant,
@ -650,12 +672,12 @@ concurrent operations:
  to use finer-grained latches in a \yad operation, but it is rarely necessary.
 \item Define a {\em logical} UNDO for each operation (rather than just
  using a set of page-level UNDO's).  For example, this is easy for a
-  hashtable: the UNDO for {\em insert} is {\em remove}.  \diff{This logical
+  hashtable: the UNDO for {\em insert} is {\em remove}.  This logical
  undo function should arrange to acquire the mutex when invoked by
-  abort or recovery.}
+  abort or recovery.
 \item Add a ``begin nested
  top action'' right after the mutex acquisition, and an ``end
-  nested top action'' right before the mutex is released.  \diff{\yad provides a default nested top action implementation as an extension.}
+  nested top action'' right before the mutex is released.  \yad provides operations to implement nested top actions.
 \end{enumerate}
 If the transaction that encloses a nested top action aborts, the
 technique.  As far as we know, it is used by all database systems that
 update data in place.  Unfortunately, this makes it difficult to map
 large objects onto pages, as the LSN's break up the object.  It
 is tempting to store the LSN's elsewhere, but then they would not be
-written atomically with their page, which defeats their purpose.~\eab{Fit in RVM?}
+written atomically with their page, which defeats their purpose.
 This section explains how we can avoid storing LSN's on pages in \yad
-without giving up durable transactional updates.  In the process, we
+without giving up durable transactional updates.  The techniques here
 are similar to those used by RVM~\cite{lrvm}, a system that supports
 transactional updates to virtual memory.  However, \yad generalizes
 the concept, allowing it to co-exist with traditional pages and fully
 support concurrent transactions.
 In the process of removing LSN's from pages, we
 are able to relax the atomicity assumptions that we make regarding
 writes to disk.  These relaxed assumptions allow recovery to repair
 torn pages without performing media recovery, and allow arbitrary
@ -884,11 +912,7 @@ use of per-page LSN's assume that each page is written to disk
 atomically even though that is generally not the case.  Such schemes
 deal with this problem by using page formats that allow partially
 written pages to be detected.  Media recovery allows them to recover
-these pages.  \rcs{This would be a good place to explain exactly how media recovery works.  Old text: Like ARIES, \yad can recover lost pages in the page
+these pages.  
 file by reinitializing the page to zero, and playing back the entire
 log.  In practice, a system administrator would periodically back up
 the page file, thus enabling log truncation and shortening recovery
 time.}
 The Redo phase of the LSN-free recovery algorithm actually creates a
 torn page each time it applies an old log entry to a new page.
@ -963,10 +987,9 @@ bottom-up approach yields unexpected flexibility.}
 \rcs{All the text in this section is orphaned, but should be worked in elsewhere.}
-We call such pages ``LSN-free'' pages.  Although this technique is
+Regarding LSN-free pages:
-novel for databases, it resembles the mechanism used by
+
-RVM~\cite{lrvm}; \yad generalizes the concept and allows it to
+Furthermore, efficient recovery and
 co-exist with traditional pages.  Furthermore, efficient recovery and
 log truncation require only minor modifications to our recovery
 algorithm.  In practice, this is implemented by providing a buffer manager callback
 for LSN-free pages.  The callback computes a