cleanup

2006-04-24 23:22:46 +00:00 · 2006-04-24 23:22:46 +00:00 · 67a0295a6b
commit 67a0295a6b
parent c0d143529c
1 changed file with 29 additions and 33 deletions
--- a/doc/paper3/LLADD.tex
+++ b/doc/paper3/LLADD.tex
@@ -78,10 +78,8 @@ write-ahead-logging algorithms.  Our partial implementation of these
 ideas already provides specialized (and cleaner) semantics to applications.
 We evaluate the performance of a traditional transactional storage
-system based on \yad, and show that it performs comparably to existing
+system based on \yad, and show that it performs favorably relative to existing
-systems.  
+systems.  We present examples that make use of custom access methods, modifed
-We present examples that make use of custom access methods, modifed
 buffer manager semantics, direct log file manipulation, and LSN-free
 pages that facilitate zero-copy optimizations, and discuss the
 composability of these extensions.  Many of these optimizations are
@@ -548,7 +546,7 @@ Thus, the single-page transactions of \yad work as follows.  An {\em
 operation} consists of both a redo and an undo function, both of which
 take one argument. An update is always the redo function applied to
 the page (there is no ``do'' function), and it always ensures that the
-redo log entry (with its LSN and argument) reach the disk before
+redo log entry (with its LSN and argument) reaches the disk before
 commit.  Similarly, an undo log entry, with its LSN and argument,
 always reaches the disk before a page is stolen.  ARIES works
 essentially the same way, but hard-codes recommended page 
@@ -607,8 +605,6 @@ assigned a new LSN so the page LSN will be different.  Also, each undo
 is also written to the log.
 }
 \eab{describe recovery?}
 This section very briefly described how a simplified
 write-ahead-logging algorithm might work, and glossed over many
 details.  Like ARIES, \yad actually implements recovery in three
@@ -707,7 +703,7 @@ each data structure until the end of the transaction.  Releasing the
 lock after the modification, but before the end of the transaction,
 increases concurrency.  However, it means that follow-on transactions that use
 that data may need to abort if a current transaction aborts ({\em
-cascading aborts}.  These issues are studied in great detail in terms of optimistic concurrency control~\cite{optimisticConcurrencyControl, optimisticConcurrenctPerformance}.
+cascading aborts}).  These issues are studied in great detail in terms of optimistic concurrency control~\cite{optimisticConcurrencyControl, optimisticConcurrenctPerformance}.
 Unfortunately, the long locks held by total isolation cause bottlenecks when applied to key
 data structures.
@@ -736,10 +732,11 @@ implements nested top actions.  The extension may be used as follows:
  nested top action'' right before the mutex is released.
 \end{enumerate}
-If the transaction that encloses the operation aborts, the logical
+\noindent If the transaction thata encloses the operation aborts, the logical
 undo will {\em compensate} for its effects, leaving the structural
-changes intact.  Note that this recipe does not ensure transactional
+changes intact. 
-consistency and is largely orthogonol to the use of a lock manager.
+% Note that this recipe does not ensure iso transactional
+%consistency and is largely orthogonol to the use of a lock manager.
 We have found that it is easy to protect operations that make
 structural changes to data structures with this recipe.
@@ -769,7 +766,7 @@ We say that such operations perform ``blind writes.''
 If all
 operations that modify a page have this property, then we can remove
 the LSN field, and have recovery conservatively assume that it is
-dealing with a version of the page that is at least as old on the one
+dealing with a version of the page that is at least as old as the one
 on disk.  
 \eat{
@@ -964,7 +961,7 @@ multiple machines and file systems.
 \begin{figure}
 \includegraphics[%
   width=1\columnwidth]{figs/structure.pdf}
-\caption{\sf\label{fig:structure} The portions of \yad that new operations directly interact with.}
+\caption{\sf\label{fig:structure} The portions of \yad that interact with new operations directly.}
 \end{figure}
 \yad allows application developers to easily add new operations to the
 system.  Many of the customizations described below can be implemented
@@ -981,7 +978,7 @@ a new set of log interfaces is to decide upon an interface that these log
 interfaces will export to callers outside of \yad.  
 The externally visible interface is implemented by wrapper functions
-and read only access methods.  The wrapper function modifies the state
+and read-only access methods.  The wrapper function modifies the state
 of the page file by packaging the information that will be needed for
 undo and redo into a data format of its choosing.  This data structure
 is passed into Tupdate().  Tupdate() copies the data to the log, and
@@ -998,13 +995,12 @@ implementation must obey a few more invariants:
 \begin{itemize}
 \item Pages should only be updated inside REDO and UNDO functions.
-\item Page updates atomically update page LSN's by pinning the page.
+\item Page updates atomically update the page's LSN by pinning the page.
 \item If the data seen by a wrapper function must match data seen
  during REDO, then the wrapper should use a latch to protect against
  concurrent attempts to update the sensitive data (and against
  concurrent attempts to allocate log entries that update the data).
-\item Nested top actions (and logical undo), or ``big locks'' (which
+\item Nested top actions (and logical undo), or ``big locks'' (total isolation but lower concurrency) should be used to implement multi-page updates. (Section~\ref{sec:nta})
-reduce concurrency) should be used to implement multi-page updates. (Section~\ref{sec:nta})
 \end{itemize}
 \subsection{Linear hash table}
@@ -1049,23 +1045,23 @@ The hand-tuned hashtable also uses a linear hash
 function.  However, it is monolithic and uses carefully ordered writes to
 reduce runtime overheads such as log bandwidth.  Berkeley DB's
 hashtable is a popular, commonly deployed implementation, and serves
-as a baseline for our experiements.
+as a baseline for our experiments.
 Both of our hashtables outperform Berkeley DB on a workload that
-bulk loads the tables by repeatedly inserting (key, value) pairs.
+bulk loads the tables by repeatedly inserting (key, value) pairs,
-We do not claim that our partial implementation of \yad
+although we do not wish to imply this is always the case.
-generally outperforms, or is a robust alternative
+%We do not claim that our partial implementation of \yad
-to Berkeley DB.  Instead, this test shows that \yad is comparable to
+%generally outperforms, or is a robust alternative
-existing systems, and that its modular design does not introduce gross
+%to Berkeley DB.  Instead, this test shows that \yad is comparable to
-inefficiencies at runtime.
+%existing systems, and that its modular design does not introduce gross
-
+%inefficiencies at runtime.
 The comparison between the \yad  implementations is more
 enlightening.  The performance of the simple hash table shows that
 straightfoward datastructure implementations composed from
 simpler structures can perform as well as the implementations included 
 in existing monolithic systems.  The hand-tuned
 implementation shows that \yad allows application developers to
-optimize the primitives they build their applications upon.  
+optimize key primitives.
 % I cut this because berkeley db supports custom data structures....
@@ -1130,8 +1126,8 @@ modules that implement persistant storage, and includes plugins
 for Berkeley DB and MySQL.  
 This section will describe how the \yad
-\oasys plugin reduces the runtime serialization/deserialization cpu
+\oasys plugin reduces the runtime serialization/deserialization CPU
-overhead of write intensive workloads, while using half as much system
+overhead of write-intensive workloads, while using half as much system
 memory as the other two systems.
 We present three variants of the \yad plugin here.  The first treats \yad like
@@ -1149,7 +1145,7 @@ CPU utilization, and it also allows us to drastically decrease the
 size of the page file.  In turn this allows us to increase the size of
 the application's cache of live objects.
-We implemented the \yad buffer pool optimization by adding two new
+We implemented the \yad buffer-pool optimization by adding two new
 operations, update(), which only updates the log, and flush(), which
 updates the page file.  
@@ -1194,7 +1190,7 @@ manager optimizations.  However, it only writes the changed portions of
 objects to the log.  Because of \yad's support for custom log entry
 formats, this optimization is straightforward.
-In addition to the buffer pool optimizations, \yad provides several 
+In addition to the buffer-pool optimizations, \yad provides several 
 options to handle UNDO records in the context
 of object serialization. The first is to use a single transaction for
 each object modification, avoiding the cost of generating or logging
@@ -1309,7 +1305,7 @@ cluster hash table, we have not yet implemented networking primitives for logica
 Therefore, we implemented a single node log reordering scheme that increases request locality
 during the traversal of a random graph.  The graph traversal system
 takes a sequence of (read) requests, and partitions them using some
-function.  It then proceses each partition in isolation from the
+function.  It then processes each partition in isolation from the
 others.  We considered two partitioning functions.  The first divides the page file
 into equally sized contiguous regions, which increases locality.  The second takes the hash
 of the page's offset in the file, which enables load balancing.
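The two partitioning functions described in this hunk can be illustrated with a short C sketch. It is not taken from the \yad sources: the names partition_by_range and partition_by_hash, the use of page numbers as offsets, and the choice of mixing function (the 64-bit finalizer from MurmurHash3) are all assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; these names are not part of \yad. */

/* Range partitioning: divide a page file of page_count pages into
 * partition_count contiguous regions of (nearly) equal size.  Pages that
 * are close together in the file land in the same partition, which
 * favors locality. */
static size_t partition_by_range(uint64_t page, uint64_t page_count,
                                 size_t partition_count) {
    assert(partition_count > 0 && page_count > 0 && page < page_count);
    /* Ceiling division, so the last region absorbs any remainder. */
    uint64_t pages_per_partition =
        (page_count + partition_count - 1) / partition_count;
    return (size_t)(page / pages_per_partition);
}

/* Hash partitioning: scatter pages across partitions by hashing the
 * page's offset, which favors load balancing over locality.  The mixing
 * steps are the 64-bit finalizer from MurmurHash3 (an assumed choice). */
static size_t partition_by_hash(uint64_t page, size_t partition_count) {
    assert(partition_count > 0);
    uint64_t h = page;
    h ^= h >> 33;
    h *= 0xff51afd7ed558ccdULL;
    h ^= h >> 33;
    h *= 0xc4ceb9fe1a85ec53ULL;
    h ^= h >> 33;
    return (size_t)(h % partition_count);
}
```

With 100 pages and 4 partitions, partition_by_range maps pages 0 through 24 to partition 0 and pages 75 through 99 to partition 3, while partition_by_hash spreads neighboring pages across partitions.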
@@ -1337,9 +1333,9 @@ one edge from each node has good locality while the others generally
 have poor locality.
 The second experiment explicitly measures the effect of graph locality
-on our optimization. (Figure~\ref{fig:hotGraph}) It extends the idea
+on our optimization (Figure~\ref{fig:hotGraph}). It extends the idea
 of a hot set to graph generation.  Each node has a distinct hot set
-which includes the 10\% of the nodes that are closest to it in ring
+that includes the 10\% of the nodes that are closest to it in ring
 order.  The remaining nodes are in the cold set.  We use random edges
 instead of ring edges for this test.  This does not ensure graph
 connectivity, but we used the same random seeds for the two systems.