Eric Brewer 2006-08-15 01:00:55 +00:00
parent 8bf2cb65ef
commit 9e4cb7d7c4


@ -141,7 +141,7 @@ management~\cite{perl}, with mixed success~\cite{excel}.
Our hypothesis is that 1) each of these areas has a distinct top-down
conceptual model (which may not map well to the relational model); and
2) there exists a bottom-up layered framework that can better support all of these
models and others.

Just within databases, relational, object-oriented, XML, and streaming
@ -311,7 +311,7 @@ all of these systems. We look at these in more detail in
Section~\ref{related=work}.
In some sense, our hypothesis is trivially true in that there exists a
bottom-up framework called the ``operating system'' that can implement
all of the models. A famous database paper argues that it does so
poorly (Stonebraker 1980~\cite{Stonebraker80}). Our task is really to
simplify the implementation of transactional systems through more
@ -328,7 +328,7 @@ databases~\cite{libtp}. At its core, it provides the physical database model
%most relational database systems~\cite{libtp}.
In particular,
it provides fully transactional (ACID) operations over B-Trees,
hash tables, and other access methods. It provides flags that
let its users tweak various aspects of the performance of these
primitives, and selectively disable the features it provides.
@ -437,7 +437,7 @@ it into the operation implementation.
In this portion of the discussion, operations are limited
to a single page, and provide an undo function. Operations that
affect multiple pages or do not provide inverses will be
discussed later.

Operations are limited to a single page because their results must be
@ -452,8 +452,8 @@ pages and failed sectors, this does not
require any sort of logging, but is quite inefficient in practice, as
it forces the disk to perform a potentially random write each time the
page file is updated. The rest of this section describes how recovery
can be extended, first to support multiple operations per
transaction efficiently, and then to allow more than one transaction
to modify the same data before committing.

\subsubsection{\yads Recovery Algorithm}
@ -461,12 +461,11 @@ same data before committing.
Recovery relies upon the fact that each log entry is assigned a {\em
Log Sequence Number (LSN)}. The LSN is monotonically increasing and
unique. The LSN of the log entry that was most recently applied to
each page is stored with the page, which allows recovery to replay
log entries selectively. This only works if log entries change exactly one
page and if they are applied to the page atomically.
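The selective-replay rule can be sketched as follows; this is a minimal illustration with invented names (\yads actual interfaces are in C and are not shown in this paper), not the real implementation:

```python
# Minimal sketch of LSN-based selective replay; names are hypothetical.

class Page:
    def __init__(self):
        self.lsn = 0          # LSN of the last entry applied to this page
        self.data = {}        # record id -> value

def redo_entry(page, entry):
    """Apply a log entry only if the page has not seen it yet.

    entry is (lsn, record_id, new_value); per the text, each entry
    changes exactly one page and is applied atomically.
    """
    lsn, rid, value = entry
    if lsn > page.lsn:        # page predates this entry: replay it
        page.data[rid] = value
        page.lsn = lsn        # the page now reflects this entry
    # else: the on-disk page already contains this update; skip it

def redo(pages, log):
    # Replaying the whole log is a single forward pass.
    for lsn, page_id, rid, value in log:
        redo_entry(pages[page_id], (lsn, rid, value))
```

Because entries at or below the page LSN are skipped, the same log can be replayed any number of times, which is what makes recovery restartable after a crash during recovery.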
Recovery occurs in three phases: Analysis, Redo, and Undo.
``Analysis'' is beyond the scope of this paper, but essentially
determines the commit/abort status of every transaction. ``Redo''
plays the log forward in time, applying any updates that did not make
it to disk before the system crashed. ``Undo'' runs the log backwards
in time, only applying portions that correspond to aborted
transactions. This
@ -475,7 +474,7 @@ the distinction between physical and logical undo.
A summary of the stages of recovery and the invariants
they establish is presented in Figure~\ref{fig:conventional-recovery}.

Redo is the only phase that makes use of LSNs stored on pages.
It simply compares the page LSN to the LSN of each log entry. If the
log entry's LSN is higher than the page LSN, then the log entry is
applied. Otherwise, the log entry is skipped. Redo does not write
@ -556,12 +555,11 @@ increases concurrency. However, it means that follow-on transactions that use
that data may need to abort if a current transaction aborts ({\em
cascading aborts}). %Related issues are studied in great detail in terms of optimistic concurrency control~\cite{optimisticConcurrencyControl, optimisticConcurrencyPerformance}.
Unfortunately, the long locks held by total isolation cause
bottlenecks when applied to key data structures. Nested top actions
are essentially mini-transactions that can commit even if their
containing transaction aborts; thus follow-on transactions can use the
data structure without fear of cascading aborts.
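A rough sketch of this behavior (hypothetical, heavily simplified; not \yads API) models a nested top action as a structural update that is never physically rolled back, paired with a logged logical undo that compensates if the enclosing transaction aborts:

```python
# Hypothetical sketch of a nested top action (NTA) protecting a
# hash-table insert. All names are invented for illustration.

class HashTable:
    def __init__(self):
        self.data = {}

class Txn:
    def __init__(self):
        self.logical_undos = []   # compensations to run on abort

def insert(txn, table, key, value):
    # --- begin nested top action: the structural update below commits
    # immediately and will not be physically rolled back, even if the
    # containing txn later aborts ---
    table.data[key] = value       # may split buckets, touch many pages...
    # --- end nested top action ---
    # Log a *logical* undo: the inverse operation, not a page image.
    txn.logical_undos.append(lambda: table.data.pop(key, None))

def abort(txn):
    # Abort runs logical undos in reverse order. The table's internal
    # structure stays consistent throughout, so other transactions can
    # use it without risking cascading aborts.
    for undo in reversed(txn.logical_undos):
        undo()
```

In this sketch a second transaction can insert into the same table after the first, and the first transaction's abort removes only its own key.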
The key idea is to distinguish between the {\em logical operations} of a
data structure, such as inserting a key, and the {\em physical operations}
@ -593,7 +591,7 @@ concurrent operations:
to use finer-grained latches in a \yad operation, but it is rarely necessary.
\item Define a {\em logical} UNDO for each operation (rather than just
using a set of page-level UNDOs). For example, this is easy for a
hash table: the UNDO for {\em insert} is {\em remove}. This logical
undo function should arrange to acquire the mutex when invoked by
abort or recovery.
\item Add a ``begin nested top action'' right after the mutex
@ -626,7 +624,7 @@ not able to safely combine them to create concurrent transactions.
Note that the transactions described above only provide the
``Atomicity'' and ``Durability'' properties of ACID.\endnote{The ``A'' in ACID really means atomic persistence
of data, rather than atomic in-memory updates, as the term is normally
used in systems work~\cite{GR97};
the latter is covered by ``C'' and
``I''.} ``Isolation'' is
typically provided by locking, which is a higher-level but
@ -679,22 +677,22 @@ We make no assumptions regarding lock managers being used by higher-level code i
\section{LSN-free pages}
\label{sec:lsn-free}
The recovery algorithm described above uses LSNs to determine the
version number of each page during recovery. This is a common
technique; as far as we know, it is used by all database systems that
update data in place. Unfortunately, this makes it difficult to map
large objects onto pages, as the LSNs break up the object. It
is tempting to store the LSNs elsewhere, but then they would not be
written atomically with their page, which defeats their purpose.
This section explains how we can avoid storing LSNs on pages in \yad
without giving up durable transactional updates. The techniques here
are similar to those used by RVM~\cite{lrvm}, a system that supports
transactional updates to virtual memory. However, \yad generalizes
the concept, allowing it to co-exist with traditional pages and fully
support concurrent transactions.
In the process of removing LSNs from pages, we
are able to relax the atomicity assumptions that we make regarding
writes to disk. These relaxed assumptions allow recovery to repair
torn pages without performing media recovery, and allow arbitrary
@ -707,7 +705,7 @@ protocol for atomically and durably applying updates to the page file.
This will require the addition of a new page type (\yad currently has
3 such types, not including a few minor variants). The new page type
will need to communicate with the logger and recovery modules in order
to estimate page LSNs, which will need to make use of callbacks in
those modules. Of course, upon providing support for LSN-free pages,
we will want to add operations to \yad that make use of them. We plan
to eventually support the coexistence of LSN-free pages, traditional
@ -715,7 +713,7 @@ pages, and similar third-party modules within the same page file, log,
transactions, and even logical operations.

\subsection{Blind writes}
Recall that LSNs were introduced to prevent recovery from applying
updates more than once, and to prevent recovery from applying old
updates to newer versions of pages. This was necessary because some
operations that manipulate pages are not idempotent, or simply make
@ -769,14 +767,14 @@ practical problem.
The rest of this section describes how concurrent, LSN-free pages
allow standard file system and database optimizations to be easily
combined, and shows that the removal of LSNs from pages actually
simplifies some aspects of recovery.
\subsection{Zero-copy I/O}

We originally developed LSN-free pages as an efficient method for
transactionally storing and updating large (multi-page) objects. If a
large object is stored in pages that contain LSNs, then in order to
read that large object the system must read each page individually,
and then use the CPU to perform a byte-by-byte copy of the portions of
the page that contain object data into a second buffer.
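A small sketch shows why embedded LSNs force this reassembly. The page and LSN sizes below are assumptions for illustration, not \yads actual layout: the point is only that an LSN header at the start of every page breaks the object into per-page fragments, each of which must be copied.

```python
# Hypothetical page layout: each 4096-byte page begins with an 8-byte
# LSN, so a large object's bytes are split into per-page fragments.
PAGE_SIZE = 4096
LSN_SIZE = 8
PAYLOAD = PAGE_SIZE - LSN_SIZE

def write_object(obj: bytes) -> list[bytes]:
    """Split an object across LSN-bearing pages."""
    pages = []
    for off in range(0, len(obj), PAYLOAD):
        lsn_header = len(pages).to_bytes(LSN_SIZE, 'big')  # stand-in LSN
        pages.append(lsn_header + obj[off:off + PAYLOAD])
    return pages

def read_object(pages: list[bytes]) -> bytes:
    # With per-page LSNs, the object cannot be handed to the application
    # as one contiguous buffer: the CPU must copy each payload fragment
    # past its LSN header into a second buffer.
    return b''.join(p[LSN_SIZE:] for p in pages)
```

With LSN-free pages the header disappears, so the same object occupies a contiguous byte range on disk that can be transferred without the per-page copy.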
@ -819,14 +817,14 @@ objects~\cite{esm}.
Our LSN-free pages are somewhat similar to the recovery scheme used by
RVM, recoverable virtual memory. \rcs{, and camelot, argus(?)} That system used purely physical
logging and LSN-free pages so that it could use mmap() to map portions
of the page file into application memory~\cite{lrvm}. However, without
support for logical log entries and nested top actions, it would be
difficult to implement a concurrent, durable data structure using RVM.
In contrast, LSN-free pages allow for logical undo, and therefore for
the use of nested top actions and concurrent transactions.
We plan to add RVM-style transactional memory to \yad in a way that is
compatible with fully concurrent collections such as hash tables and
tree structures. Of course, since \yad will support coexistence of
conventional and LSN-free pages, applications would be free to use the
@ -835,7 +833,7 @@ conventional and LSN-free pages, applications would be free to use the
\subsection{Page-independent transactions}
\label{sec:torn-page}

\rcs{I don't like this section heading...} Recovery schemes that make
use of per-page LSNs assume that each page is written to disk
atomically even though that is generally not the case. Such schemes
deal with this problem by using page formats that allow partially
written pages to be detected. Media recovery allows them to recover
@ -944,7 +942,7 @@ around typical problems with existing transactional storage systems.
system. Many of the customizations described below can be implemented
using custom log operations. In this section, we describe how to implement an
``ARIES-style'' concurrent, steal/no-force operation using
\diff{physical redo, logical undo} and per-page LSNs.
Such operations are typical of high-performance commercial database
engines.
@ -973,7 +971,7 @@ with. UNDO works analogously, but is invoked when an operation must
be undone (usually due to an aborted transaction, or during recovery).

This pattern applies in many cases. In
order to implement a ``typical'' operation, the operation's
implementation must obey a few more invariants:
\begin{itemize}
@ -983,22 +981,27 @@ implementation must obey a few more invariants:
during REDO, then the wrapper should use a latch to protect against
concurrent attempts to update the sensitive data (and against
concurrent attempts to allocate log entries that update the data).
\item Nested top actions (and logical undo) or ``big locks'' (total isolation but lower concurrency) should be used to manage concurrency (Section~\ref{sec:nta}).
\end{itemize}
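These invariants can be combined into a rough sketch of the wrapper/REDO/UNDO pattern (invented names; \yads real operations are C functions and differ in detail): the wrapper allocates a log entry under a latch and applies the update by calling its own REDO function, so forward execution and recovery share one code path, while abort invokes a logical UNDO built from information saved in the entry.

```python
import threading

# Hypothetical sketch of an ARIES-style operation wrapper.

class Page:
    def __init__(self):
        self.lsn = 0
        self.records = {}

class Engine:
    def __init__(self):
        self.lsn = 0
        self.log = []                    # entries: (lsn, rid, new, old)
        self.latch = threading.Lock()    # orders log allocation + update

    def redo(self, page, entry):
        lsn, rid, new, _old = entry
        if lsn > page.lsn:               # idempotent: safe to replay
            page.records[rid] = new
            page.lsn = lsn

    def undo(self, page, entry):
        _lsn, rid, _new, old = entry     # logical undo: restore old value
        with self.latch:
            self.lsn += 1
            if old is None:
                page.records.pop(rid, None)  # record did not exist before
            else:
                page.records[rid] = old
            page.lsn = self.lsn          # undo is applied like an update

    def set(self, page, rid, new):       # the wrapper exposed to callers
        with self.latch:
            self.lsn += 1
            entry = (self.lsn, rid, new, page.records.get(rid))
            self.log.append(entry)       # log before applying (WAL)
            self.redo(page, entry)       # forward execution *is* REDO
            return entry
```

Because the wrapper applies its update through `redo`, recovery cannot diverge from what forward execution did, which is the point of the invariants above.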
\section{Experiments}
\label{experiments}

\eab{add transition that explains where we are going}

\subsection{Experimental setup}
\label{sec:experimental_setup}
We chose Berkeley DB in the following experiments because, among
commonly used systems, it provides transactional storage primitives
that are most similar to \yad. Also, Berkeley DB is
supported commercially and is designed to provide high performance and high
concurrency. For all tests, the two libraries provide the same
transactional semantics unless explicitly noted.

All benchmarks were run on an Intel Xeon 2.8 GHz with 1GB of RAM and a
10K RPM SCSI drive formatted with ReiserFS~\cite{reiserfs}.\endnote{We found that the
@ -1039,15 +1042,17 @@ multiple machines and file systems.
\subsection{Linear hash table}
\label{sec:lht}

\begin{figure}[t]
\includegraphics[%
width=1\columnwidth]{figs/bulk-load.pdf}
%\includegraphics[%
% width=1\columnwidth]{bulk-load-raw.pdf}
%\vspace{-30pt}
\caption{\sf\label{fig:BULK_LOAD} Performance of \yad and Berkeley DB hash table implementations. The
test is run as a single transaction, minimizing overheads due to synchronous log writes.}
\end{figure}
\begin{figure}[t]
%\hspace*{18pt}
%\includegraphics[%
@ -1055,35 +1060,37 @@ test is run as a single transaction, minimizing overheads due to synchronous log
\includegraphics[%
width=1\columnwidth]{figs/tps-extended.pdf}
%\vspace{-36pt}
\caption{\sf\label{fig:TPS} High concurrency hash table performance of Berkeley DB and \yad. We were unable to get Berkeley DB to work correctly with more than 50 threads (see text).
}
\end{figure}
Although the beginning of this paper describes the limitations of
physical database models and relational storage systems in great
detail, these systems are the basis of most common transactional
storage routines. Therefore, we implement a key-based access method
in this section. We argue that obtaining reasonable performance in
such a system under \yad is straightforward. We then compare our
simple implementation to our hand-tuned version and
Berkeley DB's implementation.
The simple hash table uses nested top actions to update its internal
structure atomically. It uses a {\em linear} hash
function~\cite{lht}, allowing it to increase capacity
incrementally. It is based on a number of modular subcomponents.
Notably, its ``table'' is a growable array of fixed-length entries (a
linkset, in the terms of the physical database model) and the user's
choice of two different linked-list implementations. \eab{still
unclear}
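The {\em linear} hash function mentioned above can be sketched with the standard textbook formulation of linear hashing (a sketch under that assumption, not \yads code): a key hashing to $h$ is addressed with $h \bmod 2^i$, unless that bucket has already been split this round, in which case $h \bmod 2^{i+1}$ is used, and each insert splits exactly one bucket.

```python
# Textbook linear-hashing address computation; not \yads implementation.
# `split` marks the next bucket to split and `i` is the current round,
# so capacity grows one bucket at a time instead of doubling at once.

class LinearHash:
    def __init__(self):
        self.i = 0            # round number: 2**i buckets at round start
        self.split = 0        # next bucket to be split this round
        self.buckets = [[]]

    def _bucket(self, key):
        h = hash(key)
        b = h % (2 ** self.i)
        if b < self.split:    # this bucket was already split this round
            b = h % (2 ** (self.i + 1))
        return b

    def insert(self, key, value):
        self.buckets[self._bucket(key)].append((key, value))
        self._grow()          # split exactly one bucket per insert

    def _grow(self):
        b = self.split
        self.buckets.append([])          # image bucket at index 2**i + b
        self.split += 1
        if self.split == 2 ** self.i:    # round complete: table doubled
            self.i += 1
            self.split = 0
        moved, kept = [], []
        for k, v in self.buckets[b]:     # rehash only the split bucket
            (kept if self._bucket(k) == b else moved).append((k, v))
        self.buckets[b] = kept
        self.buckets[-1].extend(moved)
```

Only the bucket being split is rehashed, which is what makes growth incremental; a real implementation would split on load factor rather than on every insert.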
The hand-tuned hash table is also built on \yad and uses a linear hash
function. However, it is monolithic and uses carefully ordered writes to
reduce runtime overheads such as log bandwidth. Berkeley DB's
hash table is a popular, commonly deployed implementation, and serves
as a baseline for our experiments.
Both of our hash tables outperform Berkeley DB on a workload that bulk
loads the tables by repeatedly inserting (key, value) pairs
(Figure~\ref{fig:BULK_LOAD}).
%although we do not wish to imply this is always the case.
%We do not claim that our partial implementation of \yad
%generally outperforms, or is a robust alternative
@ -1122,13 +1129,12 @@ a single synchronous I/O.\endnote{The multi-threaded benchmarks
\yad scaled quite well, delivering over 6000 transactions per
second,\endnote{The concurrency test was run without lock managers, and the
transactions obeyed the A, C, and D properties. Since each
transaction performed exactly one hash table write and no reads, they also
obeyed I (isolation) in a trivial sense.} and provided roughly
double Berkeley DB's throughput (up to 50 threads). Although not
shown here, we found that the latencies of Berkeley DB and \yad were
similar, which confirms that \yad is not simply trading latency for
throughput during the concurrency benchmark.
\begin{figure*}
@ -1140,10 +1146,12 @@ not simply trading latency for throughput during the concurrency benchmark.
The effect of \yad object serialization optimizations under low and high memory pressure.}
\end{figure*}

\subsection{Object persistence}
\label{sec:oasys}
Numerous schemes are used for object serialization. Support for two
different styles of object serialization has been implemented in
\yad. We could have just as easily implemented a persistence
mechanism for a statically typed functional programming language, a
dynamically typed scripting language, or a particular application,
@ -1160,17 +1168,21 @@ serialization library, \oasys. \oasys makes use of pluggable storage
modules that implement persistent storage, and includes plugins
for Berkeley DB and MySQL.
This section describes how the \yad \oasys plugin reduces the
amount of data written to the log, while using half as much system memory
as the other two systems.
We present three variants of the \yad plugin here. The first treats
\yad like Berkeley DB. The second, the ``update/flush'' variant,
customizes the behavior of the buffer manager, and the third,
``delta'', extends the second with support for logging only the deltas
between versions.

The update/flush variant avoids maintaining an up-to-date
version of each object in the buffer manager or page file: it allows
the buffer manager's view of live application objects to become stale.
This is safe since the system is always able to reconstruct the
appropriate page entry from the live copy of the object.
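The update/flush idea can be sketched roughly as follows (invented names and a placeholder serializer; the real plugin is C code inside \oasys): an update writes redo information to the log and marks the live object dirty, but the serialized copy in the buffer manager is refreshed only when the object is flushed or evicted.

```python
# Rough sketch of the update/flush variant with hypothetical names.
# Repeated updates to one object cost one eventual serialization,
# not one per update, because the page copy is left stale on purpose.

class UpdateFlushCache:
    def __init__(self):
        self.live = {}        # oid -> live application object
        self.page = {}        # oid -> serialized bytes (may be stale!)
        self.log = []         # redo information, written eagerly
        self.dirty = set()

    def update(self, oid, obj):
        self.log.append((oid, obj))   # log entry generated immediately
        self.live[oid] = obj          # live copy is authoritative
        self.dirty.add(oid)           # buffer-manager copy not touched

    def flush(self, oid):
        # Reconstruct the page entry from the live copy; this is always
        # possible, which is what makes the staleness safe. repr() is a
        # stand-in for a real serializer.
        self.page[oid] = repr(self.live[oid]).encode()
        self.dirty.discard(oid)
```

The log still grows with every update, so durability is preserved; only the redundant serializations into the buffer manager are elided.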
By allowing the buffer manager to contain stale data, we reduce the
number of times the \yad \oasys plugin must update serialized objects in the buffer manager.
@ -1186,41 +1198,45 @@ updates the page file.
The reason it would be difficult to do this with Berkeley DB is that
we still need to generate log entries as the object is being updated.
This would cause Berkeley DB to write data back to the page file,
increasing the working set of the program, and increasing disk
activity.
Furthermore, objects may be written to disk in an
order that differs from the order in which they were updated,
violating one of the write-ahead logging invariants. One way to
deal with this is to maintain multiple LSNs per page. This means we would need to register a
callback with the recovery routine to process the LSNs (a similar
callback will be needed in Section~\ref{sec:zeroCopy}), and
extend \yads page format to contain per-record LSNs.
Also, we must prevent \yads storage allocation routine from overwriting the per-object
LSNs of deleted objects that may still be addressed during abort or recovery.\eab{tombstones discussion here?}
\eab{we should at least implement this callback if we have not already}
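To make the per-record LSN idea above concrete, here is a minimal sketch in C. All names and the page layout are hypothetical (this is not \yads actual page format or API); it only illustrates the rule that a redo log entry is applied to a record exactly when the entry is newer than that record's own LSN.

```c
#include <stdint.h>
#include <string.h>

typedef uint64_t lsn_t;

enum { RECORDS_PER_PAGE = 4, RECORD_SIZE = 16 };

/* Hypothetical page layout: one LSN per record slot instead of a
   single page-wide LSN. */
typedef struct {
    lsn_t record_lsn[RECORDS_PER_PAGE];
    char  payload[RECORDS_PER_PAGE][RECORD_SIZE];
} per_record_lsn_page;

/* Redo is needed only when the log entry is newer than the LSN of
   the record it targets. */
int needs_redo(const per_record_lsn_page *p, int slot, lsn_t entry_lsn) {
    return entry_lsn > p->record_lsn[slot];
}

/* Apply a redo entry to one record, bumping that record's LSN. */
void apply_redo(per_record_lsn_page *p, int slot, lsn_t entry_lsn,
                const char *data, size_t len) {
    if (!needs_redo(p, slot, entry_lsn))
        return; /* this record already reflects the update */
    memcpy(p->payload[slot], data, len);
    p->record_lsn[slot] = entry_lsn;
}
```

A recovery-time callback of the sort described above would walk the log and invoke something like `apply_redo` per affected record, rather than comparing a single page LSN.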
Alternatively, we could arrange for the object pool to cooperate
further with the buffer pool by atomically updating the buffer
manager's copy of all objects that share a given page, removing the
need for multiple LSNs per page, and simplifying storage allocation.
However, the simplest solution, and the one we take here, is based on
the observation that updates (not allocations or deletions) of
fixed-length objects are blind writes. This allows us to do away with
per-object LSNs entirely. Allocation and deletion can then be
handled as updates to normal LSN-containing pages. At recovery time,
object updates are executed based on the existence of the object on
the page and a conservative estimate of its LSN. (If the page doesn't
contain the object during REDO then it must have been written back to
disk after the object was deleted. Therefore, we do not need to apply
the REDO.) This means that the system can ``forget'' about objects
that were freed by committed transactions, simplifying space reuse
tremendously. (Because LSN-free pages and recovery are not yet
implemented, this benchmark mimics their behavior at runtime, but does
not support recovery.)
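The recovery rule just described can be sketched in a few lines of C. The types and names here are assumptions for illustration only (not \yads implementation): existence of the object, rather than a per-object LSN, decides whether a blind-write REDO is applied.

```c
#include <stdbool.h>

/* Hypothetical slot for a fixed-length object on an LSN-free page. */
typedef struct {
    bool allocated;   /* does the slot currently hold the object? */
    int  value;       /* the object's (fixed-length) payload */
} obj_slot;

/* Blind-write REDO: no per-object LSN is consulted. If the object is
   gone, the page must have been written back after the delete, so the
   update is simply "forgotten". Returns true if the REDO was applied. */
bool redo_blind_update(obj_slot *slot, int new_value) {
    if (!slot->allocated)
        return false;        /* freed object: skip the REDO */
    slot->value = new_value; /* blind write: no read of old value needed */
    return true;
}
```

Because updates never read the old value, replaying them in a different order than they originally occurred is harmless, which is what makes the LSN-free scheme safe for fixed-length objects.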
The third plugin variant, ``delta'', incorporates the update/flush
optimizations, but only writes the changed portions of
objects to the log. Because of \yads support for custom log-entry
formats, this optimization is straightforward.
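As a rough illustration of what a ``delta'' log entry might carry, the sketch below computes the smallest contiguous changed byte range between two serialized object images; only that range and its offset would then be logged. The struct and function names are hypothetical, not \yads log-entry format.

```c
#include <stddef.h>

/* Hypothetical delta descriptor: the byte range that changed. */
typedef struct {
    size_t offset; /* first changed byte */
    size_t len;    /* length of changed range (0 means no change) */
} delta_range;

/* Compare old and new serialized images of an object and return the
   contiguous range that differs. Only this range need be logged. */
delta_range find_delta(const unsigned char *old_img,
                       const unsigned char *new_img, size_t n) {
    size_t first = 0, last = n;
    while (first < n && old_img[first] == new_img[first])
        first++;                       /* skip matching prefix */
    if (first == n)
        return (delta_range){ 0, 0 };  /* images identical */
    while (last > first && old_img[last - 1] == new_img[last - 1])
        last--;                        /* skip matching suffix */
    return (delta_range){ first, last - first };
}
```

For objects where updates touch a few fields of a large serialized form, logging only `delta_range.len` bytes instead of the whole object is where the log-bandwidth savings come from.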
%In addition to the buffer-pool optimizations, \yad provides several
@ -1264,8 +1280,8 @@ close, but does not quite provide the correct durability semantics.)
Implementing these two optimizations required
150 lines of C code, including whitespace, comments and boilerplate
function registrations.\endnote{These figures do not include the
simple LSN-free object logic required for recovery, as \yad does not
yet support LSN-free operations.} Although the reasoning required
to ensure the correctness of this code is complex, the simplicity of
the implementation is encouraging.
@ -1289,6 +1305,9 @@ we see that update/flush indeed improves memory utilization.
\subsection{Manipulation of logical log entries}
\eab{this section unclear, including title}
\label{sec:logging}
\begin{figure}
\includegraphics[width=1\columnwidth]{figs/graph-traversal.pdf}
@ -1345,7 +1364,7 @@ is used by RVM's log-merging operations~\cite{lrvm}.
Furthermore, application-specific
procedures that are analogous to standard relational algebra methods
(join, project and select) could be used to efficiently transform the data
while it is still laid out sequentially
in non-transactional memory.
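A select-style pass of the kind mentioned above can be sketched as a simple compaction over log entries laid out contiguously in an ordinary buffer. The record layout and names here are illustrative assumptions, not an actual \yad interface.

```c
#include <stddef.h>

/* Hypothetical fixed-size log entry laid out sequentially in memory. */
typedef struct {
    int key;
    int payload;
} log_entry;

typedef int (*predicate)(const log_entry *);

/* "Select": keep only entries satisfying pred, compacting them to the
   front of the buffer. Runs over plain, non-transactional memory and
   returns the new entry count. */
size_t select_entries(log_entry *entries, size_t n, predicate pred) {
    size_t kept = 0;
    for (size_t i = 0; i < n; i++)
        if (pred(&entries[i]))
            entries[kept++] = entries[i];
    return kept;
}

/* Example predicate: keep entries with positive keys. */
static int key_positive(const log_entry *e) {
    return e->key > 0;
}
```

A project would rewrite each entry in place to a narrower layout, and a join would merge two such sequential runs; all three benefit from the data never leaving contiguous memory.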
%Note that read-only operations do not necessarily generate log
@ -1371,9 +1390,9 @@ position size so that each partition can fit in \yads buffer pool.
We ran two experiments. Both stored a graph of fixed-size objects in
the growable array implementation that is used as our linear
hash table's bucket list.
The first experiment (Figure~\ref{fig:oo7})
is loosely based on the OO7 database benchmark~\cite{oo7}. We
hard-code the out-degree of each node, and use a directed graph. OO7
constructs graphs by first connecting nodes together into a ring.
It then randomly adds edges between the nodes until the desired
@ -1583,7 +1602,7 @@ databases~\cite{libtp}. At its core, it provides the physical database model
%most relational database systems~\cite{libtp}.
In particular,
it provides fully transactional (ACID) operations over B-Trees,
hash tables, and other access methods. It provides flags that
let its users tweak various aspects of the performance of these
primitives, and selectively disable the features it provides.
@ -1642,14 +1661,16 @@ Although most file systems attempt to lay out data in logically sequential
order, write-optimized file systems lay files out in the order they
were written~\cite{lfs}. Schemes to improve locality between small
objects exist as well. Relational databases allow users to specify the order
in which tuples will be laid out, and often leave portions of pages
unallocated to reduce fragmentation as new records are allocated.

\rcs{The new allocator is written + working, so this should be reworded. We have one that is based on hoard; support for other possibilities would be nice.}
Memory allocation routines also address this problem. For example, the Hoard memory
allocator is a highly concurrent version of malloc that
makes use of thread context to allocate memory in a way that favors
cache locality~\cite{hoard}.
%Other work makes use of the caller's stack to infer
%information about memory management.~\cite{xxx} \rcs{Eric, do you have
% a reference for this?}
@ -1664,7 +1685,7 @@ plan to use ideas from LFS~\cite{lfs} and POSTGRES~\cite{postgres}
to implement this.

Starburst~\cite{starburst} provides a flexible approach to index
management and database trigger support, as well as hints for small
object layout.
@ -1673,8 +1694,8 @@ complement to such a system, especially given \yads focus on
intelligence and optimizations within a single node, and Boxwood's
focus on multiple node systems. In particular, it would be
interesting to explore extensions to the Boxwood approach that make
use of \yads customizable semantics (Section~\ref{sec:wal}) and fully logical logging
mechanisms (Section~\ref{sec:logging}).
@ -1706,7 +1727,7 @@ algorithms related to write-ahead logging. For instance,
we suspect that support for appropriate callbacks will
allow us to hard-code a generic recovery algorithm into the
system. Similarly, any code that manages book-keeping information, such as
LSNs, may be general enough to be hard-coded.

Of course, we also plan to provide \yads current functionality, including the algorithms
mentioned above, as modular, well-tested extensions.
@ -1733,13 +1754,15 @@ extended in the future to support a larger range of systems.
\section{Acknowledgements}
Thanks to shepherd Bill Weihl for helping us present these ideas well,
or at least better. The idea behind the \oasys buffer manager
optimization is from Mike Demmer. He and Bowei Du implemented \oasys.
Gilad Arnold and Amir Kamil implemented
pobj. Jim Blomo, Jason Bayer, and Jimmy
Kittiyachavalit worked on an early version of \yad.

Thanks to C. Mohan for pointing out the need for tombstones with
per-object LSNs. Jim Gray provided feedback on an earlier version of
this paper, and suggested we use a resource manager to manage
dependencies within \yads API. Joe Hellerstein and Mike Franklin
provided us with invaluable feedback.