update oasys section including a blurb on the comparison with mysql

Mike Demmer 2005-03-26 01:28:51 +00:00
parent 88c19d1880
commit 939b6a8845


@ -1838,19 +1838,24 @@ to object serialization. First, since \yad supports
custom log entries, it is trivial to have it store deltas to
the log instead of writing the entire object during an update.
Such an optimization would be difficult to achieve with Berkeley DB
since the only diff-based mechanism it supports requires changes to
span contiguous byte ranges, which is not necessarily the case for
arbitrary object updates.
In a database server context, this type of optimization can be
supported if the fields of the object are broken into database table
columns and a SQL update query only modifies a subset of the fields.
However, as we have seen in some preliminary evaluation, the
overheads associated with a SQL-based interface outweigh the
advantages of this optimization.

%% \footnote{It is unclear if
%% this optimization would outweigh the overheads associated with an SQL
%% based interface. Depending on the database server, it may be
%% necessary to issue a SQL update query that only updates a subset of a
%% tuple's fields in order to generate a diff-based log entry. Doing so
%% would preclude the use of prepared statements, or would require a large
%% number of prepared statements to be maintained by the DBMS. We plan to
%% investigate the overheads of SQL in this context in the future.}
% If IPC or
%the network is being used to communicate with the DBMS, then it is very
@ -2009,14 +2014,26 @@ only one integer field from a ~1KB object is modified, the fully
optimized \yad corresponds to a twofold speedup over the unoptimized
\yad.
In all cases, the update rate for mysql\footnote{We ran mysql using
InnoDB for the table engine, as it is the fastest engine that provides
similar durability to \yad. For this test, we also linked directly
with the mysqld daemon library, bypassing the RPC layer. In
experiments that used the RPC layer, test completion times were orders
of magnitude slower.} is slower than Berkeley DB,
which is slower than any of the \yad variants. This performance
difference is in line with those observed in Section
\ref{sub:Linear-Hash-Table}. We also see the increased overhead due to
the SQL processing for the mysql implementation, although we note that
a SQL variant of the diff-based optimization also provides performance
benefits.

In the second graph, we constrained the \yad buffer pool size to be a
small fraction of the size of the object cache, and bypassed the filesystem
buffer cache via the O\_DIRECT option. The goal of this experiment is to
focus on the benefits of the update/flush optimization in a simulated
scenario of memory pressure. From this graph, we see that as the percentage of
requests that are serviced by the cache increases, the
performance of the optimized \yad dramatically increases.
This result supports the hypothesis of the optimization, and
shows that by leveraging the object cache, we can reduce the load on
the page file and therefore the size of the buffer pool.
@ -2027,18 +2044,18 @@ whitespace, comments and boilerplate function registrations. Although
the reasoning required to ensure the correctness of this code is
complex, the simplicity of the implementation is encouraging.
In addition to the hashtable, which is required by \oasys's API, this
section made use of custom log formats and semantics to reduce log
bandwidth and page file usage. Berkeley DB supports a similar
partial-update mechanism, but it only
supports range updates and does not map naturally to \oasys's data
model. In contrast, our \yad extension simply makes upcalls
into the object serialization layer during recovery to ensure that the
compact, object-specific diffs that \oasys produces are correctly
applied. The custom log format, when combined with direct access to
the page file and buffer pool, drastically reduces disk and memory usage
for write-intensive loads. A simple extension to our recovery algorithm makes it
easy to implement other similar optimizations in the future.
%This section uses:
%