update oasys experiment section

Mike Demmer 2005-03-25 20:11:55 +00:00
parent 5ed0f2005c
commit 52ecf13d7a


@@ -1869,7 +1869,7 @@ most difficult to implement in another storage system.
\includegraphics[%
width=1\columnwidth]{mem-pressure.pdf}
\caption{\label{fig:OASYS} \yad optimizations for object
serialization. The first graph shows the effect of the two lladd
serialization. The first graph shows the effect of the two \yad
optimizations as a function of the portion of the object that is being
modified. The second graph focuses on the
benefits of the update/flush optimization in cases of system
@@ -1931,36 +1931,46 @@ forward, but this is not required.
We implemented a \yad plugin for \oasys, a C++ object serialization
library that can use various object serialization backends.
We set up an experiment in which objects are
retrieved from a cache according to a hot-set distribution\footnote{In
an example hot-set distribution, 10\% of the objects (the hot set) are
selected 90\% of the time.} and then have certain fields modified. The
object cache size is set to twice the size of the hot set, and all
experiments were run with identical cache sizings and random seeds for
both Berkeley DB and the various \yad configurations.
We set up an experiment in which objects are randomly
retrieved from the cache according to a hot-set distribution\footnote{In
an example hot-set distribution, 10\% of the objects (the hot set size) are
selected 90\% of the time (the hot set probability).}
and then have certain fields modified and written back
to the data store. For all experiments, the number of objects
is fixed at 5,000, the hot set contains 10\% of the objects, the
object cache is double the size of the hot set, we update 100 objects
per transaction, and all experiments were run with identical random
seeds for all configurations.
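As a rough illustration of this access pattern (not the actual \oasys
test driver; the constants and function name here are hypothetical),
the hot-set sampling amounts to something like:

\begin{verbatim}
#include <stdlib.h>

#define NUM_OBJECTS  5000                 /* total objects            */
#define HOT_SET_SIZE (NUM_OBJECTS / 10)   /* hot set: 10% of objects  */

/* Return the id of the next object to update: 90% of requests fall
   in the hot set, the remaining 10% in the cold set. */
static int next_object_id(void) {
  if (rand() % 100 < 90) {
    return rand() % HOT_SET_SIZE;
  } else {
    return HOT_SET_SIZE + rand() % (NUM_OBJECTS - HOT_SET_SIZE);
  }
}
\end{verbatim}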
The first graph in Figure \ref{fig:OASYS} shows the time to perform
100,000 updates to the object as we vary the fraction of the object
data that is modified in each update. In all
cases, we see that that the savings in log bandwidth and
buffer-pool overhead by generating diffs and having separate
update() and flush() calls outweighs the overhead of the operations.
The first graph in Figure \ref{fig:OASYS} shows the update rate as we
vary the fraction of the object that is modified by each update, for
Berkeley DB, unmodified \yad, \yad with the update/flush optimization,
and \yad with both the update/flush optimization and diff-based log
records.
The graph confirms that the savings in log bandwidth and
buffer pool overhead from the two \yad optimizations
outweigh the overhead of the operations, especially when only a small
fraction of the object is modified.
In the most extreme case, when
only one integer field from an ~1KB object is modified, the fully
optimized \yad shows a \eab{threefold?} speedup over Berkeley DB.
only one integer field of a $\sim$1KB object is modified, the fully
optimized \yad yields a twofold speedup over the unoptimized
\yad.
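The diff-based log records rely on the observation that only a small
byte range of the serialized object changes between updates. A minimal
sketch of that idea (the function below is illustrative only and is
not claimed to match the actual \yad operation):

\begin{verbatim}
#include <stddef.h>

/* Find the smallest byte range [*off, *off + *len) that differs
   between the old and new serializations of an object. */
static void find_dirty_range(const char *old_buf, const char *new_buf,
                             size_t size, size_t *off, size_t *len) {
  size_t start = 0, end = size;
  while (start < size && old_buf[start] == new_buf[start]) start++;
  while (end > start && old_buf[end - 1] == new_buf[end - 1]) end--;
  *off = start;
  *len = end - start;   /* zero if nothing changed */
}
\end{verbatim}

Only this byte range needs to be written to the log, which is why the
savings grow as the modified fraction of the object shrinks.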
In the second graph, we constrained the \yad buffer pool size to be a
fraction of the size of the object cache, and bypass the filesystem
buffer cache via the O\_DIRECT option. This experiment specifically
focuses on the benefits of the update() and flush() optimizations
described above. From this graph, we see that as the percentage of
requests that are serviced by the cache increases, we see that the
performance increases greatly. Furthermore, even when only 10\% of the
requests hit the cache, the optimized update/flush \yad variant
achieves almost equivalent performance to the unoptimized \yad.
buffer cache via the O\_DIRECT option. The goal of this experiment is to
isolate the benefits of the update/flush optimization under
simulated memory pressure. From this graph, we see that as the percentage of
requests that are serviced by the object cache increases, the
performance of the optimized \yad improves substantially.
This result supports the hypothesis behind the optimization, and
shows that by leveraging the object cache, we can reduce the load on
the page file and therefore the size of the buffer pool.
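A sketch of how the object cache might drive the two calls under the
update/flush optimization; the structure and signatures below are
assumptions for illustration, with update() and flush() standing in for
the two operations named above:

\begin{verbatim}
typedef struct {
  int   xid;    /* transaction id              */
  long  rid;    /* record id in the page file  */
  void *obj;    /* live, deserialized object   */
  int   dirty;  /* modified since last flush?  */
} cached_obj;

/* Hypothetical operation wrappers. */
void update(int xid, long rid, void *obj);  /* log-only change       */
void flush(int xid, long rid, void *obj);   /* write back to page    */

/* On modification: record the change in the log, but leave the
   page file copy stale; the buffer pool sees no traffic. */
void obj_cache_modified(cached_obj *o) {
  update(o->xid, o->rid, o->obj);
  o->dirty = 1;
}

/* On eviction from the object cache (or at commit): serialize the
   object and bring the page file copy up to date. */
void obj_cache_evict(cached_obj *o) {
  if (o->dirty) {
    flush(o->xid, o->rid, o->obj);
    o->dirty = 0;
  }
}
\end{verbatim}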
The operations required for these
two optimizations are roughly 150 lines of C code, including
two optimizations consist of a mere 150 lines of C code, including
whitespace, comments and boilerplate function registrations. Although
the reasoning required to ensure the correctness of this code is
complex, the simplicity of the implementation is encouraging.