update oasys section including a blurb on the comparison with mysql

Mike Demmer 2005-03-26 01:28:51 +00:00
parent 88c19d1880
commit 939b6a8845


@ -1838,19 +1838,24 @@ to object serialization. First, since \yad supports
custom log entries, it is trivial to have it store deltas to
the log instead of writing the entire object during an update.
Such an optimization would be difficult to achieve with Berkeley DB
since the only diff-based mechanism it supports requires changes to
span contiguous byte ranges, which is not necessarily the case for
arbitrary object updates.
In a database server context, this type of optimization can be
supported if the fields of the object are broken into database table
columns and a SQL update query only modifies a subset of the fields.
However, as we have seen in some preliminary evaluation, the
overheads associated with a SQL-based interface outweigh the
advantages of this optimization.

%% \footnote{It is unclear if
%% this optimization would outweigh the overheads associated with an SQL
%% based interface. Depending on the database server, it may be
%% necessary to issue a SQL update query that only updates a subset of a
%% tuple's fields in order to generate a diff-based log entry. Doing so
%% would preclude the use of prepared statements, or would require a large
%% number of prepared statements to be maintained by the DBMS. We plan to
%% investigate the overheads of SQL in this context in the future.}
% If IPC or
%the network is being used to communicate with the DBMS, then it is very
@ -2009,14 +2014,26 @@ only one integer field from a ~1KB object is modified, the fully
optimized \yad corresponds to a twofold speedup over the unoptimized
\yad.
In all cases, the update rate for mysql\footnote{We ran mysql using
InnoDB for the table engine, as it is the fastest engine that provides
similar durability to \yad. For this test, we also linked directly
with the mysqld daemon library, bypassing the RPC layer. In
experiments that used the RPC layer, test completion times were orders
of magnitude slower.} is slower than Berkeley DB,
which is slower than any of the \yad variants. This performance
difference is in line with those observed in Section
\ref{sub:Linear-Hash-Table}. We also see the increased overhead due to
the SQL processing for the mysql implementation, although we note that
a SQL variant of the diff-based optimization also provides performance
benefits.

In the second graph, we constrained the \yad buffer pool size to be a
small fraction of the size of the object cache, and bypassed the filesystem
buffer cache via the O\_DIRECT option. The goal of this experiment is to
focus on the benefits of the update/flush optimization in a simulated
scenario of memory pressure. From this graph, we see that as the percentage of
requests that are serviced by the cache increases, the
performance of the optimized \yad dramatically increases.
This result supports the hypothesis of the optimization, and
shows that by leveraging the object cache, we can reduce the load on
the page file and therefore the size of the buffer pool.
@ -2027,18 +2044,18 @@ whitespace, comments and boilerplate function registrations. Although
the reasoning required to ensure the correctness of this code is
complex, the simplicity of the implementation is encouraging.
In addition to the hashtable, which is required by \oasys's API, this
section made use of custom log formats and semantics to reduce log
bandwidth and page file usage. Berkeley DB supports a similar
partial-update mechanism, but it only
supports range updates and does not map naturally to \oasys's data
model. In contrast, our \yad extension simply makes upcalls
into the object serialization layer during recovery to ensure that the
compact, object-specific diffs that \oasys produces are correctly
applied. The custom log format, when combined with direct access to
the page file and buffer pool, drastically reduces disk and memory usage
for write-intensive loads. A simple extension to our recovery algorithm makes it
easy to implement other similar optimizations in the future.
%This section uses:
%