commit 8d6ea32717 (parent 382f75ba13)

    update oasys section

1 changed file with 46 additions and 72 deletions
@@ -1740,8 +1740,8 @@ serialization is also a convenient way of adding persistent storage to
 an existing application without developing an explicit file format or
 dealing with low-level I/O interfaces.
 
-A simple serialization scheme would bulk-write and bulk-read
-sets of application objects to an OS file. These
+A simple object serialization scheme would bulk-write and bulk-read
+sets of application objects to an OS file. These simple
 schemes suffer from high read and write latency, and do not handle
 small updates well. More sophisticated schemes store each object in a
 seperate, randomly accessible record, such as a database tuple or
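The hunk above contrasts two schemes: bulk serialization of the whole object set to an OS file, versus one randomly accessible record per object. A minimal Python sketch of that trade-off (pickle and a plain dict stand in for the OS file and the database/Berkeley DB record store; all names here are illustrative, not the paper's API):

```python
import os
import pickle
import tempfile

class Obj:
    """A toy application object."""
    def __init__(self, oid, payload):
        self.oid = oid
        self.payload = payload

objs = {i: Obj(i, "x" * 8) for i in range(100)}

# Bulk scheme: the whole object set is written and read in one shot, so
# even a one-field change forces a full rewrite (high latency, and small
# updates are handled poorly).
path = os.path.join(tempfile.mkdtemp(), "objs.bin")
with open(path, "wb") as f:
    pickle.dump(objs, f)
with open(path, "rb") as f:
    restored = pickle.load(f)

# Record scheme: one separately accessible record per object, so a
# single-object update rewrites only that record.
records = {i: pickle.dumps(o) for i, o in objs.items()}
objs[42].payload = "y" * 8
records[42] = pickle.dumps(objs[42])  # touch one record, not the whole set
```

The record scheme is the fast-single-object-access design the paper attributes to application servers.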
@@ -1749,10 +1749,11 @@ a Berkeley DB hashtable entry. These schemes allow for fast single
 object reads and writes, and are typically the solutions used by
 application servers.
 
-One drawback of many such schemes is that any update typically
-requires a full serialization of the entire object. In many
-application scenarios, this can be highly inefficient, as it may be
-that only a single field of a complex object has been modified.
+However, one drawback of many such schemes is that any update requires
+a full serialization of the entire object. In some application
+scenarios, this can be extremely inefficient, as it may be the case
+that only a single field from a large complex object has been
+modified.
 
 Furthermore, most of these schemes ``double cache'' object
 data. Typically, the application maintains a set of in-memory
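The inefficiency this hunk describes, a full re-serialization when only one field changed, is easy to quantify. A hedged sketch, assuming pickle as the marshalling layer and a hypothetical `Customer` object (not from the paper):

```python
import pickle

class Customer:
    def __init__(self):
        self.name = "a" * 100
        self.history = ["order"] * 200  # large, rarely modified state
        self.balance = 0

c = Customer()
c.balance = 42  # only one small field actually changed

full_record = pickle.dumps(c)                # whole-object update record
field_delta = pickle.dumps(("balance", 42))  # change-only (diff) record

# The change-only record is far smaller than re-serializing the object,
# which is the intuition behind diff-based log records.
```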
@@ -1760,60 +1761,31 @@ objects in their unserialized form, so they can be accessed with low latency.
 The backing data store also
 maintains a separate in-memory buffer pool with the serialized versions of
 some objects, as a cache of the on-disk data representation.
-Accesses to objects that are only present in this buffer
+Accesses to objects that are only present in the serialized buffers
 pool incur medium latency, as they must be unmarshalled (deserialized)
 before the application may access them.
+There may even be a third copy of this data resident in the filesystem
+buffer cache, accesses to which incur latency of both system call overhead and
+the unmarshalling cost.
 
-\rcs{ MIKE FIX THIS }
-Worse, most transactional layers (including ARIES) must read a page into memory to
-service a write request to the page. If the transactional layer's page cache
-is too small, write requests must be serviced with potentially random disk I/O.
-This removes the primary advantage of write ahead logging, which is to ensure
-application data durability with sequential disk I/O.
+However, naively constraining the size of the data store's buffer pool
+causes performance degradation. Most transactional layers
+(including ARIES) must read a page
+into memory to service a write request to the page; if the buffer pool
+is too small, these operations trigger potentially random disk I/O.
+This removes the primary
+advantage of write ahead logging, which is to ensure application data
+durability with mostly sequential disk I/O.
 
-In summary, this system architecture (though commonly deployed~\cite{ejb,ordbms,jdo,...}) is fundamentally
+In summary, this system architecture (though commonly
+deployed~\cite{ejb,ordbms,jdo,...}) is fundamentally
 flawed. In order to access objects quickly, the application must keep
-its working set in cache. In order to service write requests, the
-transactional layer must store a redundant copy of the entire working
-set in memory or resort to random I/O. Therefore, roughly half of
-system memory must be wasted by any write intensive application.
+its working set in cache. Yet in order to efficiently service write
+requests, the
+transactional layer must store a copy of serialized objects
+in memory or resort to random I/O.
+Thus, any given working set size requires roughly double the system
+memory to achieve good performance.
 
-%There is often yet a third
-%copy of the serialized data in the filesystem's buffer cache.
-
-%Finally, some objects may
-%only reside on disk, and require a disk read.
-
-%Since these applications are typically data-centric, it is important
-%to make efficient use of system memory in order to reduce hardware
-%costs.
-
-For I/O bound applications, efficient use of in-memory caching is
-well-known to be critical to performance. Note that for these schemes,
-the memory consumed by the buffer pool is basically redundant, since
-it just caches the translated form of the object so it can be read or
-written to disk. However, naively restricting the memory consumed by
-the buffer pool results in poor performance in existing transactional
-storage systems. This is due to the fact that an object update must
-update the current state of the backing store, which typically
-requires reading in the old copy of the page on which the object is
-stored to update the object data.
-
-%% A straightforward solution to this problem would be to bound
-%% the amount of memory the application may consume by preventing it from
-%% caching deserialized objects. This scheme conserves memory, but it
-%% incurs the cost of an in-memory deserialization to read the object,
-%% and an in-memory deserialization/serialization cycle to write to an
-%% object.
-
-%% Alternatively, the amount of memory consumed by the buffer pool could
-%% be bounded to some small value, and the application could maintain a
-%% large object cache. This scheme would incur no overhead for a read
-%% request. However, it would incur the overhead of a disk-based
-%% serialization in order to service a write request.\footnote{In
-%% practice, the transactional backing store would probably fetch the
-%% page that contains the object from disk, causing two disk I/O's.}
 
 \subsection{\yad Optimizations}
 
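The ``double cache'' problem this hunk rewrites can be made concrete with a small accounting sketch. Python dicts stand in for the application's deserialized object cache and the store's serialized buffer pool; the object sizes are illustrative, not measurements from the paper:

```python
import pickle
import sys

class Obj:
    def __init__(self, payload):
        self.payload = payload

# The application keeps its working set deserialized for low-latency
# access...
object_cache = {i: Obj("x" * 64) for i in range(1000)}

# ...while the backing store's buffer pool caches serialized copies of
# the same objects, so the same logical data occupies memory twice.
buffer_pool = {i: pickle.dumps(o) for i, o in object_cache.items()}

cache_bytes = sum(sys.getsizeof(o.payload) for o in object_cache.values())
pool_bytes = sum(len(b) for b in buffer_pool.values())
# Roughly double the working set's footprint is resident for the same
# logical data, matching the hunk's "double the system memory" claim.
```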
@@ -1845,29 +1817,32 @@ investigate the overheads of SQL in this context in the future.}
 % @todo WRITE SQL OASYS BENCHMARK!!
 
 The second optimization is a bit more sophisticated, but still easy to
-implement in \yad. We do not believe that it would be possible to
-achieve using existing relational database systems or with Berkeley
-DB. This optimization allows us to drastically limit the size of the
+implement in \yad. This optimization allows us to drastically limit
+the size of the
 \yad buffer cache, yet still achieve good performance.
+We do not believe that it would be possible to
+achieve using existing relational database systems or with Berkeley DB.
 
 The basic idea of this optimization is to postpone expensive
-operations that update the page file for objects that are frequently
-modified, relying on some support from the application's object cache
-to maintain the transactional semantics.
+operations that update the page file for frequently modified objects,
+relying on some support from the application's object cache
+to maintain transactional semantics.
 
 To implement this, we added two custom \yad operations. The
 {\tt``update()''} operation is called when an object is modified and
 still exists in the object cache. This causes a log entry to be
 written, but does not update the page file. The fact that the modified
-object still resides in the object cache guarantees that the now stale
+object still resides in the object cache guarantees that the (now stale)
 records will not be read from the page file. The {\tt ``flush()''}
 operation is called whenever a modified object is evicted from the
 cache. This operation updates the object in the buffer pool (and
-therefore the page file), likely incurring the cost of a disk {\em
+therefore the page file), likely incurring the cost of both a disk {\em
 read} to pull in the page, and a {\em write} to evict another page
-from the relative small buffer pool. Multiple modifications that
-update an object can then incur relatively inexpensive log additions,
-and are then coalesced into a single update to the page file.
+from the relative small buffer pool. However, since popular
+objects tend to remain in the object cache, multiple update
+modifications will incur relatively inexpensive log additions,
+and are only coalesced into a single modification to the page file
+when the object is flushed from cache.
 
 \yad provides a few mechanisms to handle undo records in the context
 of object serialization. The first is to use a single transaction for
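The update()/flush() mechanism described in this hunk can be sketched as a toy model. This is not \yad's C implementation: the log, page file, and object cache below are in-memory stand-ins, pickle substitutes for the real record format, and all class names are hypothetical.

```python
import pickle

class Log:
    """Append-only stand-in for the write-ahead log."""
    def __init__(self):
        self.entries = []
    def append(self, oid, delta):
        self.entries.append((oid, delta))

class PageFile:
    """Stand-in for the buffer pool / page file."""
    def __init__(self):
        self.records = {}
    def write(self, oid, data):
        self.records[oid] = data

class Rec:
    """A toy cached object."""
    def __init__(self):
        self.field = 0

class ObjectCache:
    def __init__(self, log, pages):
        self.log, self.pages, self.objs = log, pages, {}
    def put(self, oid, obj):
        self.objs[oid] = obj
    def update(self, oid, name, value):
        # update(): log the change but defer the page-file write; the
        # cached copy hides the now-stale on-disk record from readers.
        setattr(self.objs[oid], name, value)
        self.log.append(oid, (name, value))
    def flush(self, oid):
        # flush(): on eviction, write the object back once, coalescing
        # all logged updates into a single page-file modification.
        obj = self.objs.pop(oid)
        self.pages.write(oid, pickle.dumps(obj))

log, pages = Log(), PageFile()
cache = ObjectCache(log, pages)
cache.put(1, Rec())
cache.update(1, "field", 7)  # cheap: log append only, no page write
cache.update(1, "field", 9)  # cheap: log append only, no page write
cache.flush(1)               # one coalesced page-file write on eviction
```

As in the paper, repeated updates to a popular object cost only log appends, and the page file sees a single write when the object is finally evicted.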
@@ -1896,12 +1871,11 @@ most difficult to implement in another storage system.
 \includegraphics[%
 width=1\columnwidth]{mem-pressure.pdf}
 \caption{\label{fig:OASYS} \yad optimizations for object
-serialization. The first graph shows the effectiveness of both the
-diff-based log records and the update/flush optimization as a function
-of the portion of each object that is modified. The second graph
-disables the filesystem buffer cache (via O\_DIRECT) and shows the
-benefits of the update/flush optimization when there is memory
-pressure.}
+serialization. The first graph shows the effect of the two lladd
+optimizations as a function of the portion of the object that is being
+modified. The second graph focuses on the
+benefits of the update/flush optimization in cases of system
+memory pressure.}
 \end{figure*}
 
 An observant reader may have noticed a subtle problem with this