Made a pass on the experimental setup.

parent da502b4920
commit 505f3ac605
1 changed file with 103 additions and 171 deletions
@ -108,7 +108,7 @@ easy to implement and significantly improve performance.
\section{Introduction}
\label{sec:intro}
As our reliance on computing infrastructure increases, a wider range
of applications requires robust data management. Traditionally, data
management has been the province of database management systems
@ -302,7 +302,7 @@ support, or to abandon the database approach entirely, and forgo the
use of a structured physical model and abstract conceptual mappings.

\subsection{The Systems View}
\label{sec:systems}
The systems community has also worked on this mismatch for 20 years,
which has led to many interesting projects. Examples include
alternative durability models such as QuickSilver~\cite{experienceWithQuickSilver},
@ -1059,26 +1059,24 @@ We used Berkeley DB 4.2.52
%as it existed in Debian Linux's testing branch during March of 2005,
with the flags DB\_TXN\_SYNC (sync log on commit) and
DB\_THREAD (thread safety) enabled. These flags were chosen to match Berkeley DB's
configuration to \yads as closely as possible. We
increased Berkeley DB's buffer cache and log buffer sizes to match
\yads default sizes. When
Berkeley DB implements a feature that \yad is missing, we enable the feature if it
improves benchmark performance.

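For reference, a minimal sketch of a Berkeley DB environment configured along these lines, using the 4.2 C API. The cache and log-buffer sizes below are placeholders rather than the settings used in our runs, and leaving out DB\_INIT\_LOCK is one way to run without the lock manager discussed next.

\begin{verbatim}
#include <db.h>   /* Berkeley DB 4.2 C API */

/* Sketch only: the sizes here are placeholders, not benchmark settings. */
DB_ENV *open_bdb_env(const char *home) {
    DB_ENV *env;
    if (db_env_create(&env, 0) != 0) return NULL;

    env->set_cachesize(env, 0, 8 * 1024 * 1024, 1);  /* buffer cache */
    env->set_lg_bsize(env, 1024 * 1024);             /* log buffer   */

    /* DB_THREAD gives free-threaded handles; leaving out DB_INIT_LOCK
       runs without the lock manager.  Commits flush the log when the
       transaction is committed with txn->commit(txn, DB_TXN_SYNC). */
    u_int32_t flags = DB_CREATE | DB_RECOVER | DB_THREAD |
                      DB_INIT_MPOOL | DB_INIT_LOG | DB_INIT_TXN;
    if (env->open(env, home, flags, 0) != 0) {
        env->close(env, 0);
        return NULL;
    }
    return env;
}
\end{verbatim}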

We disable Berkeley DB's lock manager for the benchmarks,
though we still use ``Free Threaded'' handles for all
tests. This yields a significant increase in performance because it
removes the possibility of transaction deadlock, abort, and
repetition. However, disabling the lock manager caused
concurrent Berkeley DB benchmarks to become unstable, suggesting either a
bug or misuse of the feature.

With the lock manager enabled, Berkeley
DB's performance in the multithreaded test in Section~\ref{sec:lht} strictly decreased with
increased concurrency. (The other tests were single-threaded.)

Although further tuning by Berkeley DB experts would probably improve
Berkeley DB's numbers, we think that we have produced a reasonably
fair comparison. The results presented here have been reproduced on
@ -1109,14 +1107,21 @@ test is run as a single transaction, minimizing overheads due to synchronous log
}
\end{figure}

This section presents two hash table implementations built on top of
\yad, and compares them with the hash table provided by Berkeley DB.
One of the \yad implementations is simple and modular, while
the other is monolithic and hand-tuned. Our experiments show that
\yads performance is competitive with both single-threaded and
highly concurrent transactions.

%Although the beginning of this paper describes the limitations of
%physical database models and relational storage systems in great
%detail, these systems are the basis of most common transactional
%storage routines. Therefore, we implement a key-based access method
%in this section. We argue that obtaining reasonable performance in
%such a system under \yad is straightforward. We then compare our
%straightforward, modular implementation to our hand-tuned version and
%Berkeley DB's implementation.

The modular hash table uses nested top actions to update its internal
structure atomically. It uses a {\em linear} hash
@ -1222,7 +1227,7 @@ customizes the behavior of the buffer manager. Finally, the
between versions of objects.

The update/flush variant avoids maintaining an up-to-date
version of each object in the buffer manager or page file. Instead, it allows
the buffer manager's view of live application objects to become stale.
This is safe since the system is always able to reconstruct the
appropriate page entry from the live copy of the object.
@ -1232,10 +1237,10 @@ number of times the \yad \oasys plugin must update serialized objects in the buf
% Reducing the number of serializations decreases
%CPU utilization, and it also
This allows us to drastically decrease the
amount of memory used by the buffer manager, and increase the size of
the application's cache of live objects.

We implemented the \yad buffer pool optimization by adding two new
operations: update(), which updates the log when objects are modified, and flush(), which
updates the page when an object is evicted from the application's cache.

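As a rough illustration, the two operations have the following shape. The helpers log\_blind\_update(), page\_pin\_record(), and page\_unpin\_record() are stand-ins for \yads log and buffer-manager interfaces, which are not shown here.

\begin{verbatim}
#include <stddef.h>
#include <string.h>

/* Stand-ins for \yad interfaces; illustrative only. */
extern void log_blind_update(int xid, long oid, const void *dat, size_t len);
extern void *page_pin_record(long oid);
extern void  page_unpin_record(long oid, int dirty);

typedef struct { long oid; size_t len; char *bytes; } obj_t;

/* update(): record the change in the log, but leave the page alone. */
void oasys_update(int xid, obj_t *o) {
    log_blind_update(xid, o->oid, o->bytes, o->len);
}

/* flush(): called when the object leaves the application's cache;
   only now is the buffer manager's copy brought up to date. */
void oasys_flush(int xid, obj_t *o) {
    (void)xid;
    void *rec = page_pin_record(o->oid);
    memcpy(rec, o->bytes, o->len);
    page_unpin_record(o->oid, 1 /* dirty */);
}
\end{verbatim}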
@ -1250,76 +1255,35 @@ are evicted from cache, not the order in which they are udpated.
Therefore, the version of each object on a page cannot be determined
from a single LSN.

We solve this problem by using blind updates to modify
objects in place, but maintain a per-page LSN that is updated whenever
an object is allocated or deallocated. At recovery, we apply
allocations and deallocations based on the page LSN. To redo an
update, we first decide whether the object that is being updated
exists on the page. If so, we apply the blind update. If not, then
the object must have already been freed, so we do not apply the
update. Because support for blind updates is not yet implemented, the
experiments presented below mimic this behavior at runtime, but do not
support recovery.

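The redo logic described above can be sketched as follows; the types and helper functions are hypothetical, since this recovery path is not yet implemented, and only the control flow matters here.

\begin{verbatim}
#include <stddef.h>

typedef struct page page_t;              /* hypothetical */
typedef struct { int type; long lsn; long oid;
                 const void *dat; size_t len; } log_entry_t;
enum { OBJ_ALLOC, OBJ_DEALLOC, OBJ_UPDATE };

/* Placeholders for \yad internals. */
extern long page_lsn(page_t *p);
extern void set_page_lsn(page_t *p, long lsn);
extern int  record_exists(page_t *p, long oid);
extern void redo_alloc_or_dealloc(page_t *p, const log_entry_t *e);
extern void apply_blind_update(page_t *p, const log_entry_t *e);

void redo_object_entry(page_t *p, const log_entry_t *e) {
    if (e->type == OBJ_ALLOC || e->type == OBJ_DEALLOC) {
        /* Allocation and deallocation follow the usual LSN rule:
           replay only if the page is older than the log entry. */
        if (page_lsn(p) < e->lsn) {
            redo_alloc_or_dealloc(p, e);
            set_page_lsn(p, e->lsn);
        }
    } else {
        /* Blind updates ignore the page LSN; all that matters is
           whether the object still exists on this copy of the page.
           If it does not, the page was written out after the object
           was freed, and the update is skipped. */
        if (record_exists(p, e->oid))
            apply_blind_update(p, e);   /* idempotent overwrite */
    }
}
\end{verbatim}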
Before we came to this solution, we considered storing multiple LSNs
per page, but this would force us to register a callback with recovery
to process the LSNs, and extend one of \yads page formats to contain
per-record LSNs. More importantly, the storage allocation routine needs
to avoid overwriting the per-object LSN of deleted objects that may be
manipulated during REDO.

\eab{we should at least implement this callback if we have not already}

Alternatively, we could arrange for the object pool to cooperate
further with the buffer pool by atomically updating the buffer
manager's copy of all objects that share a given page.

The third plugin variant, ``delta'', incorporates the update/flush
optimizations, but only writes the changed portions of
objects to the log. Because of \yads support for custom log-entry
formats, this optimization is straightforward.

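A sketch of the delta computation, assuming the old and new serialized images of an object have the same length; log\_custom\_entry() is a placeholder for \yads custom log-entry interface.

\begin{verbatim}
#include <stddef.h>

/* Placeholder for \yad's custom log-entry interface. */
extern void log_custom_entry(int xid, long oid,
                             size_t off, const void *dat, size_t len);

/* Log only the byte range that differs between the old and new
   serialized images of an object (same-length images assumed). */
void log_delta(int xid, long oid,
               const char *old_img, const char *new_img, size_t len) {
    size_t lo = 0, hi = len;
    while (lo < len && old_img[lo] == new_img[lo]) lo++;         /* first change */
    while (hi > lo && old_img[hi - 1] == new_img[hi - 1]) hi--;  /* last change  */
    if (hi > lo)
        log_custom_entry(xid, oid, lo, new_img + lo, hi - lo);
}
\end{verbatim}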

\oasys does not provide a transactional interface to its callers.
Instead, it is designed to be used in systems that stream objects over
an unreliable network connection. The objects are independent of each
@ -1360,7 +1324,7 @@ transactions. (Although it is applying each individual operation
atomically.)

In non-memory bound systems, the optimizations nearly double \yads
performance by reducing the CPU overhead of marshalling and
unmarshalling objects, and by reducing the size of log entries written
to disk.

@ -1371,7 +1335,7 @@ so that 10\% fit in a {\em hot set} that is small enough to fit into
memory. We then measured \yads performance as we varied the
percentage of object updates that manipulate the hot set. In the
memory bound test, we see that update/flush indeed improves memory
utilization. \rcs{Graph axis should read ``percent of updates in hot set''}

\subsection{Request reordering}

@ -1401,10 +1365,13 @@ In the cases where depth first search performs well, the
reordering is inexpensive.}
\end{figure}

We are interested in using \yad to directly manipulate sequences of
application requests. By translating these requests into the logical
operations that are used for logical undo, we can use parts of \yad to
manipulate and interpret such requests. Because logical operations
can be invoked at arbitrary times in the future, they tend to be
independent of the database's physical state. Also, they generally
correspond to application-level operations.

Because of this, application developers can easily determine whether
logical operations may be reordered, transformed, or even dropped from
@ -1412,10 +1379,10 @@ the stream of requests that \yad is processing. For example, if
requests manipulate disjoint sets of data, they can be split across
many nodes, providing load balancing. If many requests perform
duplicate work, or repeatedly update the same piece of information,
they can be merged into a single request (RVM's ``log-merging''
implements this type of optimization~\cite{lrvm}). Stream aggregation
techniques and relational algebra operators could be used to
efficiently transform data while it is still laid out sequentially in
non-transactional memory.

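For instance, repeated updates to the same key could be collapsed before they reach the transactional store, in the spirit of log-merging; the request layout and the quadratic scan below are purely illustrative.

\begin{verbatim}
#include <stddef.h>

typedef struct { long key; long value; } request_t;   /* illustrative */

/* Collapse duplicate keys in place, keeping the newest value;
   returns the number of surviving requests. */
size_t merge_requests(request_t *reqs, size_t n) {
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        size_t j;
        for (j = 0; j < out; j++) {
            if (reqs[j].key == reqs[i].key) {   /* same key: keep newest */
                reqs[j].value = reqs[i].value;
                break;
            }
        }
        if (j == out) reqs[out++] = reqs[i];    /* first occurrence of key */
    }
    return out;
}
\end{verbatim}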

To experiment with the potential of such optimizations, we implemented
@ -1446,7 +1413,7 @@ of a hot set to graph generation. Each node has a distinct hot set
that includes the 10\% of the nodes that are closest to it in ring
order. The remaining nodes are in the cold set. We use random edges
instead of ring edges for this test. This does not ensure graph
connectivity, but we use the same set of graphs when evaluating the two systems.

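One way to draw such edges is sketched below; the parameter names and the hot-set probability are illustrative, and the routine makes no attempt to guarantee connectivity.

\begin{verbatim}
#include <stdlib.h>

/* Pick a destination for an edge leaving src.  With probability p_hot
   the edge stays within the hot_frac of nodes nearest src in ring
   order; otherwise the destination is chosen uniformly at random. */
long random_edge(long src, long n_nodes, double hot_frac, double p_hot) {
    long hot_span = (long)(n_nodes * hot_frac);   /* e.g. 10% of the nodes */
    if (hot_span < 1) hot_span = 1;
    if ((double)rand() / RAND_MAX < p_hot) {
        long off = rand() % hot_span - hot_span / 2;
        return ((src + off) % n_nodes + n_nodes) % n_nodes;
    }
    return rand() % n_nodes;   /* may land anywhere; connectivity not ensured */
}
\end{verbatim}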

When the graph has good locality, a normal depth first search
traversal and the prioritized traversal both perform well. The
@ -1701,69 +1668,37 @@ available to applications. In QuickSilver, nested transactions would
have been most useful when composing a series of program invocations
into a larger logical unit~\cite{experienceWithQuickSilver}.

\subsection{Transactional data structures}

\rcs{Better section name?}

As mentioned in Section~\ref{sec:system}, Berkeley DB is a system
quite similar to \yad, and essentially provides raw access to
transactional data structures for application
programmers~\cite{libtp}. As we mentioned earlier, we believe that
\yad is general enough to support a library like Berkeley DB, but that
Berkeley DB is too specialized to be useful to a reimplementation of
\yad.

Cluster hash tables provide a scalable, replicated hashtable
implementation by partitioning the hash's buckets across multiple
systems. Boxwood treats each system in a cluster of machines as a
``chunk store,'' and builds a transactional, fault-tolerant B-Tree on
top of the chunks that these machines export.

\yad is complementary to Boxwood and cluster hash tables; those
systems intelligently compose a set of systems for scalability and
fault tolerance. In contrast, \yad makes it easy to push intelligence
into the individual nodes, allowing them to provide primitives that
are appropriate for the higher-level service.

\subsection{Data layout policies}

Data layout policies typically make decisions that have significant
impacts upon performance. Generally, these decisions are based upon
assumptions about the application. Allowing \yad operations to make
use of application-specific layout policies would increase their
flexibility.\rcs{Fix sentence.}

Different large object storage systems provide different APIs.
Some allow arbitrary insertion and deletion of bytes~\cite{esm}
@ -1812,28 +1747,6 @@ minimum, this is particularly attractive on a single disk system. We
plan to use ideas from LFS~\cite{lfs} and POSTGRES~\cite{postgres}
to implement this.

\section{Future Work}

Complexity problems may begin to arise as we attempt to implement more
@ -1895,11 +1808,13 @@ Gilad Arnold and Amir Kamil implemented
pobj. Jim Blomo, Jason Bayer, and Jimmy
Kittiyachavalit worked on an early version of \yad.

Thanks to C. Mohan for pointing out that per-object LSNs may be
inadvertently overwritten during recovery. Jim Gray suggested we use
a resource manager to track dependencies within \yad and provided
feedback on the LSN-free recovery algorithms. Joe Hellerstein and
Mike Franklin provided us with invaluable feedback.

Intel Research Berkeley supported portions of this work.

\section{Availability}
\label{sec:avail}
@ -2005,4 +1920,21 @@ implementation must obey a few more invariants:
\end{itemize}
}

\subsection{stuff to add somewhere}

cover P2 (the old one, not Pier 2) if there is time...

More recently, WinFS, Microsoft's database-based
file metadata management system, has been replaced in favor of an
embedded indexing engine that imposes less structure (and provides
fewer consistency guarantees) than the original
proposal~\cite{needtocitesomething}.

Scaling to the very large doesn't work (SAP used DB2 as a hash table
for years); search engines and CAD/VLSI didn't happen; scalable GIS
systems use shredded blobs (TerraServer, Google Maps); scaling to many
was more difficult than implementing from scratch (WinFS); scaling
down doesn't work (variance in performance, footprint).

\end{document}