Russell Sears
Eric Brewer
Automated Verification and Optimization of Extensions to Transactional
Storage Systems.
Existing transactional systems are designed to handle specific
workloads well. Unfortunately, their implementations are tightly
coupled to those workloads and to particular data layouts, such as
those traditionally associated with SQL. Lower level implementations such
as Berkeley DB handle a wider variety of workloads and are built in a
modular fashion. However, they do not provide APIs to allow
applications to build upon and modify low level policies such as
allocation strategies, page layout or details of recovery semantics.
Furthermore, data structure implementations are typically not broken
into separable, public APIs, encouraging a "from scratch" approach to
the implementation of new transactional data structures.
Contrast this with the handling of data structures within modern object
oriented programming languages such as C++ or Java. Such languages
typically provide a large number of data storage algorithm
implementations. These structures may be used interchangeably with
application-specific data collections, and collection implementations
may be composed into more sophisticated data structures.
We have implemented LLADD (/yad/), an extensible transactional storage
library that takes a composable and layered approach to transactional
storage. Below, we present some of its high level features and
performance characteristics and discuss our plans to
extend the system into distributed domains. Finally, we introduce our
current research focus, the application of automated program
verification and optimization techniques to application-specific
extensions. Such techniques should significantly enhance the
usability and performance of our system, allowing application
developers to implement sophisticated cross-layer optimizations easily.
Overview of the LLADD Architecture
----------------------------------
General purpose transactional storage systems are extremely complex
and only handle certain types of workloads efficiently. However, new
types of applications and workloads are introduced on a regular basis.
This results in the implementation of specialized, ad-hoc data storage
systems from scratch, wasting resources and preventing code reuse.
Instead of developing a set of general purpose data structures that
attempt to behave well across many workloads, we have implemented a
lower level API that makes it easy for application designers to
implement specialized data structures. Essentially, we have
implemented an extensible navigational database system. We
believe that this system will support modern development practices and
address rapidly evolving applications before
appropriate general-purpose solutions have been developed.
In cases
where the development of a general-purpose solution is not economical,
our approach should lead to maintainable and efficient long-term
solutions. Semi-structured data stores provide good examples of both
types of scenarios. General XML storage technologies are improving
rapidly, but still fail to handle many types of applications.
For instance,
we know of no general purpose solution that seriously addresses
semi-structured scientific information, such as the large repositories
typical of bioinformatics research efforts[PDB, NCBI, Gene Ontology].
While many scientific projects are moving toward XML for their data
representation, we have found that XML is used primarily as a data
interchange format, and that existing XML tools fail to address the
needs of automated data mining, scientific computing and interactive
query systems.
LLADD is based upon an extensible version of ARIES but does not
hard-code details such as page format or data structure
implementation. It provides a number of "operation" implementations
which consist of redo/undo methods and wrapper functions. The redo/undo
methods manipulate the page file by applying log entries while the
wrapper functions produce log entries. Redo methods handle all page
file manipulation during normal forward operation, reducing the amount
of code that must be developed in order to implement new data structures.
LLADD handles the scheduling of redo/undo invocations, disk I/O, and all
of the other details specified by the ARIES recovery algorithm, allowing
operation implementors to focus on the details that are important to the
functionality their extension provides.
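
The division of labor described above can be illustrated with a minimal
sketch. It is written in Python for brevity and is not LLADD's actual
API: the names LogEntry, Store, t_set, recover and abort are
hypothetical, and a "page" is reduced to a single integer. The wrapper
only produces a log entry; the redo method performs the page update
during forward operation and again during replay.

```python
from dataclasses import dataclass, field


@dataclass
class LogEntry:
    lsn: int        # position of the entry in the log
    op: str         # name of the operation that owns this entry
    page: int       # page the entry applies to
    redo_arg: int   # value needed to reapply the update
    undo_arg: int   # value needed to reverse the update


@dataclass
class Store:
    pages: dict = field(default_factory=dict)  # page id -> integer value
    log: list = field(default_factory=list)    # append-only list of LogEntry


# An operation is a pair of methods that manipulate the page file.
def set_redo(store, entry):
    store.pages[entry.page] = entry.redo_arg


def set_undo(store, entry):
    store.pages[entry.page] = entry.undo_arg


OPERATIONS = {"set": (set_redo, set_undo)}  # hypothetical operation table


def t_set(store, page, value):
    """Wrapper function: build a log entry, log it, then let redo apply it."""
    old = store.pages.get(page, 0)
    entry = LogEntry(len(store.log), "set", page, value, old)
    store.log.append(entry)
    OPERATIONS[entry.op][0](store, entry)  # forward operation reuses redo


def recover(log):
    """Rebuild a page file from scratch by replaying redo for every entry."""
    store = Store(log=list(log))
    for entry in log:
        OPERATIONS[entry.op][0](store, entry)
    return store


def abort(store):
    """Roll back by invoking undo methods in reverse log order."""
    for entry in reversed(store.log):
        OPERATIONS[entry.op][1](store, entry)
```

Because the wrapper never touches the page file directly, the only code
an extension author writes in this sketch is the redo/undo pair and the
construction of the log entry.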
LLADD ships with a number of default data structures and
layouts, ranging from byte-level page layouts to linear hashtables
and application-specific recovery schemes and data structures.
These structures were developed with reusability in mind, encouraging
developers to compose existing operations into application-specific data
structures. For example, the hashtable is
implemented on top of reusable modules that implement a resizable array
and two exchangeable linked list variants.
In other work, we show that the system is competitive with
Berkeley DB on traditional (hashtable based) workloads, and show
significant performance improvements for less conventional workloads
including custom data structure implementations, graph traversal
algorithms and transactional object persistence workloads.
The transactional object persistence system was based upon the
observation that most object persistence schemes cache a second copy
of each in-memory object in a page file, and often keep a third copy
in operating system cache. By implementing custom operations that
assume the program maintains a correctly implemented object cache, we
allow LLADD to service object update requests without updating the
page file.
Since LLADD implements no-force, the only reason to update
the page file is to service future application read requests.
Therefore, we defer page file updates until the object is evicted from
the application's object cache, eliminating the need to maintain a large
page cache in order to efficiently service write requests. We also
leveraged our customizable log format to log differences to objects
instead of entire copies of objects.
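
A minimal sketch of this idea follows, in Python and under simplifying
assumptions: objects are dictionaries of fields, the log is an in-memory
list, and the class and method names (PageFile, CachedObjectStore,
update, evict) are hypothetical rather than part of LLADD. Updates
append a diff to the log and modify only the application's cached copy;
the page file is written once, at eviction.

```python
class PageFile:
    """Stand-in for the page file; counts writes so they can be observed."""

    def __init__(self):
        self.objects = {}
        self.writes = 0

    def write(self, oid, obj):
        self.objects[oid] = dict(obj)
        self.writes += 1


class CachedObjectStore:
    """Hypothetical store that defers page file updates until eviction."""

    def __init__(self, page_file):
        self.page_file = page_file
        self.cache = {}  # application object cache: oid -> dict of fields
        self.log = []    # list of (oid, {field: new_value}) diffs

    def update(self, oid, **changes):
        # Load the object into the cache on first use.
        if oid not in self.cache:
            self.cache[oid] = dict(self.page_file.objects.get(oid, {}))
        obj = self.cache[oid]
        # Log only the fields that actually changed, not the whole object.
        diff = {k: v for k, v in changes.items() if obj.get(k) != v}
        self.log.append((oid, diff))
        obj.update(diff)  # the page file is not touched here

    def evict(self, oid):
        # The single page file write happens when the object leaves the cache.
        self.page_file.write(oid, self.cache.pop(oid))
```

In this sketch any number of updates to a cached object cost one page
file write, and each log record is proportional to the size of the
change rather than the size of the object.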
With these optimizations, we showed a 2-3x performance improvement over Berkeley DB on object
persistence across our benchmarks, and a 3-4x improvement over an
in-process version of MySQL with the InnoDB backend. (A traditional
MySQL setup that made use of a separate server process was prohibitively
slow. InnoDB provided the best performance among MySQL's durable storage managers.)
Furthermore, our system uses memory more efficiently,
increasing its performance advantage in situations where the size of
system memory is a bottleneck.
We
leave systematic performance tuning of LLADD to future work, and
believe that further optimizations will improve our performance on
these benchmarks significantly.
On these workloads, LLADD's customizability yields better performance than existing,
complicated systems. Because of its natural integration into standard
system software development practices, we think that LLADD can be
naturally extended into networked and distributed domains.
For example, typical write-ahead-logging protocols implicitly
implement machine independent, reorderable log entries in order to
implement logical undo. These two properties have been crucial in
past system software designs, including data replication,
distribution, and conflict resolution algorithms. Therefore, we plan
to provide a networked, logical redo log as an application-level
primitive, and to explore system designs that leverage this approach.
Current Research Focus
----------------------
LLADD's design assumes that application developers will
implement high performance transactional data structures. This is a
big assumption, as these data structures are notoriously difficult to
implement correctly. Our current research attempts to address these
concerns.
For our infrastructure to be generally useful, the functionality that
it provides should be efficient, reliable and applicable to new
application domains. We believe that improvements to the development
process can address each of these goals.
Application developers typically have a limited amount of time to
spend implementing and verifying application-specific storage
extensions, but bugs in these extensions have dire consequences.
Also, while data structure algorithms tend to be simple and easily
understood, performance tuning and verification of implementation
correctness are extremely difficult.
Recovery based algorithms must behave correctly during forward
operation and also under arbitrary recovery scenarios. The latter
requirement is particularly difficult to verify due to the large
number of materialized page file states that could occur after a
crash.
Fortunately, write-ahead-logging schemes such as ARIES make use of
nested top actions to vastly simplify the problem. Given the
correctness of page based physical undo and redo, logical undo may
assume that page spanning operations are applied to the data store
atomically.
Existing work in the static-analysis community has verified that
device driver implementations correctly adhere to complex operating
system kernel locking schemes[SLAM]. If we formalize LLADD's latching
and logging APIs, we believe that analyses such as these will be
directly applicable, allowing us to verify that data structure
behavior during recovery is equivalent to the behavior that would
result if an abort() were issued on each prefix of the log that is
generated during normal forward operation.
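
The property can be stated concretely with a small exhaustive check. The
sketch below is a toy model in Python, not the static analysis we
propose and not LLADD code: pages hold integers, log entries carry
physical before and after images, and all function names are
hypothetical. Unlike ARIES, which consults page LSNs, recovery here
simply reapplies every entry, which is safe only because this toy redo
is idempotent. For every log prefix and every page file state a crash
could leave behind, recovery must agree with an abort() issued at that
point.

```python
from itertools import product


def redo(pages, entry):
    page, _old, new = entry
    pages[page] = new


def undo(pages, entry):
    page, old, _new = entry
    pages[page] = old


def forward(initial, updates):
    """Normal forward operation: returns the final pages and the log."""
    pages, log = dict(initial), []
    for page, new in updates:
        entry = (page, pages[page], new)  # physical before/after images
        log.append(entry)
        redo(pages, entry)
    return pages, log


def abort(pages, log):
    """Roll back a live transaction by undoing its log in reverse."""
    pages = dict(pages)
    for entry in reversed(log):
        undo(pages, entry)
    return pages


def crash_states(initial, log):
    """Yield page files where each page holds any value it had so far."""
    per_page = {page: [value] for page, value in initial.items()}
    for page, _old, new in log:
        per_page[page].append(new)
    names = sorted(per_page)
    for combo in product(*(per_page[p] for p in names)):
        yield dict(zip(names, combo))


def recover(disk, log):
    """Repeat history with redo, then undo the uncommitted transaction."""
    pages = dict(disk)
    for entry in log:
        redo(pages, entry)
    for entry in reversed(log):
        undo(pages, entry)
    return pages


def check_all_prefixes(initial, updates):
    """Compare recovery with abort() for every prefix and crash state."""
    _, full_log = forward(initial, updates)
    checked = 0
    for n in range(len(full_log) + 1):
        prefix = full_log[:n]
        live, _ = forward(initial, updates[:n])
        expected = abort(live, prefix)
        for disk in crash_states(initial, prefix):
            assert recover(disk, prefix) == expected
            checked += 1
    return checked
```

Even in this toy model the number of crash states grows multiplicatively
with the number of pages touched, which is why we look for ways to
shrink the space before searching it.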
By using coarse (one latch per logical operation) latching, we can
drastically reduce the size of the space of reachable states, allowing
conventional state-space search techniques (such as randomized or exhaustive
state-space searches, or unit testing techniques) to be
practical. It has been shown that such coarse grained latching can
yield high-performance concurrent data structures if
semantics-preserving optimizations such as page prefetching are
applied[ARIES/IM].
A separate approach to the static analysis of LLADD extensions uses
compiler optimization techniques. Software built on top of layered
APIs frequently makes repeated calls to low level functions that result
in repeated work. A common example in LLADD involves loops over data with
good locality in the page file. The vast majority of the time, these
loops result in a series of high level API calls that repeatedly pin
and unpin the same underlying data.
The code for each of these high level API calls could be copied into
many different variants with different pinning/unpinning and
latching/unlatching behavior, but this would greatly complicate the
API that application developers must work with, and complicate any
application code that made use of such optimizations.
Compiler optimization techniques such as partial common subexpression
elimination solve an analogous problem to remove redundant algebraic
computations. We hope to extend such techniques to reduce the number
of buffer manager and locking calls made by existing code at runtime.
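
To make the redundancy concrete, the sketch below writes out by hand the
kind of transformation we hope to automate. It is a Python illustration
under stated assumptions: BufferManager, read_record and the fixed
RECORDS_PER_PAGE layout are hypothetical and much simpler than LLADD's
buffer manager. The naive loop pins and unpins a page for every record;
the transformed loop keeps a page pinned while consecutive records fall
on it.

```python
RECORDS_PER_PAGE = 4  # hypothetical fixed-size page layout


class BufferManager:
    """Stand-in buffer manager that counts pin calls."""

    def __init__(self, pages):
        self.pages = pages
        self.pin_calls = 0

    def pin(self, page_id):
        self.pin_calls += 1
        return self.pages[page_id]

    def unpin(self, page_id):
        pass  # a real buffer manager would release the page here


def read_record(bm, rid):
    """High level API call: pins and unpins on every invocation."""
    page_id, slot = divmod(rid, RECORDS_PER_PAGE)
    page = bm.pin(page_id)
    try:
        return page[slot]
    finally:
        bm.unpin(page_id)


def sum_naive(bm, rids):
    """Application loop written against the high level API."""
    return sum(read_record(bm, rid) for rid in rids)


def sum_hoisted(bm, rids):
    """Same loop after removing redundant pin/unpin pairs by hand."""
    total, current, page = 0, None, None
    for rid in rids:
        page_id, slot = divmod(rid, RECORDS_PER_PAGE)
        if page_id != current:
            if current is not None:
                bm.unpin(current)
            page, current = bm.pin(page_id), page_id
        total += page[slot]
    if current is not None:
        bm.unpin(current)
    return total
```

For a scan over records stored in page order, the naive loop pins once
per record while the transformed loop pins once per page. The goal is
for developers to write the first form and obtain the second
automatically.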
Anecdotal evidence and personal experience suggest
that similar optimization techniques are applicable to
application code. Because local LLADD calls are simply normal
function calls, it may even be possible to apply the transformations that these optimizations
perform to application code that is unaware of the underlying storage implementation.
This class of
optimizations would be very difficult to implement with existing
transactional storage systems but should significantly improve application performance.
Our implementation of LLADD is still unstable and inappropriate for
use on important data. We hope to validate our ideas about static analysis
by incorporating them into the development process as we increase
the reliability and overall quality of LLADD's implementation and its
APIs.
Our architecture provides a set of tools that allow applications to implement
custom transactional data structures and page layouts. This avoids
"impedance mismatch," simplifying applications and providing appropriate
applications with performance that is comparable or superior to other
general-purpose solutions.
By adding support for automated code verification and
transformations we hope to make it easy to produce correct extensions
and to allow simple, maintainable implementations to compete with
special purpose, hand-optimized code.