intro
parent
c31b497b62
commit
c97082e3a0
1 changed files with 69 additions and 54 deletions
|
@ -16,13 +16,14 @@
|
|||
% by the Word sample file.
|
||||
% This version uses the latex2e styles, not the very ancient 2.09 stuff.
|
||||
\documentclass[letterpaper,twocolumn,10pt]{article}
|
||||
\usepackage{usenix,epsfig,endnotes,xspace}
|
||||
\usepackage{usenix,epsfig,endnotes,xspace,color}
|
||||
|
||||
% Name candidates:
|
||||
% Anza
|
||||
% Void
|
||||
% Station (from Genesis's "Grand Central" component)
|
||||
% TARDIS: Atomic, Recoverable, Datamodel Independent Storage
|
||||
% EAB: flex, basis, stable, dura
|
||||
|
||||
\newcommand{\yad}{Void\xspace}
|
||||
\newcommand{\oasys}{Juicer\xspace}
|
||||
|
@ -61,18 +62,25 @@ UC Berkeley
|
|||
|
||||
\subsection*{Abstract}
|
||||
|
||||
\yad is a storage framework that incorporates ideas from traditional
|
||||
write-ahead-logging storage algorithms and file system technologies,
|
||||
while providing applications with increased control over its
|
||||
underlying modules. Generic transactional storage systems such as SQL
|
||||
and BerkeleyDB serve many applications well, but impose constraints
|
||||
that are undesirable to developers of system software and
|
||||
high-performance applications. Conversely, while filesystems place
|
||||
few constraints on applications, they do not provide atomicity or
|
||||
durability properties that naturally correspond to application needs.
|
||||
There is an increasing need to manage data well in a wide variety of
|
||||
systems, including robust support for atomic durable concurrent
|
||||
transactions. Databases provide the default solution, but force
|
||||
applications to interact via SQL and to forfeit control over data
|
||||
layout and access mechanisms. We argue there is a gap between DBMSs and file systems that limits designers of data-oriented applications.
|
||||
|
||||
This paper addresses this gap (and enables the development of
|
||||
unforeseen variants on transactional storage) by generalizing
|
||||
\yad is a storage framework that incorporates ideas from traditional
|
||||
write-ahead-logging storage algorithms and file systems,
|
||||
while providing applications with flexible control over data structure, layout and performance vs. robustness tradeoffs.
|
||||
% increased control over their
|
||||
%underlying modules. Generic transactional storage systems such as SQL
|
||||
%and BerkeleyDB serve many applications well, but impose constraints
|
||||
%that are undesirable to developers of system software and
|
||||
%high-performance applications. Conversely, while filesystems place
|
||||
%few constraints on applications, they do not provide atomicity or
|
||||
%durability properties that naturally correspond to application needs.
|
||||
|
||||
\yad enables the development of
|
||||
unforeseen variants on transactional storage by generalizing
|
||||
write-ahead-logging algorithms. Our partial implementation of these
|
||||
ideas already provides specialized (and cleaner) semantics and
|
||||
improved performance to applications.
|
||||
|
@ -80,17 +88,18 @@ improved performance to applications.
|
|||
%Applications may use our modular library of basic data structures to
|
||||
%compose new concurrent transactional access methods, or write their
|
||||
%own from scratch.
|
||||
This paper presents examples that make use of custom access methods,
|
||||
|
||||
We present examples that make use of custom access methods,
|
||||
modified buffer manager semantics, direct log file manipulation, and
|
||||
LSN-free pages that facilitate zero-copy optimizations, and discuss
|
||||
the composability of these extensions.
|
||||
|
||||
We argue that our ability to support such a diverse range of
|
||||
transactional systems stems directly from our rejectiion of
|
||||
assumptions made by early database designers. These assumptions
|
||||
permeate ``database toolkit'' research. We attribute the success of
|
||||
low-level transaction processing libraries (such as Berkeley DB) to
|
||||
a partial break from traditional database dogma.
|
||||
%We argue that our ability to support such a diverse range of
|
||||
%transactional systems stems directly from our rejection of
|
||||
%assumptions made by early database designers. These assumptions
|
||||
%permeate ``database toolkit'' research. We attribute the success of
|
||||
%low-level transaction processing libraries (such as Berkeley DB) to
|
||||
%a partial break from traditional database dogma.
|
||||
|
||||
% entries, and
|
||||
% to reduce memory and
|
||||
|
@ -118,6 +127,8 @@ a partial break from traditional database dogma.
|
|||
%this has happened, the abstractions provided by database systems have
|
||||
%seriously restricted system designs and implementations.
|
||||
|
||||
|
||||
|
||||
Approximately a decade ago, the operating systems research community came to
|
||||
the painful realization that the presence of high level abstractions
|
||||
in ``unavoidable'' system components precluded the development of
|
||||
|
@ -153,6 +164,8 @@ services, map and trip planning services, ticket reservation systems,
|
|||
photo and video repositories, bioinformatics, version control systems,
|
||||
workflow applications, CAD/VLSI applications and directory services.
|
||||
|
||||
\eab{need to talk about positive examples: LRVM, Berk DB, windows registry? Grid FS from Wisconsin}
|
||||
|
||||
Applications that have only recently begun to make use of high-level
|
||||
database features include XML-based systems, object persistence
|
||||
mechanisms, and enterprise management systems (notably, SAP R/3).
|
||||
|
@ -209,17 +222,19 @@ when possible.
|
|||
**We've explained why the sky is falling. Now, explain why \yad is
|
||||
so good. (Take ideas from old paper.)**
|
||||
|
||||
\section{Prior work}
|
||||
\section{\yad is not a Database}
|
||||
|
||||
Database research has a long history, including the development of
|
||||
many technologies that our system builds upon. However, we view \yad
|
||||
as a rejection of the fundamental assumptions that underlie database
|
||||
systems. In particular, we reject the idea that a general purpose
|
||||
systems. In particular, we reject the idea that a general-purpose
|
||||
storage system should attempt to encode universal data models and
|
||||
computational paradigms.
|
||||
computational paradigms. Although we accept that such data models may
|
||||
make sense for applications, we believe that system builders need more
|
||||
control and flexibility.
|
||||
|
||||
Instead, we are less ambitious and seek to build a storage system that
|
||||
provides durable (which often implies transactional) access to the
|
||||
Instead, we are less ambitious and seek to build a flexible
|
||||
transactional storage system that provides durable access to the
|
||||
primitives provided by the underlying hardware. To be of practical
|
||||
value, it must be easy to specialize such a system so that it encodes
|
||||
any of a variety of data models and computational paradigms.
|
||||
|
@ -243,31 +258,32 @@ the description of \yad's design.
|
|||
%we claim that prior work is dissimilar to our own, we refer to
|
||||
%high-level architectural considerations, not low-level details.
|
||||
|
||||
\subsection{Databases as system components}
|
||||
\subsection{Databases as system components}
|
||||
|
||||
A recent survey~\cite{riscDB} enumerates problems that plague users of
|
||||
state-of-the-art database systems. It concludes that efficiently optimizing and
|
||||
consistently servicing large declarative queries is inherently
|
||||
difficult.
|
||||
state-of-the-art database systems. It concludes that efficiently
|
||||
optimizing and consistently servicing large declarative queries is
|
||||
inherently difficult.
|
||||
|
||||
The survey finds that database implementations fail to support the needs of modern systems.
|
||||
In large systems, this manifests itself as manageability and tuning issues that
|
||||
prevent databases from effectively servicing large-scale, diverse, interactive
|
||||
workloads.
|
||||
On smaller systems, footprint, predictable performance, and power
|
||||
consumption are primary concerns, that are not addressed by full-fledged database systems.
|
||||
Database applications that must scale up to large numbers of independent, self-administering desktop
|
||||
installations will be problematic unless a number of open research problems are solved.
|
||||
The survey finds that database implementations fail to support the
|
||||
needs of modern systems. In large systems, this manifests itself as
|
||||
manageability and tuning issues that prevent databases from effectively
|
||||
servicing large-scale, diverse, interactive workloads. On smaller
|
||||
systems, footprint, predictable performance, and power consumption are
|
||||
primary concerns that remain troublesome.
|
||||
%Database applications that must scale up to large numbers of
|
||||
%independent, self-administering desktop installations will be
|
||||
%problematic unless a number of open research problems are solved.
|
||||
|
||||
The survey also provides evidence that SQL itself is problematic.
|
||||
While SQL serves some classes of applications well, it is
|
||||
Although SQL serves some classes of applications well, it is
|
||||
often inadequate for algorithmic and hierarchical computing tasks.
|
||||
|
||||
Finally, complete, modern database
|
||||
implementations are often incomprehensible, and border on
|
||||
irreproducible, hindering further research. After making these
|
||||
points, the study concludes by suggesting the adoption of ``RISC''
|
||||
style database architectures, both as a research and as an
|
||||
style database architectures, both as a research and an
|
||||
implementation tool~\cite{riscDB}.
|
||||
|
||||
%For example, large scale application such as web search, map services,
|
||||
|
@ -295,14 +311,14 @@ and {\em physical database models}~\cite{batoryPhysical}.
|
|||
|
||||
Conceptual-to-internal mappings and physical database models were
|
||||
discovered during an early survey of database implementations. Mappings
|
||||
desribe the computational primitives upon which client applications must
|
||||
describe the computational primitives upon which client applications must
|
||||
be implemented. Physical database models define the on-disk layout used
|
||||
by a system in terms of data layouts and representations that are commonly
|
||||
used by relational and navigational database implementations.
|
||||
|
||||
Both concepts are fundamentally incompatible with a general storage
|
||||
implementation. By definition, a database server encodes both
|
||||
concepts, while transaction processing libraries mange to avoid
|
||||
concepts, while transaction processing libraries manage to avoid
|
||||
conceptual mappings. \yad's novelty stems from the fact that it avoids
|
||||
both concepts, while making it easy for applications to incorporate results from the database
|
||||
literature.
|
||||
|
@ -341,7 +357,7 @@ Our work's primary concern is to support systems beyond database
|
|||
implementations. If we were to follow the database toolkit approach,
|
||||
we would proceed by developing a framework that adequately encodes the
|
||||
set of all abstract data types and all algorithms that system software
|
||||
designers make use of. Finally, we would describe a framework that is
|
||||
designers require. Finally, we would describe a framework that is
|
||||
capable of encoding all conceivable system software designs, and
|
||||
encode standard, interchangeable interfaces to each type of component
|
||||
in our framework.
|
||||
|
@ -351,9 +367,8 @@ absurd. However, similar approaches have been extremely successful
|
|||
for well-understood, well-defined classes of applications. In
|
||||
particular, it has been highly successful in the design of systems
|
||||
that perform limited types of computations over particular classes of
|
||||
data. Much of the database literature is based upon this idea, and
|
||||
continues to successfully improve the state of computer science, and
|
||||
is the basis of the highly successful database industry.
|
||||
data. Much of the database literature is based upon this idea, as is the
|
||||
highly successful database industry.
|
||||
|
||||
Clearly, however, this approach is inappropriate for the design of
|
||||
general-purpose components for system developers, or for applications
|
||||
|
@ -366,7 +381,7 @@ Instead, it leaves decisions regarding abstract data types and
|
|||
algorithm design to system developers or language designers. For
|
||||
instance, while \yad has no concept of object-oriented data types, two
|
||||
radically different approaches toward object persistence have been
|
||||
implemented on top of it.~\ref{oasys}
|
||||
implemented on top of it~\ref{oasys}.
|
||||
|
||||
We could have just as easily written a persistence mechanism for a
|
||||
functional programming language, or a particular application (such as
|
||||
|
@ -391,7 +406,7 @@ applications to build mappings on top of the primitives it provides,
|
|||
physical database model. Therefore, while Berkeley DB could be implemented on top
|
||||
of \yad, Berkeley DB cannot support the primitives provided by \yad.
|
||||
|
||||
Genesis,~\cite{genesis} an early database toolkit, was built in terms
|
||||
Genesis~\cite{genesis}, an early database toolkit, was built in terms
|
||||
of interchangeable primitives that implemented the interfaces of an
|
||||
early database implementation model. It built upon the idea of
|
||||
conceptual mappings described above, and the physical database model
|
||||
|
@ -407,10 +422,10 @@ Subsequent database toolkit work builds upon these foundations,
|
|||
Exodus~\cite{exodus} and Starburst~\cite{starburst} are notable
|
||||
examples, and incorporated a number of ideas that will be referred to
|
||||
later in this paper. Although further discussion is beyond the scope
|
||||
of this paper, object oriented database systems, and relational
|
||||
databases with support for user definable abstract data types (such as
|
||||
of this paper, object-oriented database systems, and relational
|
||||
databases with support for user-definable abstract data types (such as
|
||||
in Postgres~\cite{postgres}) were the primary competitors to these
|
||||
database toolkits work.
|
||||
database toolkits.
|
||||
|
||||
Fundamentally, all of these systems allowed users to quickly define
|
||||
new DBMS software by defining some abstract data types and often index
|
||||
|
@ -441,7 +456,7 @@ atomically applicable changes to the page file. Of course,
|
|||
applications that wish to reason in terms of linksets and simple files
|
||||
are free to do so.
|
||||
|
||||
We reget forcing applications to arrange for updates to be atomic, but
|
||||
We regret forcing applications to arrange for updates to be atomic, but
|
||||
this restriction is fundamental if we wish to support concurrent
|
||||
transactions, durability and recovery using conventional hardware
|
||||
systems. In Section~\ref{nestedTopActions} we explain how a set of
|
||||
|
@ -459,8 +474,8 @@ platform, and to address issues that affect modern
|
|||
databases, such as automatic performance tuning, and reducing the
|
||||
effort required to implement a new database system~\cite{riscDB}.
|
||||
|
||||
While we agree with the motivations behind RISC databases, instead of
|
||||
building a modular database, we seek to build a module that allows
|
||||
Although we agree with the motivations behind RISC databases, instead of
|
||||
building a modular database, we seek to build a system that allows
|
||||
programmers to avoid databases.
|
||||
|
||||
|
||||
|
@ -468,12 +483,12 @@ programmers to avoid databases.
|
|||
|
||||
Berkeley DB is a highly successful alternative to conventional
|
||||
database design. At its core, it provides the physical database, or
|
||||
relational storage system of a conventional database server.
|
||||
the relational storage system of a conventional database server.
|
||||
|
||||
This module focuses on providing fully transactional data storage with
|
||||
B-Tree and hashtable-based indexes. Berkeley DB also provides some
|
||||
support for application-specific access methods, as did Genesis, and
|
||||
the database toolkits that succeeded it.~\cite{libtp} Finally,
|
||||
the database toolkits that succeeded it~\cite{libtp}. Finally,
|
||||
Berkeley DB allows applications that need to modify the recovery
|
||||
semantics of Berkeley DB, or otherwise tweak the way its
|
||||
write-ahead-logging protocol works, to pass flags via its API.
|
||||
|
|