diff --git a/doc/paper3/LLADD.tex b/doc/paper3/LLADD.tex index d40ef77..1916dab 100644 --- a/doc/paper3/LLADD.tex +++ b/doc/paper3/LLADD.tex @@ -25,7 +25,7 @@ % TARDIS: Atomic, Recoverable, Datamodel Independent Storage % EAB: flex, basis, stable, dura -\newcommand{\yad}{Lemon\xspace} +\newcommand{\yad}{Stasys\xspace} \newcommand{\oasys}{Oasys\xspace} \newcommand{\eab}[1]{\textcolor{red}{\bf EAB: #1}} @@ -59,9 +59,9 @@ UC Berkeley %\thispagestyle{empty} -\subsection*{Abstract} +%\subsection*{Abstract} -The is an increasing need to manage data well in a wide variety of +{\em There is an increasing need to manage data well in a wide variety of systems, including robust support for atomic durable concurrent transactions. Databases provide the default solution, but force applications to interact via SQL and to forfeit control over data @@ -69,7 +69,7 @@ layout and access mechanisms. We argue there is a gap between DBMSs and file sy \yad is a storage framework that incorporates ideas from traditional write-ahead-logging storage algorithms and file systems, -while providing applications with flexible control over data structure, layout and performance vs. robustness tradeoffs. +while providing applications with flexible control over data structures, layout, and performance vs. robustness tradeoffs. % increased control over their %underlying modules. Generic transactional storage systems such as SQL %and BerkeleyDB serve many applications well, but impose constraints @@ -90,9 +90,13 @@ improved performance to applications. We present examples that make use of custom access methods, modifed buffer manager semantics, direct log file manipulation, and -LSN-free pages that facilitate zero-copy optimizations, and discusses +LSN-free pages that facilitate zero-copy optimizations, and discuss the composability of these extensions. 
+\eab{performance} + +} + %We argue that our ability to support such a diverse range of %transactional systems stems directly from our rejection of %assumptions made by early database designers. These assumptions @@ -113,13 +117,14 @@ the composability of these extensions. %existing systems. + \section{Introduction} As our reliance on computing infrastructure has increased, the need for robust data management has increased greatly, as has the range of applications and systems that need it. Traditionally, data management -has been the province of database management systems, which although -well-suited to enterprise applications, leads to poor support for a +has been the province of database management systems (DBMSs), which although +well-suited to enterprise applications, lead to poor support for a wide-range systems including grid and scientific computing, bioinformatics, search engines, version control, and workflow applications. These applications need transactions but don't fit well @@ -132,13 +137,15 @@ A typical example of this mismatch is in the support for persistent objects in Java, called {\em Enterprise Java Beans} (EJB). In a typical usage, an array of objects is made persistent by mapping each object to a row in a table (or sometimes multiple -tables~\cite[xxx]) and then issuing queries to keep the objects and +tables~\cite{xxx}) and then issuing queries to keep the objects and rows consistent. A typical update must confirm it has the current version, modify the object, write out a serialized version using the SQL update command and commit. This is an awkward and slow mechanism; we show up to a 5x speedup over a MySQL implementation that is optimized for single-threaded, local access (Section XXX). +Add bioinformatics = Perl + files example? 
+ \eat{ Examples of real world systems that currently fall into this category are web search engines, document repositories, large-scale web-email @@ -146,7 +153,6 @@ services, map and trip planning services, ticket reservation systems, photo and video repositories, bioinformatics, version control systems, workflow applications, CAD/VLSI applications and directory services. - In short, we believe that a fundamental architectural shift in transactional storage is necessary before general purpose storage systems are of practical use to modern applications. @@ -178,15 +184,11 @@ This paper presents \yad, a library that provides transactional storage at a level of abstraction as close to the hardware as possible. The library can support special purpose, transactional storage interfaces as well as ACID database-style interfaces to -abstract data models. - -Notably, \yad incorporates many existing technologies from the storage -communities, and allows applications to incorporate appropriate -subsystems as necessary. A partial open-source implementation of the -ideas presented below is available; performance numbers are provided -when possible. - -Taken from sosp: +abstract data models. \yad incorporates techniques from databases +(e.g. write-ahead logging) and systems (e.g. zero-copy techniques). +Our goal is to combine the flexibility and layering of low-level +abstractions typical of systems work with the complete semantics +that exemplify the database field. By {\em flexible} we mean that \yad{} can implement a wide range of transactional data structures, that it can support a variety @@ -206,18 +208,17 @@ to meet and form the {\em raison d'\^etre} for \yad{}: the framework delivers these properties as reusable building blocks for systems to implement complete transactions.
---- +Through examples, and their good performance, we show how \yad{} +supports a wide range of uses that fall in the database gap, including +persistent objects (roadmap?), graph or XML apps, and recoverable +virtual memory~\cite{lrvm}. An (early) open-source implementation of +the ideas presented below is available. -\eab{need to talk about positive examples: LRVM, Berk DB, windows registry? Grid FS from Wisconsin} +\eab{others? CVS, windows registry, berk DB, Grid FS?} + +roadmap? -Applications that have only recently begun to make use of high-level -database features include XML based systems, object persistance -mechanisms, and enterprise management systems (notably, SAP R/3). - - -**We've explained why the sky is falling. Now, explain why \yad is -so good. (Take ideas from old paper.)** \section{\yad is not a Database} @@ -229,8 +230,8 @@ database systems and research projects for at least 25 years. The section concludes with a discussion of database systems that attempt to address these problems. Although these systems were -successful in many respects, they failed to address the broad class of -software we are interested in. +successful in many respects, they fundamentally aim to implement a +data model, rather than build transactions from the bottom up. \eab{move this?} \subsection{The database abstraction} @@ -240,42 +241,40 @@ abstractions they present. For instance, relational database systems implement the relational model~\cite{cobb}, object oriented databases implement object abstractions, XML databases implement hierarchical datasets, and so on. Before the relational model, -navigational databases implemented pointer and record -based data models. +navigational databases implemented pointer- and record-based data models. An early survey of database implementations sought to enumerate the fundamental components used by database system implementors. This survey was performed due to difficulties in extending database systems into new application domains.
The survey divided internal database -routines into two broad modules: conceptual -mappings~\cite{batoryConceptual} and the physical -database~\cite{batoryPhysical} model. +routines into two broad modules: {\em conceptual +mappings}~\cite{batoryConceptual} and the {\em physical +database}~\cite{batoryPhysical} model. A conceptual mapping might translate a relation into a set of keyed -tuples. A physical model could then translate a set of tuples into an +tuples. A physical model would then translate a set of tuples into an on-disk B-Tree, and provide support for iterators and range-based query operations. It is the responsibility of a database implementor to choose a set of -conceptual mappings that implement the desired higher level +conceptual mappings that implement the desired higher-level abstraction (such as the relational model). The physical data model is chosen to efficiently support the set of mappings that are built on top of it. -{\em The key observation of this paper is that no known physical data model +{\em A key observation of this paper is that no known physical data model can support more than a small percentage of today's applications.} Instead of attempting to create such a model after decades of database research has failed to produce one, we opt to provide a transactional storage model that mimics the primitives provided by modern hardware. This makes it easy for system designers to implement most of the data -models that the underlying hardware is capable of supporting, or to -abandon the database approach entirely, and forgo the use of a +models that the underlying hardware can support, or to +abandon the data model approach entirely, and forgo the use of a structured physical model or conceptual mappings. \subsection{Extensible databases} - Genesis~\cite{genesis}, an early database toolkit, was built in terms of a physical data model, and the conceptual mappings desribed above. 
It was designed allow database implementors to easily swap out @@ -284,11 +283,13 @@ Like subsequent systems (including \yad), it allowed it users to implement custom operations. Subsequent extensible database work builds upon these foundations. -The Exodus~\cite{exodus} database toolkit was the successor to +For example, the Exodus~\cite{exodus} database toolkit was the successor to Genesis. It supported the autmatic generation of query optimizers and execution engines based upon abstract data type definitions, access methods and cost models provided by its users. +\eab{move this next paragraph to RW?} + Starburst's~\cite{starburst} physical data model consisted of {\em storage methods}. Storage methods supported {\em attachment types} that allowed triggers and active databases to be implemented. An @@ -304,7 +305,7 @@ object-oriented database systems, and relational databases with support for user-definable abstract data types (such as in Postgres~\cite{postgres}) were the primary competitors to extensible database toolkits. Ideas from all of these systems have been -incorporated into the mechanisms that support user definable types in +incorporated into the mechanisms that support user-definable types in current database systems. One can characterise the difference between database toolkits and @@ -312,16 +313,12 @@ extensible database servers in terms of early and late binding. With a database toolkit, new types are defined when the database server is compiled. In today's object-relational database systems, new types are defined at runtime. Each approach has its advantages. However, -both types of systems attempted to provide similar levels of -abstraction and flexibility to their end users. - -Therefore, the database toolkit approach is inappropriate for -applications not well serviced by modern database systems. 
+both types of systems aim to extend a high-level data model with new abstract data types, and thus are quite limited in the range of new applications they support. Not surprisingly, this kind of extensibility has had little impact on the range of applications we listed above. \subsection{Berkeley DB} System R was the first relational database implementation, and was -based upon a clean separation between it's storage system and its +based upon a clean separation between its storage system and its query processing engine. In fact, it supported a simple navigational interface to the storage subsystem. To this day, database systems are built using this sort of architecture. @@ -342,48 +339,36 @@ primitives. We have already discussed the limitations of this approach. With the exception of the direct comparison of the two systems, none of the \yad applications presented in Section~\ref{extensions} are efficiently -supported by Berkeley DB. This is a result of Berkeley DB's, +supported by Berkeley DB. This is a result of Berkeley DB's assumptions regarding workloads and decisions regarding low level data -representation. While Berkeley DB could be built on top of \yad, +representation. Thus, although Berkeley DB could be built on top of \yad, Berkeley DB is too specialized to support \yad. -\subsection{Boxwood} +\eab{for BDB, should we say that it still has a data model?} -The Boxwood system provides a networked, fault-tolerant transactional -B-Tree and ``Chunk Manager.'' We believe that \yad is an interesting -complement to such a system, especially given \yad's focus on -intelligence and optimizations within a single node, and Boxwoods -focus on multiple node systems. In particular, when implementing -applications with predictable locality properties, it would be -interesting to explore extensions to the Boxwood approach that make -use of \yad's customizable semantics (Section~\ref{wal}), and fully logical logging -mechanism. 
(Section~\ref{logging}) %cover P2 (the old one, not "Pier 2" if there is time... \subsection{Better databases} +The database community is also aware of this gap. A recent survey~\cite{riscDB} enumerates problems that plague users of -state-of-the-art database systems. - -The survey finds that database implementations fail to support the +state-of-the-art database systems, and finds that database implementations fail to support the needs of modern systems. In large systems, this manifests itself as managability and tuning issues that prevent databases from predictably servicing diverse, large scale, declartive, workloads. - On small devices, footprint, predictable performance, and power consumption are primary, concerns that database systems do not address. -Midsize deployments, such as desktop installations, must run without -user intervention, but self-tuning, self-administering database -servers are still an area of active research. +%Midsize deployments, such as desktop installations, must run without +%user intervention, but self-tuning, self-administering database +%servers are still an area of active research. The survey argues that these problems cannot be adequately addressed without a fundamental shift in the architectures that underly database systems. Complete, modern database implementations are generally incomprehensible and irreproducable, hindering further research. The study concludes -by suggesting the adoption of ``RISC'' -style database architectures, both as a research and as an +by suggesting the adoption of ``RISC''-style database architectures, both as a research and an implementation tool~\cite{riscDB}. RISC databases have many elements in common with @@ -398,13 +383,12 @@ effort required to implement a new database system~\cite{riscDB}. We agree with the motivations behind RISC databases, and that a need for improvement in database technology exists. 
In fact, is our hope that our system will mature to the point where it can support -competitive relational database storage subsystems. However this is +a competitive relational database. However, this is not our primary goal. - Instead, we are interested in supporting applications that derive little benefit from database abstractions, but that need reliable storage. Therefore, instead of building a modular database, we seek -to build a system that allows programmers to avoid databases. +to build a system that enables a wider range of data management options. %For example, large scale application such as web search, map services, %e-mail use databases to store unstructured binary data, if at all. @@ -983,10 +967,25 @@ concurrent, durable data structure using RVM. We plan to add RVM style transactional memory to \yad in a way that is compatible with fully concurrent collections such as hash tables and tree structures. + +\section{Related Work?} + +The Boxwood system provides a networked, fault-tolerant transactional +B-Tree and ``Chunk Manager.'' We believe that \yad is an interesting +complement to such a system, especially given \yad's focus on +intelligence and optimizations within a single node, and Boxwood's +focus on multiple node systems. In particular, when implementing +applications with predictable locality properties, it would be +interesting to explore extensions to the Boxwood approach that make +use of \yad's customizable semantics (Section~\ref{wal}) and fully logical logging +mechanism (Section~\ref{logging}). + \section{Conclusion} \section{Acknowledgements} +mike demmer, others? + \section{Availability} Additional information, and \yad's source code is available at: