*** empty log message ***
parent
eee21ad6fd
commit
39bf19166e
1 changed files with 137 additions and 47 deletions
|
@ -214,32 +214,57 @@ so good. (Take ideas from old paper.)**
|
||||||
Database research has a long history, including the development of
|
Database research has a long history, including the development of
|
||||||
many technologies that our system builds upon. However, we view \yad
|
many technologies that our system builds upon. However, we view \yad
|
||||||
as a rejection of the fundamental assumptions that underlie database
|
as a rejection of the fundamental assumptions that underlie database
|
||||||
systems. Here we will focus on lines of research that are
|
systems. In particular, we reject the idea that a general purpose
|
||||||
superficially similar, but distinct from our own, and cite evidence
|
storage system should attempt to encode universal data models and
|
||||||
from within the database community that highlights problems with
|
computational paradigms.
|
||||||
systems that attempt to incorporate databases into other systems.
|
|
||||||
|
|
||||||
Of course, database systems have a place in modern software
|
Instead, we are less ambitious and seek to build a storage system that
|
||||||
development and design, and are the best available storage solution
|
provides durable (which often implies transactional) access to the
|
||||||
for many classes of applications. Also, this section refers to work
|
primitives provided by the underlying hardware. To be of practical
|
||||||
that introduces technologies that are crucial to \yad's design; when
|
value, it must be easy to specialize such a system so that it encodes
|
||||||
we claim that prior work is dissimilar to our own, we refer to
|
any of a variety of data models and computational paradigms.
|
||||||
high-level architectural considerations, not low-level details.
|
Otherwise, the system could not easily be reused in many environments.
|
||||||
|
We know of no system that adequately achieves these two goals.
|
||||||
|
|
||||||
|
Here, we present a brief history of transactional storage systems, and
|
||||||
|
explain why they fail to achieve \yad's goals. Citations of the
|
||||||
|
technical work upon which our system is based are included below, in
|
||||||
|
the description of \yad's design.
|
||||||
|
|
||||||
|
%Here we will focus on lines of research that are
|
||||||
|
%superficially similar, but distinct from our own, and cite evidence
|
||||||
|
%from within the database community that highlights problems with
|
||||||
|
%systems that attempt to incorporate databases into other systems.
|
||||||
|
|
||||||
|
%Of course, database systems have a place in modern software
|
||||||
|
%development and design, and are the best available storage solution
|
||||||
|
%for many classes of applications. Also, this section refers to work
|
||||||
|
%that introduces technologies that are crucial to \yad's design; when
|
||||||
|
%we claim that prior work is dissimilar to our own, we refer to
|
||||||
|
%high-level architectural considerations, not low-level details.
|
||||||
|
|
||||||
\subsection{Databases as system components}
|
\subsection{Databases as system components}
|
||||||
|
|
||||||
|
A recent survey~\cite{riscDB} enumerates problems that plague users of
|
||||||
A recent survey enumerates problems that plague users of
|
state-of-the-art database systems. It concludes that efficiently optimizing and
|
||||||
state-of-the-art database systems. Efficiently optimizing and
|
|
||||||
consistently servicing large declarative queries is inherently
|
consistently servicing large declarative queries is inherently
|
||||||
difficult. This leads to manageability and tuning issues that
|
difficult.
|
||||||
prevent databases from effectively servicing diverse, interactive
|
|
||||||
workloads. While SQL serves some classes of applications well, it is
|
The survey finds that database implementations fail to scale to modern systems.
|
||||||
|
This leads to manageability and tuning issues that
|
||||||
|
prevent databases from effectively servicing large-scale, diverse, interactive
|
||||||
|
workloads.
|
||||||
|
They are also a poor fit for
|
||||||
|
smaller devices, where footprint, predictable performance, and power
|
||||||
|
consumption are primary concerns.
|
||||||
|
Scaling out to large numbers of self-administering desktop
|
||||||
|
installations will be difficult until a number of open research problems are solved.
|
||||||
|
|
||||||
|
The survey provides evidence that SQL itself is problematic.
|
||||||
|
While SQL serves some classes of applications well, it is
|
||||||
often inadequate for algorithmic and hierarchical computing tasks.
|
often inadequate for algorithmic and hierarchical computing tasks.
|
||||||
|
|
||||||
The survey finds that database implementations are also a poor fit for
|
Finally, complete, modern database
|
||||||
smaller devices, where footprint, predictable performance, and power
|
|
||||||
consumption are primary concerns. Finally, complete, modern database
|
|
||||||
implementations are often incomprehensible, and border on
|
implementations are often incomprehensible, and border on
|
||||||
irreproducible, hindering further research. After making these
|
irreproducible, hindering further research. After making these
|
||||||
points, the study concludes by suggesting the adoption of ``RISC''
|
points, the study concludes by suggesting the adoption of ``RISC''
|
||||||
|
@ -261,40 +286,105 @@ implementation tool~\cite{riscDB}.
|
||||||
%was more difficult than implementing from scratch (winfs), scaling
|
%was more difficult than implementing from scratch (winfs), scaling
|
||||||
%down doesn't work (variance in performance, footprint),
|
%down doesn't work (variance in performance, footprint),
|
||||||
|
|
||||||
\subsection{Database toolkits}
|
\subsection{Database Toolkits}
|
||||||
|
|
||||||
Database toolkits are based upon the idea that database
|
\yad is a library that could be used to provide storage primitives to a
|
||||||
implementations can be broken into smaller components with
|
database server. Therefore, one might suppose that \yad is a database
|
||||||
standardized interfaces. Early work in this field surveyed database
|
toolkit. However, such an assumption would be incorrect. Here we
|
||||||
implementations that existed at the time. It casts components of
|
describe the two characteristics that are the essence of database
|
||||||
these implementations in terms of a physical database
|
toolkits: {\em conceptual-to-internal mappings}~\cite{batoryConceptual}
|
||||||
model~\cite{batoryPhysical} and conceptual-to-internal
|
and {\em physical database models}~\cite{batoryPhysical}.
|
||||||
mappings~\cite{batoryConceptual}. These abstractions describe
|
|
||||||
relational database systems, and describe many aspects of subsequent
|
|
||||||
database toolkit research.
|
|
||||||
|
|
||||||
However, these abstractions are built upon assumptions about
|
Conceptual-to-internal mappings and physical database models were
|
||||||
application structure and data layout. At the time of the survey, ten
|
discovered by an early survey of database implementations. Mappings
|
||||||
|
are essentially a model of computation, while physical database models
|
||||||
|
are essentially a model of data layout and representation.
|
||||||
|
|
||||||
|
Both concepts are fundamentally incompatible with a general storage
|
||||||
|
implementation. By definition, a database server encodes both
|
||||||
|
concepts, while transaction processing libraries manage to avoid
|
||||||
|
conceptual mappings. \yad's novelty stems from the fact that it avoids
|
||||||
|
both concepts, while incorporating results from the database
|
||||||
|
literature.
|
||||||
|
|
||||||
|
|
||||||
|
\subsubsection{Conceptual mappings}
|
||||||
|
|
||||||
|
%Database toolkits are based upon the idea that database
|
||||||
|
%implementations can be broken into smaller components with
|
||||||
|
%standardized interfaces.
|
||||||
|
|
||||||
|
%Early work in this field surveyed database
|
||||||
|
%implementations that existed at the time. It casts components of
|
||||||
|
%these implementations in terms of a physical database
|
||||||
|
%model~\cite{batoryPhysical} and conceptual-to-internal
|
||||||
|
%mappings~\cite{batoryConceptual}. These abstractions describe
|
||||||
|
%relational database systems, and describe many aspects of subsequent
|
||||||
|
%database toolkit research.
|
||||||
|
|
||||||
|
%However, these abstractions are built upon assumptions about
|
||||||
|
%application structure and data layout.
|
||||||
|
|
||||||
|
At the time of their introduction, ten
|
||||||
conceptual-to-internal mappings were sufficient to describe existing
|
conceptual-to-internal mappings were sufficient to describe existing
|
||||||
implementations. These mappings included:
|
database systems. These mappings include indexing, encoding
|
||||||
|
(compression, encryption, etc.), segmentation (along field boundaries),
|
||||||
|
fragmentation (without regard to fields), $n:m$ pointers, and
|
||||||
|
horizontal partitioning, among others.
|
||||||
|
|
||||||
|
The initial survey postulates that a finite number of such mappings
|
||||||
|
are adequate to describe database implementations. A general purpose
|
||||||
|
database toolkit need only implement each type of mapping in order to
|
||||||
|
encode the set of all conceivable database systems.
|
||||||
|
|
||||||
\begin{itemize}
|
To meet our requirements with this approach, one would first develop a
|
||||||
\item indexing
|
framework that adequately encodes the requirements of {\em every}
|
||||||
\item encoding (compression, encryption, etc)
|
system that manipulates data, and would then define interfaces that
|
||||||
\item transposition
|
support the needs of each implementation of the components specified
|
||||||
\item segmentation (along field boundaries)
|
by the framework.
|
||||||
\item fragmentation (without regard to field boundaries)
|
|
||||||
\item pointers with support for $n:m$ relationships
|
|
||||||
\item horizontal partitioning
|
|
||||||
\end{itemize}
|
|
||||||
|
|
||||||
Many data manipulation tasks can be cast as mappings from abstract to
|
Put this way, this goal seems absurd. However, this approach has
|
||||||
more concrete representation, and even cleanly partitioned into more
|
been extremely successful. In fact, much of the
|
||||||
general sets of mappings. In fact, Genesis~\cite{genesis}, an early
|
database literature is devoted to this task and has
|
||||||
database toolkit, was built in terms of interchangeable primitives that
|
certainly improved the state of computer science. Furthermore, it is the basis for
|
||||||
implemented interfaces that correspond to these interfaces.
|
the highly successful database industry.
|
||||||
|
|
||||||
Similarly, the physical database model partitions storage into simple
|
However, from a practical perspective, current database
|
||||||
|
implementations are already among the most complex
|
||||||
|
software systems ever created, and are difficult to understand or
|
||||||
|
reason about. They still encode only a small percentage of
|
||||||
|
the computational and storage primitives in the database
|
||||||
|
literature, which in turn only represents a portion of
|
||||||
|
the computer science literature.
|
||||||
|
|
||||||
|
|
||||||
|
%\begin{itemize}
|
||||||
|
%\item indexing
|
||||||
|
%\item encoding (compression, encryption, etc)
|
||||||
|
%\item transposition
|
||||||
|
%\item segmentation (along field boundaries)
|
||||||
|
%\item fragmentation (without regard to field boundaries)
|
||||||
|
%\item pointers with support for $n:m$ relationships
|
||||||
|
%\item horizontal partitioning
|
||||||
|
%\end{itemize}
|
||||||
|
|
||||||
|
\subsubsection{Physical data models}
|
||||||
|
|
||||||
|
As it was initially tempting to say that \yad was a database toolkit,
|
||||||
|
it may now be tempting to claim that \yad implements a physical
|
||||||
|
database model. In this section, we compare \yad to the physical
|
||||||
|
database model of existing toolkits, and show that it supports a wider
|
||||||
|
range of storage technologies than physical database models. In fact,
|
||||||
|
it has no concept of a physical database model, and intentionally
|
||||||
|
allows applications to avoid such concepts as well.
|
||||||
|
|
||||||
|
Genesis~\cite{genesis}, an early database toolkit, was built in terms
|
||||||
|
of interchangeable primitives that implemented the interfaces of an
|
||||||
|
early database implementation model. It built upon the idea of
|
||||||
|
conceptual mappings described above, and the physical database model
|
||||||
|
described here.
|
||||||
|
|
||||||
|
The physical database model partitions storage into simple
|
||||||
files, which provide operations associated with key-based storage, and
|
files, which provide operations associated with key-based storage, and
|
||||||
linksets, which make use of various pointer storage schemes to provide
|
linksets, which make use of various pointer storage schemes to provide
|
||||||
mappings between records in simple files.
|
mappings between records in simple files.
|
||||||
|
|