Wrote description of graph traversal.

Sears Russell 2005-03-25 03:46:16 +00:00
parent d4534cea23
commit 79e26dda60


@@ -1919,7 +1919,7 @@ This section uses:
\item{Custom recovery and checkpointing semantics to maintain correctness}
\end{enumerate}
\section{Graph Traversal\label{TransClos}}
Database servers (and most transactional storage systems) are not
designed to handle large graph structures well. Typically, each edge
@@ -1950,15 +1950,102 @@ it is simply set to a constant value by a graph traversal.)
We implement a ``naive'' graph traversal algorithm that uses depth
first search to find all nodes that are reachable from node zero.
This algorithm (predictably) consumes a large amount of memory, as
nothing stops it from placing the entire graph upon its stack.

For the purposes of this section, which focuses on page access
locality, we ignore the amount of memory used to store stacks and
worklists, as it can vary greatly from application to application,
but we note that the memory utilization of the simple depth first
search algorithm is certainly no better than that of the algorithm
presented in the next section.

Also, for simplicity, we do not apply any of the optimizations in
Section~\ref{OASYS}. This allows our performance comparison to
cleanly measure the optimization presented here.
\subsection {Request Reordering for Locality}
General graph structures may have no intrinsic locality. If such a
graph is too large to fit into memory, basic graph operations such as
edge traversal become very expensive, which makes many algorithms over
these structures intractable in practice. In this section, we
describe how \yad's primitives provide a natural way to introduce
physical locality into a sequence of such requests. These primitives
are general and support a wide class of optimizations, which we discuss
before presenting benchmarking results for a sample graph traversal
algorithm.
\yad's wrapper functions translate high level (logical) application
requests into lower level (physiological) log entries. These
physiological log entries generally include a logical undo
(Section~\ref{nested-top-actions}), which simply invokes the logical
inverse of the application request. Since the logical inverse of most
application requests is another application request, we can {\em reuse} our
logging format and wrapper functions to implement a purely logical log.
For our graph traversal algorithm we use a {\em log multiplexer} to
route entries from a single log into many sub-logs according to the
page that each entry refers to. This is easy to do with the Array
List representation that we chose for our graph, since Array List
provides a publicly accessible function that maps from array offset to
a $(page, slot, size)$ triple.
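To make the routing concrete, the multiplexer callback might look like the following C sketch. The names (\texttt{array\_list\_lookup}, \texttt{mux\_route}) and the fixed records-per-page layout are illustrative assumptions; the text only says that Array List exposes a public offset-to-$(page, slot, size)$ mapping.

```c
/* Illustrative stand-in for the (page, slot, size) triple; the real
   \yad / Array List type and function names are not given in the text. */
typedef struct { int page; int slot; int size; } recordid;

enum { RECORDS_PER_PAGE = 256, NUM_QUEUES = 8 };

/* Map an array offset to its physical location, assuming fixed-size
   records packed RECORDS_PER_PAGE to a page. */
static recordid array_list_lookup(int offset) {
    recordid rid;
    rid.page = offset / RECORDS_PER_PAGE;
    rid.slot = offset % RECORDS_PER_PAGE;
    rid.size = (int)sizeof(int);
    return rid;
}

/* Multiplexer routing callback: choose a sub-log (output queue) from
   the page that the log entry refers to. */
static int mux_route(int offset) {
    return array_list_lookup(offset).page % NUM_QUEUES;
}
```

Any deterministic function of the page number works here; the choice of function is exactly the multiplexing policy discussed below.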
The logical log allows us to insert log entries that are independent
of the physical location of the data they refer to. However, we are
interested in exploiting the commutativity of the graph traversal
operation, and saving the logical offset would not provide us with any
obvious benefit. Therefore, we place physical node addresses into the
in-memory log.
We considered a number of multiplexing policies and present two
particularly interesting ones here. The first divides the page file
up into equally sized contiguous regions. The second takes the hash
of the page's offset in the file. The second policy is interesting
because it reduces the effect of locality (or lack thereof) between
the pages that store the graph, while the first better exploits any
locality intrinsic to the graph's layout on disk.
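The two policies can be sketched as follows; the region size, queue count, and hash function are illustrative assumptions, not \yad's actual constants.

```c
/* Policy 1: divide the page file into equally sized contiguous
   regions; neighboring pages share a queue, so any locality in the
   graph's on-disk layout is preserved. */
static int region_policy(int page, int pages_per_region) {
    return page / pages_per_region;
}

/* Policy 2: hash the page's offset; neighboring pages scatter across
   queues, reducing the effect of (good or bad) locality among the
   pages that store the graph. The multiplicative constant is an
   illustrative choice (Knuth-style hash), not \yad's. */
static int hash_policy(int page, int num_queues) {
    unsigned h = (unsigned)page * 2654435761u;
    return (int)(h % (unsigned)num_queues);
}
```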
Requests are continuously consumed by a process that empties each of
the multiplexer's output queues one at a time. Instead of following
graph edges immediately, the targets of edges leaving each node are
simply pushed into the multiplexer's input queue. The number of
multiplexer output queues is chosen so that each queue addresses a
subset of the page file that can fit into cache. Therefore, if the
queue contains enough entries then its locality set will be swapped
in, allowing requests to be serviced from cache. When the
multiplexer's queues contain no more entries, the traversal is
complete.
Although this algorithm may seem complex, it is essentially just a
queue-based breadth first search implementation, except that the
queue reorders requests in a way that attempts to establish and
maintain disk locality.
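In outline, the reordered traversal looks like the following C sketch. The in-memory adjacency arrays, fixed queue sizes, and page-to-queue mapping are all simplifying assumptions for illustration; the real implementation issues requests against \yad's page file rather than arrays.

```c
enum { MAX_NODES = 1024, MAX_DEGREE = 4,
       RECORDS_PER_PAGE = 16, NUM_QUEUES = 4,
       MAX_QUEUE = 4 * MAX_NODES };

/* Toy adjacency lists standing in for nodes stored in the page file. */
static int edges[MAX_NODES][MAX_DEGREE];
static int degree[MAX_NODES];

static int queues[NUM_QUEUES][MAX_QUEUE];  /* multiplexer output queues */
static int qlen[NUM_QUEUES];
static int visited[MAX_NODES];

/* Instead of following an edge immediately, push its target into the
   queue for the page (hence cache-sized page-file subset) it lives on. */
static void push(int node) {
    int q = (node / RECORDS_PER_PAGE) % NUM_QUEUES;
    queues[q][qlen[q]++] = node;
}

/* Drain one queue at a time; every request popped from a queue touches
   the same locality set, so it can be serviced from cache. */
static int traverse(int root) {
    int reached = 0;
    push(root);
    for (;;) {
        int q, progress = 0;
        for (q = 0; q < NUM_QUEUES; q++) {
            while (qlen[q] > 0) {
                int node = queues[q][--qlen[q]];
                progress = 1;
                if (visited[node]) continue;
                visited[node] = 1;
                reached++;
                for (int e = 0; e < degree[node]; e++)
                    push(edges[node][e]);
            }
        }
        if (!progress) break;  /* all queues empty: traversal complete */
    }
    return reached;
}
```

Note that draining a queue may refill earlier queues, so the outer loop repeats until a full pass finds every queue empty.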
\rcs{ This belongs in future work....}
The purely logical log has some interesting properties. It can be
{\em page file independent}, as all information within it is
expressed in application terms instead of in terms of internal
representations. This means the log entries could be sent over the
network and applied on a different system, providing a simple
generalization of log based replication schemes.
While in transit, various transformations could be applied. LRVM's
log merging optimizations~\cite{LRVM} are one such possibility.
Replication and partitioning schemes are another possibility. If a
lock manager is not in use and a consistent order is imposed upon
requests,\footnote{and we assume all failures are recoverable or
masked with redundancy} then we can remove replicas' ability to
unilaterally abort transactions, allowing requests to commit as they
propagate through the network, but before actually being applied to
page files.
%However, most of \yad's current functionality focuses upon the single
%node case, so we decided to choose a single node optimization for this
%section, and leave networked logical logging to future work. To this
%end, we implemented a log multiplexing primitive which splits log
%entries into multiple logs according to the value returned by a
%callback function. (Figure~\ref{fig:mux})
\subsection {Performance Evaluation}