This section uses:
\begin{enumerate}
\item{Custom recovery and checkpointing semantics to maintain correctness}
\end{enumerate}

\section{Graph Traversal\label{TransClos}}

Database servers (and most transactional storage systems) are not
designed to handle large graph structures well. Typically, each edge

it is simply set to a constant value by a graph traversal.)

We implement a ``naive'' graph traversal algorithm that uses
depth-first search to find all nodes that are reachable from node zero.
This algorithm (predictably) consumes a large amount of memory, as
nothing stops it from placing the entire graph upon its stack.

For the purposes of this section, which focuses on page access
locality, we ignore the memory used to store stacks and worklists,
as it can vary greatly from application to application, but we note
that the memory utilization of the simple depth-first search algorithm
is certainly no better than that of the algorithm presented in the
next section.

Also, for simplicity, we do not apply any of the optimizations in
Section~\ref{OASYS}. This allows our performance comparison to
cleanly measure the optimization presented here.

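As a concrete sketch of the naive algorithm (in Python rather than \yad's implementation language, and with a hypothetical in-memory adjacency map `edges` standing in for the on-disk graph representation):

```python
def naive_traverse(edges, root=0):
    """Depth-first search from `root`; returns the set of reachable nodes.

    Nothing bounds the size of `stack`, so in the worst case most of the
    graph ends up on it -- the memory blow-up noted above.
    """
    visited = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        # Push every out-edge target; already-visited nodes are filtered on pop.
        stack.extend(edges[node])
    return visited
```

Note that this sketch follows each edge the moment it is popped, with no regard for which page the target node lives on; that is precisely the access pattern the next subsection reorders.
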
\subsection{Request Reordering for Locality}

General graph structures may have no intrinsic locality. If such a
graph is too large to fit into memory, basic graph operations such as
edge traversal become very expensive, which makes many algorithms over
these structures intractable in practice. In this section, we
describe how \yad's primitives provide a natural way to introduce
physical locality into a sequence of such requests. These primitives
are general and support a wide class of optimizations, which we discuss
before presenting benchmarking results for a sample graph traversal
algorithm.

\yad's wrapper functions translate high-level (logical) application
requests into lower-level (physiological) log entries. These
physiological log entries generally include a logical undo
(Section~\ref{nested-top-actions}), which simply invokes the logical
inverse of the application request. Since the logical inverse of most
application requests is another application request, we can {\em reuse} our
logging format and wrapper functions to implement a purely logical log.

For our graph traversal algorithm we use a {\em log multiplexer} to
route entries from a single log into many sub-logs according to the
page that each entry refers to. This is easy to do with the Array
List representation that we chose for our graph, since Array List
provides a publicly accessible function that maps from an array offset
to a $(page, slot, size)$ triple.

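The multiplexing step can be sketched as follows. Here `offset_to_triple` is a stand-in for the Array List function that maps an array offset to a $(page, slot, size)$ triple, and `route` is the multiplexing policy that picks a sub-log from the page number; both names are illustrative, not \yad API names.

```python
from collections import defaultdict

def multiplex(entries, offset_to_triple, route):
    """Split a single stream of log entries (array offsets here) into
    sub-logs keyed by the routing policy's output."""
    sublogs = defaultdict(list)
    for entry in entries:
        page, slot, size = offset_to_triple(entry)
        sublogs[route(page)].append(entry)
    return sublogs
```

For example, with four slots per page and a policy that routes on page number modulo two, offsets on even pages and odd pages land in different sub-logs.
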
The logical log allows us to insert log entries that are independent
of the physical location of the data they refer to. However, we are
interested in exploiting the commutativity of the graph traversal
operation, and saving the logical offset would not provide us with any
obvious benefit. Therefore, we place physical node addresses into the
in-memory log.

We considered a number of multiplexing policies and present two
particularly interesting ones here. The first divides the page file
into equally sized contiguous regions. The second takes the hash
of the page's offset in the file. The second policy is interesting
because it reduces the effect of locality (or lack thereof) between
the pages that store the graph, while the first better exploits any
locality intrinsic to the graph's layout on disk.

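The two policies amount to two routing functions over page numbers. The parameter names below (`pages_per_region`, `num_queues`) are illustrative, not names from \yad:

```python
def region_policy(page, pages_per_region):
    """First policy: contiguous, equally sized regions of the page file.
    Adjacent pages map to the same sub-log, preserving on-disk locality."""
    return page // pages_per_region

def hash_policy(page, num_queues):
    """Second policy: hash of the page's offset.  Adjacent pages scatter
    across sub-logs, dampening the effect of the graph's disk layout."""
    return hash(page) % num_queues
```

With four pages per region, pages 0--3 all route to sub-log 0 under the first policy, while the second spreads them across the queues.
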
Requests are continuously consumed by a process that empties each of
the multiplexer's output queues one at a time. Instead of following
graph edges immediately, the targets of edges leaving each node are
simply pushed into the multiplexer's input queue. The number of
multiplexer output queues is chosen so that each queue addresses a
subset of the page file that can fit into cache. Therefore, if a
queue contains enough entries, then its locality set will be swapped
in, allowing requests to be serviced from cache. When the
multiplexer's queues contain no more entries, the traversal is
complete.

Although this algorithm may seem complex, it is essentially just a
queue-based breadth-first search implementation, except that the
queue reorders requests in a way that attempts to establish and
maintain disk locality.

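The reordered traversal described above can be sketched as follows, again with a hypothetical adjacency map `edges` and a `node_page` function standing in for the Array List offset-to-page mapping:

```python
from collections import deque

def reordered_traverse(edges, node_page, num_queues, root=0):
    """Queue-based traversal that drains one output queue at a time.

    Queue i holds pending requests for the i-th subset of the page file,
    sized (by choosing num_queues) to fit into cache.  Edges are never
    followed immediately; their targets are re-enqueued by page, so a
    whole locality set is serviced before moving on.
    """
    queues = [deque() for _ in range(num_queues)]
    visited = set()
    queues[node_page(root) % num_queues].append(root)
    while any(queues):
        for q in queues:            # empty each output queue in turn
            while q:
                node = q.popleft()
                if node in visited:
                    continue
                visited.add(node)
                # Route each out-edge target to the queue for its page.
                for target in edges[node]:
                    queues[node_page(target) % num_queues].append(target)
    return visited
```

The traversal terminates once every queue is empty, and it visits exactly the same reachable set as the naive depth-first version; only the order of page accesses changes.
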
\rcs{This belongs in future work....}

The purely logical log has some interesting properties. It can be
{\em page file independent}, as all information within it is
expressed in application terms instead of in terms of internal
representations. This means the log entries could be sent over the
network and applied on a different system, providing a simple
generalization of log-based replication schemes.

While in transit, various transformations could be applied. LRVM's
log merging optimizations~\cite{LRVM} are one such possibility.
Replication and partitioning schemes are another. If a
lock manager is not in use and a consistent order is imposed upon
requests,\footnote{and we assume all failures are recoverable or
masked with redundancy} then we can remove replicas' ability to
unilaterally abort transactions, allowing requests to commit as they
propagate through the network, but before actually being applied to
page files.

%However, most of \yad's current functionality focuses upon the single
%node case, so we decided to choose a single node optimization for this
%section, and leave networked logical logging to future work. To this
%end, we implemented a log multiplexing primative which splits log
%entries into multiple logs according to the value returned by a
%callback function. (Figure~\ref{fig:mux})

\subsection{Performance Evaluation}