From 79e26dda608c65da07dc0ce33a9af54b00a7721f Mon Sep 17 00:00:00 2001 From: Sears Russell Date: Fri, 25 Mar 2005 03:46:16 +0000 Subject: [PATCH] Wrote description of graph traversal.. --- doc/paper2/LLADD.tex | 101 ++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 94 insertions(+), 7 deletions(-) diff --git a/doc/paper2/LLADD.tex b/doc/paper2/LLADD.tex index 5f5d57c..bdd7611 100644 --- a/doc/paper2/LLADD.tex +++ b/doc/paper2/LLADD.tex @@ -1919,7 +1919,7 @@ This section uses: \item{Custom recovery and checkpointing semantics to maintain correctness} \end{enumerate} -\section{Transitive closure\label{TransClos}} +\section{Graph Traversal\label{TransClos}} Database servers (and most transactional storage systems) are not designed to handle large graph structures well. Typically, each edge @@ -1950,15 +1950,102 @@ it is simply a set to a constant value by a graph traversal.) We implement a ``naive'' graph traversal algorithm that uses depth first search to find all nodes that are reachable from node zero. This algorithm (predictably) consumes a large amount of memory, as -nothing stops it from placing the entire graph upon its stack. For -the purposes of this section, which focuses on page access locality, -we ignore the amount of memory utlization used to store stacks and -worklists, as they can vary greatly from application to application, -but we note that the simple depth first search algorithm is at least -as bad we +nothing stops it from placing the entire graph upon its stack. + +For the purposes of this section, which focuses on page access +locality, we ignore the amount of memory used to store +stacks and worklists, as it can vary greatly from application to +application, but we note that the memory utilization of the simple +depth first search algorithm is certainly no better than that of the +algorithm presented in the next section. + +Also, for simplicity, we do not apply any of the optimizations in +Section~\ref{OASYS}.
This allows our performance comparison to +cleanly measure the optimization presented here. \subsection {Request Reordering for Locality} +General graph structures may have no intrinsic locality. If such a +graph is too large to fit into memory, basic graph operations such as +edge traversal become very expensive, which makes many algorithms over +these structures intractable in practice. In this section, we +describe how \yad's primitives provide a natural way to introduce +physical locality into a sequence of such requests. These primitives +are general and support a wide class of optimizations, which we discuss +before presenting benchmarking results for a sample graph traversal +algorithm. + +\yad's wrapper functions translate high level (logical) application +requests into lower level (physiological) log entries. These +physiological log entries generally include a logical undo +(Section~\ref{nested-top-actions}), which simply invokes the logical +inverse of the application request. Since the logical inverse of most +application requests is another application request, we can {\em reuse} our +logging format and wrapper functions to implement a purely logical log. + +For our graph traversal algorithm we use a {\em log multiplexer} to +route entries from a single log into many sub-logs according to the +page that each entry refers to. This is easy to do with the Array +List representation that we chose for our graph, since Array List +provides a publicly accessible function that maps from array offset to +a $(page, slot, size)$ triple. + +The logical log allows us to insert log entries that are independent +of the physical location of the data they refer to. However, we are +interested in exploiting the commutativity of the graph traversal +operation, and saving the logical offset would not provide us with any +obvious benefit. Therefore, we place physical node addresses into the +in-memory log.
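As a rough illustration of the multiplexer, the C sketch below routes logical log entries into sub-logs keyed by the page each entry touches. Every name here (`log_entry`, `multiplex`, `region_policy`) is hypothetical, and the 256-page region size is an arbitrary assumption; \yad's actual wrapper and Array List interfaces differ.

```c
#include <assert.h>

/* Hypothetical sketch of a log multiplexer -- not \yad's API.  Each
 * logical log entry carries the (page, slot, size) triple exposed by
 * the Array List, and a policy callback picks the target sub-log. */

#define NUM_SUBLOGS 4
#define MAX_ENTRIES 1024

typedef struct {
    long page;   /* physical page the request touches */
    int  slot;
    int  size;
} log_entry;

typedef struct {
    log_entry entries[MAX_ENTRIES];
    int       count;
} sublog;

static sublog sublogs[NUM_SUBLOGS];

/* Policy callback: maps a page number to a sub-log index. */
typedef int (*mux_policy)(long page);

/* Example policy: contiguous regions of 256 pages share a sub-log,
 * so requests against nearby pages end up batched together. */
static int region_policy(long page) {
    return (int)((page / 256) % NUM_SUBLOGS);
}

/* Route one logical log entry into the sub-log the policy selects. */
static void multiplex(log_entry e, mux_policy policy) {
    sublog *s = &sublogs[policy(e.page)];
    assert(s->count < MAX_ENTRIES);
    s->entries[s->count++] = e;
}
```

Because the policy is just a callback on the page number, swapping in a different partitioning scheme only means passing a different function to `multiplex`.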
+ +We considered a number of multiplexing policies and present two +particularly interesting ones here. The first divides the page file +up into equally sized contiguous regions. The second takes the hash +of the page's offset in the file. The second policy is interesting +because it reduces the effect of locality (or lack thereof) between +the pages that store the graph, while the first better exploits any +locality intrinsic to the graph's layout on disk. + +Requests are continuously consumed by a process that empties each of +the multiplexer's output queues one at a time. Instead of following +graph edges immediately, the targets of edges leaving each node are +simply pushed into the multiplexer's input queue. The number of +multiplexer output queues is chosen so that each queue addresses a +subset of the page file that can fit into cache. Therefore, if a +queue contains enough entries, then its locality set will be swapped +in, allowing requests to be serviced from cache. When the +multiplexer's queues contain no more entries, the traversal is +complete. + +Although this algorithm may seem complex, it is essentially just a +queue-based breadth-first search implementation, except that the +queue reorders requests in a way that attempts to establish and +maintain disk locality. + +\rcs{ This belongs in future work....} + +The purely logical log has some interesting properties. It can be +{\em page file independent}, as all information within it is +expressed in application terms instead of in terms of internal +representations. This means log entries could be sent over the +network and applied on a different system, providing a simple +generalization of log-based replication schemes. + +While in transit, various transformations could be applied. LRVM's +log merging optimizations~\cite{LRVM} are one such possibility. +Replication and partitioning schemes are another possibility.
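The queue-based traversal described above can be sketched as a toy in-memory model. The graph, the page geometry (two nodes per page, two pages per region), and every identifier below are assumptions made for illustration; \yad's implementation operates against the page file through the real multiplexer, not arrays in memory.

```c
#include <assert.h>

#define NODES 8
#define NODES_PER_PAGE 2
#define NUM_QUEUES 2
#define PAGES_PER_REGION 2

/* Toy graph: two outgoing edges per node; node 0 reaches all nodes. */
static const int adj[NODES][2] = {
    {1, 4}, {2, 5}, {3, 6}, {7, 7},
    {5, 0}, {6, 1}, {7, 2}, {3, 3}
};

static int q[NUM_QUEUES][NODES * 4]; /* multiplexer output queues */
static int qlen[NUM_QUEUES];
static int visited[NODES];

/* Region policy: nodes on nearby pages share an output queue.  (The
 * hash-of-page-offset policy would replace the division by a hash.) */
static int queue_of(int node) {
    int page = node / NODES_PER_PAGE;
    return (page / PAGES_PER_REGION) % NUM_QUEUES;
}

static void push(int node) {
    int i = queue_of(node);
    q[i][qlen[i]++] = node;
}

/* Drain one queue at a time: every request in a queue addresses the
 * same region of the page file, so its locality set stays cached.
 * Order within a queue is irrelevant, so we pop LIFO for simplicity. */
static void traverse(int root) {
    push(root);
    for (;;) {
        int i, progress = 0;
        for (i = 0; i < NUM_QUEUES; i++) {
            while (qlen[i] > 0) {
                int n = q[i][--qlen[i]];
                progress = 1;
                if (visited[n]) continue;
                visited[n] = 1;
                push(adj[n][0]); /* do not follow edges immediately; */
                push(adj[n][1]); /* just enqueue their targets       */
            }
        }
        if (!progress) return; /* all queues empty: traversal done */
    }
}
```

When `traverse` returns, every node reachable from the root has been visited, and the hope is that each region's pages were faulted in once per queue drain rather than once per edge followed.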
If a +lock manager is not in use and a consistent order is imposed upon +requests,\footnote{and we assume all failures are recoverable or +masked with redundancy} then we can remove replicas' ability to +unilaterally abort transactions, allowing requests to commit as they +propagate through the network, but before actually being applied to +page files. + +%However, most of \yad's current functionality focuses upon the single +%node case, so we decided to choose a single node optimization for this +%section, and leave networked logical logging to future work. To this +%end, we implemented a log multiplexing primitive which splits log +%entries into multiple logs according to the value returned by a +%callback function. (Figure~\ref{fig:mux}) \subsection {Performance Evaluation}