All sections are now filled in again. (What else is missing???)

2006-04-24 17:33:06 +00:00
commit 5a265805f4
parent 63c031e34c
1 changed files with 48 additions and 17 deletions
--- a/doc/paper3/LLADD.tex
+++ b/doc/paper3/LLADD.tex
@@ -1266,17 +1266,18 @@ If requests can be partitioned in a natural way, load
 balancing can be implemented by splitting requests across many nodes.
 Similarly, a node can easily service streams of requests from multiple
 nodes by combining them into a single log, and processing the log
-using operaiton implementations.  
+using operation implementations.  For example, this type of optimization 
 is used by RVM's log-merging operations~\cite{rvm}.
 Furthermore, application-specific
 procedures that are analogous to standard relational algebra methods
 (join, project and select) could be used to efficiently transform the data
-before it reaches the page file, while it is layed out sequentially
+while it is still laid out sequentially
 in non-transactional memory.
-Note that read-only operations do not necessarily generate log
+%Note that read-only operations do not necessarily generate log
-entries.  Therefore, applications may need to implement custom
+%entries.  Therefore, applications may need to implement custom
-operations to make use of the ideas in this section.
+%operations to make use of the ideas in this section.
 Although \yad has rudimentary support for a two-phase commit based
 cluster hash table, we have not yet implemented networking primitives for logical logs.
@@ -1284,20 +1285,48 @@ Therefore, we implemented a single node log reordering scheme that increases req
 during the traversal of a random graph.  The graph traversal system
 takes a sequence of (read) requests, and partitions them using some
 function.  It then processes each partition in isolation from the
-others.  We considered two partitioning functions.  The first, partitions the
+others.  We considered two partitioning functions.  The first divides the page file
-requests according to the hash of the node id they refer to, and would be useful for load balancing over a network.
+up into equally sized contiguous regions, which enables locality.  The second takes the hash
-(We expect the early phases of such a traversal to be bandwidth, not
+of the page's offset in the file, which enables load balancing.
-latency limited, as each node would stream large sequences of
+%%  The second policy is interesting
-asynchronous requests to the other nodes.) 
+%The first, partitions the
 %requests according to the hash of the node id they refer to, and would be useful for load balancing over a network.
 %(We expect the early phases of such a traversal to be bandwidth, not
 %latency limited, as each node would stream large sequences of
 %asynchronous requests to the other nodes.) 
 The second partitioning function, which was used in our benchmarks,
-partitions requests by their position in the page
+partitions requests by their position in the page file.  We chose the
-file.  We ran two experiments.  The first, presented in Figure~\ref{fig:oo7} is loosely based on the oo7 database benchmark.~\cite{oo7}.  The second explicitly measures the effect of graph locality on our optimization. (Figure~\ref{fig:hotGraph})  When the graph has good locality, a normal depth first search
+partition size so that each partition can fit in \yads buffer pool,
-traversal and the prioritized traversal performs well.  As locality
+ensuring locality.
-decreases, the partitioned traversal algorithm's performance degrades
+
-less than the naive traversal.
+We ran two experiments.  Both stored a graph of fixed size objects in
 the growable array implementation that is used as our linear
 hashtable's bucket list.
 The first experiment (Figure~\ref{fig:oo7})
 is loosely based on the OO7 database benchmark~\cite{oo7}.  We
 hardcode the out-degree of each node, and use a directed graph.  OO7
 constructs graphs by first connecting nodes together into a ring.
 It then randomly adds edges between the nodes until the desired
 out-degree is obtained.  This structure ensures graph connectivity.
 If the nodes are laid out in ring order on disk, it also ensures that
 one edge from each node has good locality while the others generally
 have poor locality.
 The second experiment explicitly measures the effect of graph locality
 on our optimization (Figure~\ref{fig:hotGraph}).  It extends the idea
 of a hot set to graph generation.  Each node has a distinct hot set
 which includes the 10\% of the nodes that are closest to it in ring
 order.  The remaining nodes are in the cold set.  We use random edges
 instead of ring edges for this test.  This does not ensure graph
 connectivity, but we used the same random seeds for the two systems.
 When the graph has good locality, a normal depth-first search
 traversal and the prioritized traversal both perform well.  The
 prioritized traversal is slightly slower due to the overhead of extra
 log manipulation.  As locality decreases, the partitioned traversal
 algorithm outperforms the naive traversal.
 \rcs{ This really needs more experimental setup... look at older draft! }
 \subsection{LSN-Free pages}
 \label{sec:zeroCopy}
@@ -1480,7 +1509,9 @@ provided us with invaluable feedback.
 Additional information and \yads source code are available at:
 \begin{center}
-{\tt http://\yad.sourceforge.net/}
+%{\tt http://www.cs.berkeley.edu/sears/\yad/}
 {\small{\tt http://www.cs.berkeley.edu/\ensuremath{\sim}sears/\yad/}}
 %{\tt http://www.cs.berkeley.edu/sears/\yad/}
 \end{center}
 {\footnotesize \bibliographystyle{acm}