diff --git a/doc/paper3/LLADD.tex b/doc/paper3/LLADD.tex
index a55ef74..a1fd07c 100644
--- a/doc/paper3/LLADD.tex
+++ b/doc/paper3/LLADD.tex
@@ -1266,17 +1266,18 @@
 If requests can be partitioned in a natural way, load
 balancing can be implemented by splitting requests across many nodes.
 Similarly, a node can easily service streams of requests from multiple
 nodes by combining them into a single log, and processing the log
-using operaiton implementations.
+using operation implementations. For example, this type of optimization
+is used by RVM's log-merging operations~\cite{rvm}.
 
 Furthermore, application-specific procedures that are analagous to standard relational algebra methods
 (join, project and select) could be used to efficiently transform the data
-before it reaches the page file, while it is layed out sequentially
+while it is still laid out sequentially
 in non-transactional memory.
 
-Note that read-only operations do not necessarily generate log
-entries. Therefore, applications may need to implement custom
-operations to make use of the ideas in this section.
+%Note that read-only operations do not necessarily generate log
+%entries. Therefore, applications may need to implement custom
+%operations to make use of the ideas in this section.
 
 Although \yad has rudimentary support for a two-phase commit based
 cluster hash table, we have not yet implemented networking primitives for logical logs.
@@ -1284,20 +1285,48 @@
 Therefore, we implemented a single node log reordering scheme that increases request locality
 during the traversal of a random graph. The graph traversal system
 takes a sequence of (read) requests, and partitions them using some
 function. It then proceses each partition in isolation from the
-others. We considered two partitioning functions. The first, partitions the
-requests according to the hash of the node id they refer to, and would be useful for load balancing over a network.
-(We expect the early phases of such a traversal to be bandwidth, not
-latency limited, as each node would stream large sequences of
-asynchronous requests to the other nodes.)
+others. We considered two partitioning functions. The first divides the page file
+into equally sized contiguous regions, which preserves locality. The second takes the hash
+of the page's offset in the file, which enables load balancing.
+%% The second policy is interesting
+%The first, partitions the
+%requests according to the hash of the node id they refer to, and would be useful for load balancing over a network.
+%(We expect the early phases of such a traversal to be bandwidth, not
+%latency limited, as each node would stream large sequences of
+%asynchronous requests to the other nodes.)
 
 The second partitioning function, which was used in our benchmarks,
-partitions requests by their position in the page
-file. We ran two experiments. The first, presented in Figure~\ref{fig:oo7} is loosely based on the oo7 database benchmark.~\cite{oo7}. The second explicitly measures the effect of graph locality on our optimization. (Figure~\ref{fig:hotGraph}) When the graph has good locality, a normal depth first search
-traversal and the prioritized traversal performs well. As locality
-decreases, the partitioned traversal algorithm's performance degrades
-less than the naive traversal.
+partitions requests by their position in the page file. We chose the
+partition size so that each partition fits in \yads buffer pool,
+ensuring locality.
+
+We ran two experiments. Both stored a graph of fixed-size objects in
+the growable array implementation that is used as our linear
+hash table's bucket list.
+The first experiment (Figure~\ref{fig:oo7})
+is loosely based on the OO7 database benchmark~\cite{oo7}. We
+hardcode the out-degree of each node, and use a directed graph. OO7
+constructs graphs by first connecting nodes together into a ring.
+It then randomly adds edges between the nodes until the desired
+out-degree is obtained. This structure ensures graph connectivity.
+If the nodes are laid out in ring order on disk, it also ensures that
+one edge from each node has good locality while the others generally
+have poor locality.
+
+The second experiment explicitly measures the effect of graph locality
+on our optimization (Figure~\ref{fig:hotGraph}). It extends the idea
+of a hot set to graph generation. Each node has a distinct hot set
+that includes the 10\% of the nodes that are closest to it in ring
+order. The remaining nodes are in the cold set. We use random edges
+instead of ring edges for this test. This does not ensure graph
+connectivity, but we used the same random seeds for the two systems.
+
+When the graph has good locality, a normal depth first search
+traversal and the prioritized traversal both perform well. The
+prioritized traversal is slightly slower due to the overhead of extra
+log manipulation. As locality decreases, the partitioned traversal
+algorithm increasingly outperforms the naive traversal.
 
-\rcs{ This really needs more experimental setup... look at older draft! }
 \subsection{LSN-Free pages}
 \label{sec:zeroCopy}
@@ -1480,7 +1509,9 @@
 provided us with invaluable feedback.
 Additional information, and \yads source code is available at:
 \begin{center}
-{\tt http://\yad.sourceforge.net/}
+%{\tt http://www.cs.berkeley.edu/sears/\yad/}
+{\small{\tt http://www.cs.berkeley.edu/\ensuremath{\sim}sears/\yad/}}
+%{\tt http://www.cs.berkeley.edu/sears/\yad/}
 \end{center}
 {\footnotesize
 \bibliographystyle{acm}
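Editor's note: for readers of the patch, a minimal sketch of the two request-partitioning policies the new text describes may help. This is illustrative C, not code from \yad; the function names, signatures, and the multiplicative hash constant are all assumptions.

#include <stdint.h>

/* Policy 1: contiguous regions. Dividing the page file into equally
 * sized regions keeps each partition's pages physically adjacent;
 * sizing a region to fit in the buffer pool preserves locality while
 * that partition's requests are replayed. */
static inline uint32_t partition_by_region(uint64_t page_offset,
                                           uint64_t pages_per_region) {
    return (uint32_t)(page_offset / pages_per_region);
}

/* Policy 2: hash of the page's offset. Spreads requests roughly
 * uniformly across partitions, which suits load balancing across
 * nodes rather than cache locality. Any reasonable integer hash
 * works; a multiplicative (Fibonacci) hash is shown. */
static inline uint32_t partition_by_hash(uint64_t page_offset,
                                         uint32_t n_partitions) {
    uint64_t h = page_offset * 11400714819323198485ULL; /* ~2^64/phi */
    return (uint32_t)(h >> 32) % n_partitions;
}

The first policy is the one the benchmarks use; the second corresponds to the load-balancing use case the patch moves into comments.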
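The two graph generators are simple enough to sketch as well. Again, this is hypothetical C under stated assumptions: the benchmark's actual object layout and RNG are not given in the text, and the knob used to vary locality (here a hot-edge probability, p_hot) is a guess at how "locality decreases" between runs.

#include <stdlib.h>

typedef struct { int n_out; int *out; } node_t;

/* OO7-style generator: one ring edge per node ensures connectivity
 * (and good locality when nodes are stored in ring order); the rest
 * of the hardcoded out-degree is filled with uniformly random edges. */
void build_oo7_graph(node_t *g, int n, int out_degree) {
    for (int i = 0; i < n; i++) {
        g[i].n_out = out_degree;
        g[i].out = malloc(out_degree * sizeof(int));
        g[i].out[0] = (i + 1) % n;            /* ring edge */
        for (int e = 1; e < out_degree; e++)
            g[i].out[e] = rand() % n;         /* random edge */
    }
}

/* Hot-set generator: no ring edges, so connectivity is not
 * guaranteed. Each node's hot set is the 10% of nodes nearest to it
 * in ring order; p_hot (an assumed parameter) is the chance an edge
 * lands in the hot set, and lowering it lowers graph locality. */
void build_hotset_graph(node_t *g, int n, int out_degree, double p_hot) {
    int hot = n / 10;                          /* 10% hot set */
    for (int i = 0; i < n; i++) {
        g[i].n_out = out_degree;
        g[i].out = malloc(out_degree * sizeof(int));
        for (int e = 0; e < out_degree; e++) {
            if ((double)rand() / RAND_MAX < p_hot) {
                int off = rand() % hot - hot / 2;      /* within +/-5% */
                g[i].out[e] = ((i + off) % n + n) % n; /* near i in ring order */
            } else {
                g[i].out[e] = rand() % n;              /* cold: anywhere */
            }
        }
    }
}

Seeding rand() identically before each run makes both traversal strategies see the same graph, matching the patch's note that the same random seeds were used for the two systems.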