All sections are now filled in again. (What else is missing???)

This commit is contained in:
Sears Russell 2006-04-24 17:33:06 +00:00
parent 63c031e34c
commit 5a265805f4


@@ -1266,17 +1266,18 @@ If requests can be partitioned in a natural way, load
balancing can be implemented by splitting requests across many nodes.
Similarly, a node can easily service streams of requests from multiple
nodes by combining them into a single log, and processing the log
using operation implementations. For example, this type of optimization
is used by RVM's log-merging operations~\cite{rvm}.
Furthermore, application-specific
procedures that are analogous to standard relational algebra methods
(join, project and select) could be used to efficiently transform the data
before it reaches the page file, while it is still laid out sequentially
in non-transactional memory.
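For instance, a select-style pass over a batch of log entries, applied
while they are still in memory, might look like the following sketch
(the types and names are ours, not \yads):

\begin{verbatim}
/* Sketch only; log_entry_t, pred_t, and the field names are
 * hypothetical. Compacts a batch of logical log entries in
 * place, keeping those that satisfy `keep', before the batch
 * is replayed into the page file. */
#include <stddef.h>

typedef struct { int op; long key; char payload[32]; } log_entry_t;
typedef int (*pred_t)(const log_entry_t *);

size_t log_select(log_entry_t *log, size_t n, pred_t keep) {
  size_t out = 0;
  for (size_t i = 0; i < n; i++)
    if (keep(&log[i]))
      log[out++] = log[i];
  return out;  /* new entry count */
}
\end{verbatim}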
%Note that read-only operations do not necessarily generate log
%entries. Therefore, applications may need to implement custom
%operations to make use of the ideas in this section.
Although \yad has rudimentary support for a two-phase-commit-based
cluster hash table, we have not yet implemented networking primitives for logical logs.
@@ -1284,20 +1285,48 @@ Therefore, we implemented a single node log reordering scheme that increases req
during the traversal of a random graph. The graph traversal system
takes a sequence of (read) requests, and partitions them using some
function. It then processes each partition in isolation from the
others. We considered two partitioning functions. The first divides the page file
up into equally sized contiguous regions, which enables locality. The second takes the hash
of the page's offset in the file, which enables load balancing.
%% The second policy is interesting
%The first, partitions the
%requests according to the hash of the node id they refer to, and would be useful for load balancing over a network.
%(We expect the early phases of such a traversal to be bandwidth, not
%latency limited, as each node would stream large sequences of
%asynchronous requests to the other nodes.)
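The sketch below (our code; the names and constants are hypothetical,
not \yads interfaces) illustrates both policies. Sizing the regions so
that each fits in the buffer pool is what provides locality:

\begin{verbatim}
/* Hypothetical sketch of the two partitioning policies. Each
 * request is tagged with the page-file offset it touches. */
#include <stdint.h>

enum { NUM_PARTITIONS = 16 };

/* Policy 1: equal contiguous regions -> locality. region_pages
 * is chosen so that one region fits in the buffer pool. */
static int partition_by_region(uint64_t page, uint64_t region_pages) {
  return (int)((page / region_pages) % NUM_PARTITIONS);
}

/* Policy 2: hash of the page offset -> load balancing. */
static int partition_by_hash(uint64_t page) {
  return (int)((page * 2654435761ULL) % NUM_PARTITIONS);
}

/* The traversal then drains one partition's request queue at a
 * time, in isolation from the others. */
\end{verbatim}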
The first partitioning function, which was used in our benchmarks,
partitions requests by their position in the page file. We chose the
partition size so that each partition fits in \yads buffer pool,
ensuring locality.
We ran two experiments. Both stored a graph of fixed-size objects in
the growable array implementation that is used as our linear
hash table's bucket list.
The first experiment (Figure~\ref{fig:oo7})
is loosely based on the OO7 database benchmark~\cite{oo7}. We
hardcode the out-degree of each node and use a directed graph. OO7
constructs graphs by first connecting nodes together into a ring.
It then randomly adds edges between the nodes until the desired
out-degree is obtained. This structure ensures graph connectivity.
If the nodes are laid out in ring order on disk, it also ensures that
one edge from each node has good locality while the others generally
have poor locality.
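A minimal generator in this style might look as follows (our sketch;
the constants are illustrative):

\begin{verbatim}
/* Sketch of an OO7-style generator (ours, not the benchmark's
 * code): one ring edge per node guarantees connectivity, and
 * random edges fill out the fixed out-degree. */
#include <stdlib.h>

enum { N = 1000, OUT_DEGREE = 3 };
static int edge[N][OUT_DEGREE];

static void build_graph(unsigned seed) {
  srand(seed);
  for (int i = 0; i < N; i++) {
    edge[i][0] = (i + 1) % N;        /* ring edge: good locality   */
    for (int j = 1; j < OUT_DEGREE; j++)
      edge[i][j] = rand() % N;       /* random edge: poor locality */
  }
}
\end{verbatim}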
The second experiment explicitly measures the effect of graph locality
on our optimization (Figure~\ref{fig:hotGraph}). It extends the idea
of a hot set to graph generation. Each node has a distinct hot set
which includes the 10\% of the nodes that are closest to it in ring
order. The remaining nodes are in the cold set. We use random edges
instead of ring edges for this test. This does not ensure graph
connectivity, but we used the same random seeds for the two systems.
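Continuing the sketch above, hot-set edge selection might look as
follows; the locality knob {\tt p\_hot} is our hypothetical parameter:

\begin{verbatim}
/* Sketch (continues the generator above): with probability
 * p_hot an edge lands in the 10% of nodes nearest src in ring
 * order; otherwise it goes to a uniformly random (cold) node. */
static int pick_target(int src, double p_hot) {
  int span = N / 10;                      /* hot set: 10% of nodes */
  if (rand() / (double)RAND_MAX < p_hot) {
    int delta = rand() % span - span / 2; /* near src in ring order */
    return ((src + delta) % N + N) % N;
  }
  return rand() % N;                      /* cold: uniform random */
}
\end{verbatim}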
When the graph has good locality, a normal depth-first search
traversal and the prioritized traversal both perform well. The
prioritized traversal is slightly slower due to the overhead of extra
log manipulation. As locality decreases, the partitioned traversal
algorithm outperforms the naive traversal.
\rcs{ This really needs more experimental setup... look at older draft! }
\subsection{LSN-Free pages}
\label{sec:zeroCopy}
@@ -1480,7 +1509,9 @@ provided us with invaluable feedback.
Additional information and \yads source code are available at:
\begin{center}
{\tt http://\yad.sourceforge.net/}
{\small{\tt http://www.cs.berkeley.edu/\ensuremath{\sim}sears/\yad/}}
%{\tt http://www.cs.berkeley.edu/sears/\yad/}
\end{center}
{\footnotesize \bibliographystyle{acm}