bucket map, bucket overflow
parent
823af05adc
commit
1b8c20a882
1 changed files with 37 additions and 33 deletions
|
@ -1435,8 +1435,6 @@ partition the array. Since we expect relatively few partitions (one
|
||||||
per enlargement typically), this leads to an efficient map. We use a
|
per enlargement typically), this leads to an efficient map. We use a
|
||||||
single ``header'' page to store the list of intervals and their sizes.
|
single ``header'' page to store the list of intervals and their sizes.
|
||||||
|
|
||||||
%We use fixed-sized buckets, which allows us to treat a region of pages
|
|
||||||
% as an array of buckets.
|
|
||||||
For space efficiency, the array elements themselves are stored using
|
For space efficiency, the array elements themselves are stored using
|
||||||
the fixed-size record page layout. Thus, we use the header page to
|
the fixed-size record page layout. Thus, we use the header page to
|
||||||
find the right interval, and then index into it to get the $(page,
|
find the right interval, and then index into it to get the $(page,
|
||||||
|
@ -1478,26 +1476,31 @@ record.
|
||||||
|
|
||||||
\begin{figure}
|
\begin{figure}
|
||||||
\includegraphics[width=3.25in]{LHT2.pdf}
|
\includegraphics[width=3.25in]{LHT2.pdf}
|
||||||
\caption{\label{fig:LHT}Structure of locality preserving ({\em Page Oriented})
|
\caption{\label{fig:LHT}Structure of locality preserving ({\em page-oriented})
|
||||||
linked lists. Hashtable bucket overflow lists tend to be of some small fixed
|
linked lists. Hashtable bucket overflow lists tend to be of some small fixed
|
||||||
length. This data structure allows \yad to aggressively maintain page locality
|
length. This data structure allows \yad to aggressively maintain page locality
|
||||||
for short lists, providing fast overflow bucket traversal for the hash table.}
|
for short lists, providing fast overflow bucket traversal for the hash table.}
|
||||||
\end{figure}
|
\end{figure}
|
||||||
|
|
||||||
For simplicity, the entries in the bucket list described above are
|
Given the map, which locates the bucket, we need a transactional
|
||||||
fixed length. Therefore, we store recordids in the bucket
|
linked list for the contents of the bucket. The trivial implementation
|
||||||
list and set these recordid pointers to point to lists
|
would just link variable-size records together, where each record
|
||||||
of variable length $(key, value)$ pairs.
|
contains a $(key,value)$ pair and the $next$ pointer, which is just a
|
||||||
In order to achieve good locality for overflow entries we represent
|
$(page,slot)$ address.
|
||||||
each list as a list of smaller lists. The main list links pages together, and the smaller
|
|
||||||
lists each reside within a single page (Figure~\ref{fig:LHT}).
|
|
||||||
We reuse \yad's slotted page space allocation routines to deal with
|
|
||||||
the low-level details of space allocation and reuse within each page.
|
|
||||||
|
|
||||||
All of the entries within a single page may be traversed without
|
However, in order to achieve good locality, we instead implement a
|
||||||
|
{\em page-oriented} transactional linked list, shown in
|
||||||
|
Figure~\ref{fig:LHT}. The basic idea is to place adjacent elements of
|
||||||
|
the list on the same page: thus we use a list of lists. The main list
|
||||||
|
links pages together, while the smaller lists each reside within a single
|
||||||
|
page. \yad's slotted pages allow the smaller lists to support
|
||||||
|
variable-size values, and allow list reordering and value resizing
|
||||||
|
with a single log entry (since everything is on one page).
|
||||||
|
|
||||||
|
In addition, all of the entries within a page may be traversed without
|
||||||
unpinning and repinning the page in memory, providing very fast
|
unpinning and repinning the page in memory, providing very fast
|
||||||
traversal over lists that have good locality. This optimization would
|
traversal over lists that have good locality. This optimization would
|
||||||
not be possible if it were not for the low level interfaces provided
|
not be possible if it were not for the low-level interfaces provided
|
||||||
by the buffer manager. In particular, we need to specify which page
|
by the buffer manager. In particular, we need to specify which page
|
||||||
we would like to allocate space from and we need to be able to
|
we would like to allocate space from and we need to be able to
|
||||||
read and write multiple records with a single call to pin/unpin. Due to
|
read and write multiple records with a single call to pin/unpin. Due to
|
||||||
|
@ -1506,26 +1509,27 @@ for short lists, it can also be used on its own.
|
||||||
|
|
||||||
\subsection{Concurrency}
|
\subsection{Concurrency}
|
||||||
|
|
||||||
Given the structures described above, the implementation of a linear hash
|
Given the structures described above, the implementation of a linear
|
||||||
table is straightforward. A linear hash function is used to map keys
|
hash table is straightforward. A linear hash function is used to map
|
||||||
to buckets, insertions and deletions are handled by the array implementation,
|
keys to buckets, insertions and deletions are handled by the ArrayList
|
||||||
%linked list implementation,
|
implementation, and the table can be extended lazily by
|
||||||
and the table can be extended lazily by transactionally removing items
|
transactionally removing items from one bucket and adding them to
|
||||||
from one bucket and adding them to another.
|
another.
|
||||||
|
|
||||||
Given that the underlying data structures are transactional and there
|
Given that the underlying data structures are transactional and a
|
||||||
are never any concurrent transactions, this is actually all that is
|
single lock protects the hashtable, this is actually all that is needed
|
||||||
needed to complete the linear hash table implementation.
|
to complete the linear hash table implementation. Unfortunately, as
|
||||||
Unfortunately, as we mentioned in Section~\ref{nested-top-actions},
|
we mentioned in Section~\ref{nested-top-actions}, things become a bit
|
||||||
things become a bit more complex if we allow interleaved transactions.
|
more complex if we allow interleaved transactions. The solution for
|
||||||
|
the default hashtable is simply to follow the recipe for Nested
|
||||||
We simply apply Nested Top Actions according to the recipe
|
Top Actions, and only lock the whole table during structural changes.
|
||||||
described in that section and lock the entire hashtable for each
|
We explore a version with finer-grained locking below.
|
||||||
operation. This prevents the hashtable implementation from fully
|
%This prevents the
|
||||||
exploiting multiprocessor systems,\footnote{\yad passes regression
|
%hashtable implementation from fully exploiting multiprocessor
|
||||||
tests on multiprocessor systems.} but seems to be adequate on single
|
%systems,\footnote{\yad passes regression tests on multiprocessor
|
||||||
processor machines (Figure~\ref{fig:TPS}).
|
%systems.} but seems to be adequate on single processor machines
|
||||||
We describe a finer grained concurrency mechanism below.
|
%(Figure~\ref{fig:TPS}).
|
||||||
|
%we describe a finer-grained concurrency mechanism below.
|
||||||
|
|
||||||
%We have found a simple recipe for converting a non-concurrent data structure into a concurrent one, which involves three steps:
|
%We have found a simple recipe for converting a non-concurrent data structure into a concurrent one, which involves three steps:
|
||||||
%\begin{enumerate}
|
%\begin{enumerate}
|
||||||
|
|
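The page-oriented list that this change describes (a main list linking pages, with a smaller list inside each page) can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `Page`, `PageOrientedList`, the per-page `capacity`, and the `pins` counter are hypothetical names invented for illustration, not part of \yad's API, and the model ignores logging, recovery, and deletion entirely.

```python
class Page:
    """Hypothetical stand-in for a slotted page holding a small list."""

    def __init__(self, page_id, capacity):
        self.page_id = page_id
        self.capacity = capacity   # assumed entry limit; real pages are bounded by bytes
        self.entries = {}          # the smaller, in-page list of (key, value) pairs
        self.next_page = None      # link in the main, page-to-page list


class PageOrientedList:
    """List of lists: adjacent elements are kept on the same page."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.head = Page(0, capacity)
        self.page_count = 1
        self.pins = 0              # counts simulated pin/unpin pairs

    def _pages(self):
        page = self.head
        while page is not None:
            self.pins += 1         # one pin per page, however many entries it holds
            yield page
            page = page.next_page

    def insert(self, key, value):
        """Store the pair on the first page with room; return that page's id."""
        last = None
        for page in self._pages():
            if key in page.entries or len(page.entries) < page.capacity:
                page.entries[key] = value
                return page.page_id
            last = page
        # Every existing page is full: allocate a page and link it at the tail.
        new_page = Page(self.page_count, self.capacity)
        self.page_count += 1
        last.next_page = new_page
        new_page.entries[key] = value
        return new_page.page_id

    def lookup(self, key):
        """Return the value for key, or None if the key is absent."""
        for page in self._pages():
            if key in page.entries:
                return page.entries[key]
        return None


overflow = PageOrientedList(capacity=2)
for k, v in [("a", 1), ("b", 2), ("c", 3)]:
    overflow.insert(k, v)          # "a" and "b" land on page 0, "c" on page 1
```

In this model a lookup pins each page at most once regardless of how many entries it holds, which is the locality benefit the text attributes to short overflow lists.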