sec 3
parent
2d2e8cef0c
commit
88c19d1880
1 changed files with 63 additions and 62 deletions
|
@ -457,14 +457,14 @@ relatively easy to see how they would map onto \yad.
|
||||||
|
|
||||||
This section describes how existing write-ahead logging protocols
|
This section describes how existing write-ahead logging protocols
|
||||||
implement the four properties of transactional storage: Atomicity,
|
implement the four properties of transactional storage: Atomicity,
|
||||||
Consistency, Isolation and Durability. \yad provides these four
|
Consistency, Isolation and Durability. \yad provides these
|
||||||
properties to applications but also allows applications to opt out of
|
properties but also allows applications to opt out of
|
||||||
certain of properties as appropriate. This can be useful for
|
them as appropriate. This can be useful for
|
||||||
performance reasons or to simplify the mapping between application
|
performance reasons or to simplify the mapping between application
|
||||||
semantics and the storage layer. Unlike prior work, \yad also exposes
|
semantics and the storage layer. Unlike prior work, \yad also exposes
|
||||||
the primitives described below to application developers, allowing
|
the primitives described below to application developers, allowing
|
||||||
unanticipated optimizations to be implemented and allowing low-level
|
unanticipated optimizations and allowing low-level
|
||||||
behavior such as recovery semantics to be customized on a
|
behavior, such as recovery semantics, to be customized on a
|
||||||
per-application basis.
|
per-application basis.
|
||||||
|
|
||||||
The write-ahead logging algorithm we use is based upon ARIES, but
|
The write-ahead logging algorithm we use is based upon ARIES, but
|
||||||
|
@ -483,7 +483,7 @@ will be protected according to the ACID properties mentioned above.
|
||||||
%reversible, implying that any information that is needed in order to
|
%reversible, implying that any information that is needed in order to
|
||||||
%reverse the action must be stored for future use.
|
%reverse the action must be stored for future use.
|
||||||
Typically, the
|
Typically, the
|
||||||
information necessary to redo and undo each action is stored in the
|
information necessary to REDO and UNDO each action is stored in the
|
||||||
log. We refine this concept and explicitly discuss {\em operations},
|
log. We refine this concept and explicitly discuss {\em operations},
|
||||||
which must be atomically applicable to the page file.
|
which must be atomically applicable to the page file.
|
||||||
|
|
||||||
|
@ -495,8 +495,8 @@ to build. In Section~\ref{nested-top-actions}, we explain how to
|
||||||
handle operations that span pages.
|
handle operations that span pages.
|
||||||
|
|
||||||
One unique aspect of \yad, which is not true for ARIES, is that {\em
|
One unique aspect of \yad, which is not true for ARIES, is that {\em
|
||||||
normal} operations are defined in terms of redo and undo
|
normal} operations are defined in terms of REDO and UNDO
|
||||||
functions. There is no way to modify the page except via the redo
|
functions. There is no way to modify the page except via the REDO
|
||||||
function.\footnote{Actually, even this can be overridden, but doing so
|
function.\footnote{Actually, even this can be overridden, but doing so
|
||||||
complicates recovery semantics, and should only be done as a last
|
complicates recovery semantics, and should only be done as a last
|
||||||
resort. Currently, this is only done to implement the \oasys flush()
|
resort. Currently, this is only done to implement the \oasys flush()
|
||||||
|
@ -504,9 +504,9 @@ and update() operations described in Section~\ref{OASYS}.} This has
|
||||||
the nice property that the REDO code is known to work, since the
|
the nice property that the REDO code is known to work, since the
|
||||||
original operation was the exact same ``redo''. In general, the \yad
|
original operation was the exact same ``redo''. In general, the \yad
|
||||||
philosophy is that you define operations in terms of their REDO/UNDO
|
philosophy is that you define operations in terms of their REDO/UNDO
|
||||||
behavior, and then build a user friendly {\em wrapper} interface
|
behavior, and then build a user-friendly {\em wrapper} interface
|
||||||
around them. The value of \yad is that it provides a skeleton that
|
around them. The value of \yad is that it provides a skeleton that
|
||||||
invokes the redo/undo functions at the {\em right} time, despite
|
invokes the REDO/UNDO functions at the {\em right} time, despite
|
||||||
concurrency, crashes, media failures, and aborted transactions. Also
|
concurrency, crashes, media failures, and aborted transactions. Also
|
||||||
unlike ARIES, \yad refines the concept of the wrapper interface,
|
unlike ARIES, \yad refines the concept of the wrapper interface,
|
||||||
making it possible to reschedule operations according to an
|
making it possible to reschedule operations according to an
|
||||||
|
@ -521,8 +521,9 @@ We allow transactions to be interleaved, allowing concurrent access to
|
||||||
application data and exploiting opportunities for hardware
|
application data and exploiting opportunities for hardware
|
||||||
parallelism. Therefore, each action must assume that the
|
parallelism. Therefore, each action must assume that the
|
||||||
physical data upon which it relies may contain uncommitted
|
physical data upon which it relies may contain uncommitted
|
||||||
information and that this information may have been produced by a
|
information that might be undone due to a crash or an abort.
|
||||||
transaction that will be aborted by a crash or by the application.
|
%and that this information may have been produced by a
|
||||||
|
%transaction that will be aborted by a crash or by the application.
|
||||||
%(The latter is actually harder, since there is no ``fate sharing''.)
|
%(The latter is actually harder, since there is no ``fate sharing''.)
|
||||||
|
|
||||||
% Furthermore, aborting
|
% Furthermore, aborting
|
||||||
|
@ -554,7 +555,7 @@ For locking, due to the variety of locking protocols available, and
|
||||||
their interaction with application
|
their interaction with application
|
||||||
workloads~\cite{multipleGenericLocking}, we leave it to the
|
workloads~\cite{multipleGenericLocking}, we leave it to the
|
||||||
application to decide what degree of isolation is
|
application to decide what degree of isolation is
|
||||||
appropriate. Section~\ref{lock-manager} presents the Lock Manager API.
|
appropriate. Section~\ref{lock-manager} presents the Lock Manager.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -563,15 +564,15 @@ appropriate. Section~\ref{lock-manager} presents the Lock Manager API.
|
||||||
\label{log-manager}
|
\label{log-manager}
|
||||||
|
|
||||||
All actions performed by a committed transaction must be
|
All actions performed by a committed transaction must be
|
||||||
restored in the case of a crash, and all actions performed by aborting
|
restored in the case of a crash, and all actions performed by aborted
|
||||||
transactions must be undone. In order for \yad to arrange for this
|
transactions must be undone. In order to arrange for this
|
||||||
to happen at recovery, operations must produce log entries that contain
|
to happen at recovery, operations must produce log entries that contain
|
||||||
all information necessary for undo and redo.
|
all information necessary for REDO and UNDO.
|
||||||
|
|
||||||
An important concept in ARIES is the ``log sequence number'' or {\em
|
An important concept in ARIES is the ``log sequence number'' or {\em
|
||||||
LSN}. An LSN is essentially a virtual timestamp that goes on every
|
LSN}. An LSN is essentially a virtual timestamp that goes on every
|
||||||
page; it marks the last log entry that is reflected on the page and
|
page; it marks the last log entry that is reflected on the page and
|
||||||
implies that all previous log entries are also reflected. Given the
|
implies that {\em all previous log entries} are also reflected. Given the
|
||||||
LSN, \yad calculates where to start playing back the log to bring the
|
LSN, \yad calculates where to start playing back the log to bring the
|
||||||
page up to date. The LSN is stored in the page that it refers to so
|
page up to date. The LSN is stored in the page that it refers to so
|
||||||
that it is always written to disk atomically with the data on the
|
that it is always written to disk atomically with the data on the
|
||||||
|
@ -584,7 +585,7 @@ a increased need for buffer memory (to hold all dirty pages). Worse,
|
||||||
as we allow multiple transactions to run concurrently on the same page
|
as we allow multiple transactions to run concurrently on the same page
|
||||||
(but not typically the same item), it may be that a given page {\em
|
(but not typically the same item), it may be that a given page {\em
|
||||||
always} contains some uncommitted data and thus can never be written
|
always} contains some uncommitted data and thus can never be written
|
||||||
back to disk. To handle stolen pages, we log UNDO records that
|
back. To handle stolen pages, we log UNDO records that
|
||||||
we can use to undo the uncommitted changes in case we crash. \yad
|
we can use to undo the uncommitted changes in case we crash. \yad
|
||||||
ensures that the UNDO record is durable in the log before the
|
ensures that the UNDO record is durable in the log before the
|
||||||
page is written to disk and that the page LSN reflects this log entry.
|
page is written to disk and that the page LSN reflects this log entry.
|
||||||
|
@ -595,17 +596,10 @@ that we can use to redo the operation in case the committed version never
|
||||||
makes it to disk. \yad ensures that the REDO entry is durable in the
|
makes it to disk. \yad ensures that the REDO entry is durable in the
|
||||||
log before the transaction commits. REDO entries are physical changes
|
log before the transaction commits. REDO entries are physical changes
|
||||||
to a single page (``page-oriented redo''), and thus must be redone in
|
to a single page (``page-oriented redo''), and thus must be redone in
|
||||||
order. Therefore, they are produced after any rescheduling or computation
|
order.
|
||||||
specific to the current state of the page file is performed.
|
% Therefore, they are produced after any rescheduling or computation
|
||||||
|
%specific to the current state of the page file is performed.
|
||||||
|
|
||||||
Eventually, the page makes it to disk, but the REDO entry is still
|
|
||||||
useful: we can use it to roll forward a single page from an archived
|
|
||||||
copy. Thus one of the nice properties of \yad, which has been tested,
|
|
||||||
is that we can handle media failures very gracefully: lost disk blocks
|
|
||||||
or even whole files can be recovered given an old version and the log.
|
|
||||||
Because pages can be recovered independently from each other, there is
|
|
||||||
no need to stop transactions to make a snapshot for archiving: any
|
|
||||||
fuzzy snapshot is fine.
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -620,7 +614,7 @@ fuzzy snapshot is fine.
|
||||||
We use the same basic recovery strategy as ARIES, which consists of
|
We use the same basic recovery strategy as ARIES, which consists of
|
||||||
three phases: {\em analysis}, {\em redo} and {\em undo}. The first,
|
three phases: {\em analysis}, {\em redo} and {\em undo}. The first,
|
||||||
analysis, is implemented by \yad, but will not be discussed in this
|
analysis, is implemented by \yad, but will not be discussed in this
|
||||||
paper. The second, redo, ensures that each redo entry is applied to
|
paper. The second, redo, ensures that each REDO entry is applied to
|
||||||
its corresponding page exactly once. The third phase, undo, rolls
|
its corresponding page exactly once. The third phase, undo, rolls
|
||||||
back any transactions that were active when the crash occurred, as
|
back any transactions that were active when the crash occurred, as
|
||||||
though the application manually aborted them with the ``abort''
|
though the application manually aborted them with the ``abort''
|
||||||
|
@ -636,21 +630,22 @@ present, it also works with a truncated log and an archive copy.}
|
||||||
Because we make no further assumptions regarding the order in which
|
Because we make no further assumptions regarding the order in which
|
||||||
pages were propagated to disk, redo must assume that any data
|
pages were propagated to disk, redo must assume that any data
|
||||||
structures, lookup tables, etc. that span more than a single page are
|
structures, lookup tables, etc. that span more than a single page are
|
||||||
in an inconsistent state. Therefore, as the redo phase re-applies the
|
in an inconsistent state.
|
||||||
information in the log to the page file, it must address all pages
|
%Therefore, as the redo phase re-applies the
|
||||||
directly.
|
%information in the log to the page file, it must address all pages
|
||||||
|
%directly.
|
||||||
|
|
||||||
This implies that the redo information for each operation in the log
|
This implies that the REDO information for each operation in the log
|
||||||
must contain the physical address (page number) of the information
|
must contain the physical address (page number) of the information
|
||||||
that it modifies, and the portion of the operation executed by a
|
that it modifies, and the portion of the operation executed by a
|
||||||
single redo log entry must only rely upon the contents of that
|
single REDO log entry must only rely upon the contents of that
|
||||||
page.
|
page.
|
||||||
% (Since we assume that pages are propagated to disk atomically,
|
% (Since we assume that pages are propagated to disk atomically,
|
||||||
%the redo phase can rely upon information contained within a single
|
%the redo phase can rely upon information contained within a single
|
||||||
%page.)
|
%page.)
|
||||||
|
|
||||||
Once redo completes, we have essentially repeated history: replaying
|
Once redo completes, we have essentially repeated history: replaying
|
||||||
all redo entries to ensure that the page file is in a physically
|
all REDO entries to ensure that the page file is in a physically
|
||||||
consistent state. However, we also replayed updates from transactions
|
consistent state. However, we also replayed updates from transactions
|
||||||
that should be aborted, as they were still in progress at the time of
|
that should be aborted, as they were still in progress at the time of
|
||||||
the crash. The final stage of recovery is the undo phase, which simply
|
the crash. The final stage of recovery is the undo phase, which simply
|
||||||
|
@ -658,6 +653,12 @@ aborts all uncommitted transactions. Since the page file is physically
|
||||||
consistent, the transactions may be aborted exactly as they would be
|
consistent, the transactions may be aborted exactly as they would be
|
||||||
during normal operation.
|
during normal operation.
|
||||||
|
|
||||||
|
One of the nice properties of ARIES, which has been tested with \yad,
|
||||||
|
is that we can handle media failures very gracefully: lost disk blocks
|
||||||
|
or even whole files can be recovered given an old version and the log.
|
||||||
|
Because pages can be recovered independently from each other, there is
|
||||||
|
no need to stop transactions to make a snapshot for archiving: any
|
||||||
|
fuzzy snapshot is fine.
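
The exactly-once behavior of redo and the tolerance for fuzzy snapshots both follow from comparing each log entry's LSN against the LSN stored on the page. The following is a minimal sketch of that comparison, not \yad's implementation: \texttt{page\_t}, \texttt{log\_entry\_t}, \texttt{redo\_pass} and \texttt{redo\_demo} are hypothetical names, and a page is reduced to a single integer so that a REDO entry is just a delta.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified page: one integer of data plus the page LSN. */
typedef struct {
    long lsn;    /* LSN of the last log entry reflected on this page */
    int  value;  /* stand-in for the page contents */
} page_t;

/* Hypothetical REDO log entry: physical page number plus a delta. */
typedef struct {
    long   lsn;    /* position of this entry in the log */
    size_t page;   /* physical address (page number) */
    int    delta;  /* change to re-apply */
} log_entry_t;

/* Replay the log in order. An entry is applied only if the page has not
 * already seen it, so each entry takes effect exactly once per page. */
static void redo_pass(page_t *pages, size_t n_pages,
                      const log_entry_t *log, size_t n_entries) {
    for (size_t i = 0; i < n_entries; i++) {
        const log_entry_t *e = &log[i];
        assert(e->page < n_pages);
        page_t *p = &pages[e->page];
        if (e->lsn > p->lsn) {
            p->value += e->delta;
            p->lsn = e->lsn;
        }
    }
}

/* Usage: pages captured at different times (a "fuzzy" snapshot)
 * converge to the same state after replay. Returns 0 on success. */
static int redo_demo(void) {
    const log_entry_t log[] = {
        {1, 0, 5}, {2, 1, 7}, {3, 0, -2}, {4, 1, 1},
    };
    /* Page 0 was archived after LSN 1; page 1 before any update. */
    page_t pages[2] = { {1, 5}, {0, 0} };
    redo_pass(pages, 2, log, 4);
    redo_pass(pages, 2, log, 4);  /* a second replay changes nothing */
    return (pages[0].value == 3 && pages[1].value == 8 &&
            pages[0].lsn == 3 && pages[1].lsn == 4) ? 0 : 1;
}
```

Because the check is made per page, the two pages in the sketch may come from different points in time, which is why no global quiescent snapshot is needed in this simplified model.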
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -684,7 +685,7 @@ parts.
|
||||||
The lower layer implements the write-ahead logging component,
|
The lower layer implements the write-ahead logging component,
|
||||||
including a buffer pool, logger, and (optionally) a lock manager.
|
including a buffer pool, logger, and (optionally) a lock manager.
|
||||||
The complexity of the write-ahead logging component lies in
|
The complexity of the write-ahead logging component lies in
|
||||||
determining exactly when the undo and redo operations should be
|
determining exactly when the UNDO and REDO operations should be
|
||||||
applied, when pages may be flushed to disk, log truncation, logging
|
applied, when pages may be flushed to disk, log truncation, logging
|
||||||
optimizations, and a large number of other data-independent extensions
|
optimizations, and a large number of other data-independent extensions
|
||||||
and optimizations. This layer is the core of \yad.
|
and optimizations. This layer is the core of \yad.
|
||||||
|
@ -869,16 +870,16 @@ that should be presented here. {\em Physical logging }
|
||||||
is the practice of logging physical (byte-level) updates
|
is the practice of logging physical (byte-level) updates
|
||||||
and the physical (page-number) addresses to which they are applied.
|
and the physical (page-number) addresses to which they are applied.
|
||||||
|
|
||||||
\rcs{Do we really need to differentiate between types of diffs applied to pages? The concept of physical redo/logical undo is probably more important...}
|
\rcs{Do we really need to differentiate between types of diffs applied to pages? The concept of physical REDO/logical UNDO is probably more important...}
|
||||||
|
|
||||||
{\em Physiological logging } is what \yad recommends for its redo
|
{\em Physiological logging } is what \yad recommends for its REDO
|
||||||
records~\cite{physiological}. The physical address (page number) is
|
records~\cite{physiological}. The physical address (page number) is
|
||||||
stored, but the byte offset and the actual delta are stored implicitly
|
stored, but the byte offset and the actual delta are stored implicitly
|
||||||
in the parameters of the redo or undo function. These parameters allow
|
in the parameters of the REDO or UNDO function. These parameters allow
|
||||||
the function to update the page in a way that preserves application
|
the function to update the page in a way that preserves application
|
||||||
semantics. One common use for this is {\em slotted pages}, which use
|
semantics. One common use for this is {\em slotted pages}, which use
|
||||||
an on-page level of indirection to allow records to be rearranged
|
an on-page level of indirection to allow records to be rearranged
|
||||||
within the page; instead of using the page offset, redo operations use
|
within the page; instead of using the page offset, REDO operations use
|
||||||
the index to locate the data within the page. This allows data within a single
|
the index to locate the data within the page. This allows data within a single
|
||||||
page to be re-arranged at runtime to produce contiguous regions of
|
page to be re-arranged at runtime to produce contiguous regions of
|
||||||
free space. \yad generalizes this model; for example, the parameters
|
free space. \yad generalizes this model; for example, the parameters
|
||||||
|
@ -934,7 +935,7 @@ transaction, $A$, rearranged the layout of a data structure, a second
|
||||||
transaction, $B$, added a value to the rearranged structure, and then
|
transaction, $B$, added a value to the rearranged structure, and then
|
||||||
the first transaction aborted. (Note that the structure is not
|
the first transaction aborted. (Note that the structure is not
|
||||||
isolated.) While applying physical undo information to the altered
|
isolated.) While applying physical undo information to the altered
|
||||||
data structure, $A$ would undo its writes
|
data structure, $A$ would UNDO its writes
|
||||||
without considering the modifications made by
|
without considering the modifications made by
|
||||||
$B$, which is likely to cause corruption. At this point, $B$ would
|
$B$, which is likely to cause corruption. At this point, $B$ would
|
||||||
have to be aborted as well ({\em cascading aborts}).
|
have to be aborted as well ({\em cascading aborts}).
|
||||||
|
@ -959,7 +960,7 @@ three steps:
|
||||||
with deadlock detection is required, this can be done with the lock
|
with deadlock detection is required, this can be done with the lock
|
||||||
manager. Alternatively, this can be done using mutexes for fine-grain isolation.
|
manager. Alternatively, this can be done using mutexes for fine-grain isolation.
|
||||||
\item Define a logical UNDO for each operation (rather than just using
|
\item Define a logical UNDO for each operation (rather than just using
|
||||||
a lower-level physical undo). For example, this is easy for a
|
a lower-level physical UNDO). For example, this is easy for a
|
||||||
hashtable; e.g. the UNDO for an {\em insert} is {\em remove}.
|
hashtable; e.g. the UNDO for an {\em insert} is {\em remove}.
|
||||||
\item For mutating operations (not read-only), add a ``begin nested
|
\item For mutating operations (not read-only), add a ``begin nested
|
||||||
top action'' right after the mutex acquisition, and a ``commit
|
top action'' right after the mutex acquisition, and a ``commit
|
||||||
|
@ -968,7 +969,7 @@ three steps:
|
||||||
This recipe ensures that operations that might span multiple pages
|
This recipe ensures that operations that might span multiple pages
|
||||||
atomically apply and commit any structural changes and thus avoids
|
atomically apply and commit any structural changes and thus avoids
|
||||||
cascading aborts. If the transaction that encloses the operations
|
cascading aborts. If the transaction that encloses the operations
|
||||||
aborts, the logical undo will {\em compensate} for
|
aborts, the logical UNDO will {\em compensate} for
|
||||||
its effects, but leave its structural changes intact. Note that by releasing the mutex before we commit, we are
|
its effects, but leave its structural changes intact. Note that by releasing the mutex before we commit, we are
|
||||||
violating strict two-phase locking in exchange for better performance
|
violating strict two-phase locking in exchange for better performance
|
||||||
and support for deadlock avoidance.
|
and support for deadlock avoidance.
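
The recipe can be sketched in C as follows. This is an illustration under stated assumptions rather than \yad's API: \texttt{begin\_nested\_top\_action} and \texttt{commit\_nested\_top\_action} are stand-in names with toy bodies, the shared structure is a small in-memory set, and the log holds only logical UNDO records. It shows transaction 1 aborting after transaction 2 has modified the same structure, without a cascading abort.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define CAP 8

/* Toy shared structure: an unordered set of ints guarded by one mutex. */
typedef struct {
    pthread_mutex_t mutex;
    int keys[CAP];
    int count;
} set_t;

/* Hypothetical logical UNDO record: "remove this key on abort". */
typedef struct { int xid; int key; } undo_rec_t;

static undo_rec_t undo_log[CAP];
static int undo_len = 0;

/* Stand-ins for the nested top action calls (assumed names). Here the
 * commit only records the logical UNDO for the enclosing transaction. */
static void begin_nested_top_action(void) { }
static void commit_nested_top_action(int xid, int key) {
    undo_log[undo_len++] = (undo_rec_t){ xid, key };
}

/* Caller must hold s->mutex. Removal rearranges the array layout. */
static bool remove_unlocked(set_t *s, int key) {
    for (int i = 0; i < s->count; i++) {
        if (s->keys[i] == key) {
            s->keys[i] = s->keys[--s->count];
            return true;
        }
    }
    return false;
}

static bool set_contains(set_t *s, int key) {
    bool found = false;
    pthread_mutex_lock(&s->mutex);
    for (int i = 0; i < s->count; i++)
        if (s->keys[i] == key) found = true;
    pthread_mutex_unlock(&s->mutex);
    return found;
}

static bool set_insert(set_t *s, int xid, int key) {
    pthread_mutex_lock(&s->mutex);            /* step 1: isolation */
    begin_nested_top_action();                /* step 3: right after acquire */
    bool ok = s->count < CAP && undo_len < CAP;
    if (ok) {
        s->keys[s->count++] = key;
        commit_nested_top_action(xid, key);   /* step 2: logical UNDO logged */
    }
    pthread_mutex_unlock(&s->mutex);          /* before the transaction ends */
    return ok;
}

/* Abort: apply the logical inverse (remove) of each insert made by xid,
 * newest first, leaving other transactions' changes intact. */
static void abort_transaction(set_t *s, int xid) {
    for (int i = undo_len - 1; i >= 0; i--) {
        if (undo_log[i].xid == xid) {
            pthread_mutex_lock(&s->mutex);
            remove_unlocked(s, undo_log[i].key);
            pthread_mutex_unlock(&s->mutex);
        }
    }
}

/* Returns 0 if aborting transaction 1 leaves transaction 2's key in place. */
static int nta_demo(void) {
    set_t s = { .count = 0 };
    undo_len = 0;
    pthread_mutex_init(&s.mutex, NULL);
    set_insert(&s, 1, 10);
    set_insert(&s, 2, 20);
    abort_transaction(&s, 1);
    bool ok = !set_contains(&s, 10) && set_contains(&s, 20) && s.count == 1;
    pthread_mutex_destroy(&s.mutex);
    return ok ? 0 : 1;
}
```

A physical UNDO would instead restore the bytes that transaction 1 overwrote, which in this sketch could erase or duplicate the key added by transaction 2.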
|
||||||
|
@ -991,7 +992,7 @@ changes, such as growing a hash table or array.
|
||||||
%% mechanism described here. If the need arises, we will add support
|
%% mechanism described here. If the need arises, we will add support
|
||||||
%% for nested top actions.}
|
%% for nested top actions.}
|
||||||
%% An operation's wrapper is just a normal function, and therefore may
|
%% An operation's wrapper is just a normal function, and therefore may
|
||||||
%% generate multiple log entries. First, it writes an undo-only entry
|
%% generate multiple log entries. First, it writes an UNDO-only entry
|
||||||
%% to the log. This entry will cause the \emph{logical} inverse of the
|
%% to the log. This entry will cause the \emph{logical} inverse of the
|
||||||
%% current operation to be performed at recovery or abort, must be idempotent,
|
%% current operation to be performed at recovery or abort, must be idempotent,
|
||||||
%% and must fail gracefully if applied to a version of the database that
|
%% and must fail gracefully if applied to a version of the database that
|
||||||
|
@ -1028,15 +1029,15 @@ representations and data structures by defining new operations.
|
||||||
|
|
||||||
There are a number of invariants that these operations must obey:
|
There are a number of invariants that these operations must obey:
|
||||||
\begin{enumerate}
|
\begin{enumerate}
|
||||||
\item Pages should only be updated inside of a redo or undo function.
|
\item Pages should only be updated inside of a REDO or UNDO function.
|
||||||
\item An update to a page atomically updates the LSN by pinning the page.
|
\item An update to a page atomically updates the LSN by pinning the page.
|
||||||
\item If the data read by the wrapper function must match the state of
|
\item If the data read by the wrapper function must match the state of
|
||||||
the page that the redo function sees, then the wrapper should latch
|
the page that the REDO function sees, then the wrapper should latch
|
||||||
the relevant data.
|
the relevant data.
|
||||||
\item Redo operations use page numbers and possibly record numbers
|
\item REDO operations use page numbers and possibly record numbers
|
||||||
while Undo operations use these or logical names/keys
|
while UNDO operations use these or logical names/keys
|
||||||
\item Acquire latches as needed (typically per page or record)
|
\item Acquire latches as needed (typically per page or record)
|
||||||
\item Use nested top actions (which require a logical undo log record)
|
\item Use nested top actions (which require a logical UNDO log record)
|
||||||
or ``big locks'' (which drastically reduce concurrency) for multi-page updates.
|
or ``big locks'' (which drastically reduce concurrency) for multi-page updates.
|
||||||
\end{enumerate}
|
\end{enumerate}
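
As a rough illustration of the first two invariants and of the REDO/UNDO addressing rule, consider the sketch below. It is not the paper's listing: \texttt{inc\_redo}, \texttt{inc\_undo}, \texttt{sketch\_increment} and the page layout are hypothetical, the log is reduced to an LSN counter, and latching is omitted because the example is single-threaded.

```c
#include <assert.h>

#define SLOTS 4

/* Hypothetical page: a few integer slots plus the page LSN. */
typedef struct {
    long lsn;
    int  slot[SLOTS];
} page_t;

/* Operation-specific arguments (assumed layout). */
typedef struct { int slot; int amount; } inc_args_t;

static long next_lsn = 1;  /* toy log: only hands out LSNs */

/* REDO: the only code path that modifies the page. It also stamps the
 * page with the LSN of the entry being applied. */
static void inc_redo(page_t *p, long lsn, const inc_args_t *a) {
    assert(a->slot >= 0 && a->slot < SLOTS);
    p->slot[a->slot] += a->amount;
    p->lsn = lsn;
}

/* UNDO: the logical inverse, expressed by reusing REDO with the
 * negated amount under a fresh LSN. */
static void inc_undo(page_t *p, long lsn, const inc_args_t *a) {
    inc_args_t inverse = { a->slot, -a->amount };
    inc_redo(p, lsn, &inverse);
}

/* Wrapper: obtains an LSN for the (conceptual) log entry, then performs
 * the update by calling REDO instead of touching the page directly. */
static long sketch_increment(page_t *p, int slot, int amount) {
    inc_args_t a = { slot, amount };
    long lsn = next_lsn++;
    inc_redo(p, lsn, &a);
    return lsn;
}

/* Returns 0 if undoing only the first of two increments leaves the
 * second one in place. */
static int inc_demo(void) {
    page_t p = { 0, {0} };
    next_lsn = 1;
    sketch_increment(&p, 2, 5);
    sketch_increment(&p, 2, 3);
    inc_args_t first = { 2, 5 };
    inc_undo(&p, next_lsn++, &first);
    return (p.slot[2] == 3 && p.lsn == 3) ? 0 : 1;
}
```

Routing the wrapper through the REDO function means the code exercised during normal operation is the same code that runs at recovery.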
|
||||||
|
|
||||||
|
@ -1045,7 +1046,7 @@ or ``big locks'' (which drastically reduce concurrency) for multi-page updates.
|
||||||
A common optimization for TPC benchmarks is to provide hand-built
|
A common optimization for TPC benchmarks is to provide hand-built
|
||||||
operations that support adding/subtracting from an account. Such
|
operations that support adding/subtracting from an account. Such
|
||||||
operations improve concurrency since they can be reordered and can be
|
operations improve concurrency since they can be reordered and can be
|
||||||
easily made into nested top actions (since the logical undo is
|
easily made into nested top actions (since the logical UNDO is
|
||||||
trivial). Here we show how increment/decrement map onto \yad operations.
|
trivial). Here we show how increment/decrement map onto \yad operations.
|
||||||
|
|
||||||
First, we define the operation-specific part of the log record:
|
First, we define the operation-specific part of the log record:
|
||||||
|
@ -1109,7 +1110,7 @@ int Tincrement(int xid, recordid rid, int amount) {
|
||||||
\end{small}
|
\end{small}
|
||||||
|
|
||||||
With some examination it is possible to show that this example meets
|
With some examination it is possible to show that this example meets
|
||||||
the invariants. In addition, because the redo code is used for normal
|
the invariants. In addition, because the REDO code is used for normal
|
||||||
operation, most bugs are easy to find with conventional testing
|
operation, most bugs are easy to find with conventional testing
|
||||||
strategies. As future work, there is some hope of verifying these
|
strategies. As future work, there is some hope of verifying these
|
||||||
invariants statically; for example, it is easy to verify that pages
|
invariants statically; for example, it is easy to verify that pages
|
||||||
|
@ -1418,7 +1419,7 @@ single ``header'' page to store the list of intervals and their sizes.
|
||||||
For space efficiency, the array elements themselves are stored using
|
For space efficiency, the array elements themselves are stored using
|
||||||
the fixed-size record page layout. Thus, we use the header page to
|
the fixed-size record page layout. Thus, we use the header page to
|
||||||
find the right interval, and then index into it to get the $(page,
|
find the right interval, and then index into it to get the $(page,
|
||||||
slot)$ address. Once we have this address, the redo/undo entries are
|
slot)$ address. Once we have this address, the REDO/UNDO entries are
|
||||||
trivial: they simply log the before and after image of that
|
trivial: they simply log the before and after image of that
|
||||||
record.
|
record.
|
||||||
|
|
||||||
|
@ -1529,8 +1530,8 @@ We explore a version with finer-grain locking below.
|
||||||
%\item Wrap a mutex around each operation, this can be done with a lock
|
%\item Wrap a mutex around each operation, this can be done with a lock
|
||||||
% manager, or just using pthread mutexes. This provides isolation.
|
% manager, or just using pthread mutexes. This provides isolation.
|
||||||
%\item Define a logical UNDO for each operation (rather than just using
|
%\item Define a logical UNDO for each operation (rather than just using
|
||||||
% the lower-level undo in the transactional array). This is easy for a
|
% the lower-level UNDO in the transactional array). This is easy for a
|
||||||
% hash table; e.g. the undo for an {\em insert} is {\em remove}.
|
% hash table; e.g. the UNDO for an {\em insert} is {\em remove}.
|
||||||
%\item For mutating operations (not read-only), add a ``begin nested
|
%\item For mutating operations (not read-only), add a ``begin nested
|
||||||
% top action'' right after the mutex acquisition, and a ``commit
|
% top action'' right after the mutex acquisition, and a ``commit
|
||||||
% nested top action'' where we release the mutex.
|
% nested top action'' where we release the mutex.
|
||||||
|
@ -1578,7 +1579,7 @@ We explore a version with finer-grain locking below.
|
||||||
This completes our description of \yad's default hashtable
|
This completes our description of \yad's default hashtable
|
||||||
implementation. We would like to emphasize the fact that implementing
|
implementation. We would like to emphasize the fact that implementing
|
||||||
transactional support and concurrency for this data structure is
|
transactional support and concurrency for this data structure is
|
||||||
straightforward. The only complications are a) defining a logical undo, and b) dealing with fixed-length records.
|
straightforward. The only complications are a) defining a logical UNDO, and b) dealing with fixed-length records.
|
||||||
|
|
||||||
%, and (other than requiring the design of a logical
|
%, and (other than requiring the design of a logical
|
||||||
%logging format, and the restrictions imposed by fixed length pages) is
|
%logging format, and the restrictions imposed by fixed length pages) is
|
||||||
|
@ -1601,10 +1602,10 @@ Instead of using nested top actions, the optimized implementation
|
||||||
applies updates in a carefully chosen order that minimizes the extent
|
applies updates in a carefully chosen order that minimizes the extent
|
||||||
to which the on-disk representation of the hash table can be
|
to which the on-disk representation of the hash table can be
|
||||||
corrupted (Figure~\ref{linkedList}). Before beginning updates, it
|
corrupted (Figure~\ref{linkedList}). Before beginning updates, it
|
||||||
writes an undo entry that will check and restore the consistency of
|
writes an UNDO entry that will check and restore the consistency of
|
||||||
the hashtable during recovery, and then invokes the inverse of the
|
the hashtable during recovery, and then invokes the inverse of the
|
||||||
operation that needs to be undone. This recovery scheme does not
|
operation that needs to be undone. This recovery scheme does not
|
||||||
require record-level undo information. Therefore, pre-images of
|
require record-level UNDO information. Therefore, pre-images of
|
||||||
records do not need to be written to log, saving log bandwidth and
|
records do not need to be written to log, saving log bandwidth and
|
||||||
enhancing performance.
|
enhancing performance.
|
||||||
|
|
||||||
|
@ -1890,17 +1891,17 @@ modifications will incur relatively inexpensive log additions,
|
||||||
and are only coalesced into a single modification to the page file
|
and are only coalesced into a single modification to the page file
|
||||||
when the object is flushed from cache.
|
when the object is flushed from cache.
|
||||||
|
|
||||||
\yad provides several options to handle undo records in the context
|
\yad provides several options to handle UNDO records in the context
|
||||||
of object serialization. The first is to use a single transaction for
|
of object serialization. The first is to use a single transaction for
|
||||||
each object modification, avoiding the cost of generating or logging
|
each object modification, avoiding the cost of generating or logging
|
||||||
any undo records. The second option is to assume that the
|
any UNDO records. The second option is to assume that the
|
||||||
application will provide a custom undo for the delta,
|
application will provide a custom UNDO for the delta,
|
||||||
which requires a log entry for each update,
|
which requires a log entry for each update,
|
||||||
but still avoids the need to read or update the page
|
but still avoids the need to read or update the page
|
||||||
file.
|
file.
|
||||||
|
|
||||||
The third option is to relax the atomicity requirements for a set of
|
The third option is to relax the atomicity requirements for a set of
|
||||||
object updates, and again avoid generating any undo records. This
|
object updates, and again avoid generating any UNDO records. This
|
||||||
assumes that the application cannot abort individual updates,
|
assumes that the application cannot abort individual updates,
|
||||||
and is willing to
|
and is willing to
|
||||||
accept that some prefix of logged but uncommitted updates may
|
accept that some prefix of logged but uncommitted updates may
|
||||||
|
@ -2102,7 +2103,7 @@ before presenting an evaluation.
|
||||||
|
|
||||||
\yad's wrapper functions translate high-level (logical) application
|
\yad's wrapper functions translate high-level (logical) application
|
||||||
requests into lower-level (physiological) log entries. These
|
requests into lower-level (physiological) log entries. These
|
||||||
physiological log entries generally include a logical undo
|
physiological log entries generally include a logical UNDO
|
||||||
(Section~\ref{nested-top-actions}) that invokes the logical
|
(Section~\ref{nested-top-actions}) that invokes the logical
|
||||||
inverse of the application request. Since the logical inverse of most
|
inverse of the application request. Since the logical inverse of most
|
||||||
application requests is another application request, we can {\em reuse} our
|
application requests is another application request, we can {\em reuse} our
|
||||||
|
|