libdb/docs/gsg_txn/JAVA/architectrecovery.html
2012-11-14 16:35:20 -05:00

367 lines
19 KiB
HTML
Raw Permalink Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Designing Your Application for Recovery</title>
<link rel="stylesheet" href="gettingStarted.css" type="text/css" />
<meta name="generator" content="DocBook XSL Stylesheets V1.73.2" />
<link rel="start" href="index.html" title="Getting Started with Berkeley DB Transaction Processing" />
<link rel="up" href="filemanagement.html" title="Chapter 5. Managing DB Files" />
<link rel="prev" href="recovery.html" title="Recovery Procedures" />
<link rel="next" href="hotfailover.html" title="Using Hot Failovers" />
</head>
<body>
<div xmlns="" class="navheader">
<div class="libver">
<p>Library Version 11.2.5.3</p>
</div>
<table width="100%" summary="Navigation header">
<tr>
<th colspan="3" align="center">Designing Your Application for Recovery</th>
</tr>
<tr>
<td width="20%" align="left"><a accesskey="p" href="recovery.html">Prev</a> </td>
<th width="60%" align="center">Chapter 5. Managing DB Files</th>
<td width="20%" align="right"> <a accesskey="n" href="hotfailover.html">Next</a></td>
</tr>
</table>
<hr />
</div>
<div class="sect1" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h2 class="title" style="clear: both"><a id="architectrecovery"></a>Designing Your Application for Recovery</h2>
</div>
</div>
</div>
<div class="toc">
<dl>
<dt>
<span class="sect2">
<a href="architectrecovery.html#multithreadrecovery">Recovery for Multi-Threaded Applications</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="architectrecovery.html#multiprocessrecovery">Recovery in Multi-Process Applications</a>
</span>
</dt>
</dl>
</div>
<p>
When building your DB application, you should consider how you will run recovery. If you are building a
single threaded, single process application, it is fairly simple to run recovery when your application first
opens its environment. In this case, you need only decide if you want to run recovery every time you open
your application (recommended) or only some of the time, presumably triggered by a start up option
controlled by your application's user.
</p>
<p>
However, for multi-threaded and multi-process applications, you need to carefully consider how you will
design your application's startup code so as to run recovery only when it makes sense to do so.
</p>
<div class="sect2" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h3 class="title"><a id="multithreadrecovery"></a>Recovery for Multi-Threaded Applications</h3>
</div>
</div>
</div>
<p>
If your application uses only one environment handle, then handling recovery for a multi-threaded
application is no more difficult than for a single threaded application. You simply open the environment
in the application's main thread, and then pass that handle to each of the threads that will be
performing DB operations. We illustrate this with our final example in this book (see
<a class="xref" href="txnexample_java.html" title="Base API Transaction Example">Base API Transaction Example</a>
for more information).
</p>
<p>
Alternatively, you can have each worker thread open its own environment handle. However, in this case,
designing for recovery is a bit more complicated.
</p>
<p>
Generally, when a thread performing database operations fails
or hangs, it is frequently best to simply
restart the application and run recovery upon application
startup as normal. However, not all applications can afford
to restart because a single thread has misbehaved.
</p>
<p>
If you are attempting to continue operations in the face of a misbehaving thread,
then at a minimum recovery must be run if a thread performing database operations fails or hangs.
</p>
<p>
Remember that recovery clears the environment of all
outstanding locks, including any that might be outstanding
from an aborted thread. If these locks are not cleared,
other threads performing database operations can back up
behind the locks obtained but never cleared by the failed
thread. The result will be an application that hangs
indefinitely.
</p>
<p>
To run recovery under these circumstances:
</p>
<div class="orderedlist">
<ol type="1">
<li>
<p>
Suspend or shutdown all other threads performing
database operations.
</p>
</li>
<li>
<p>
Discarding any open environment handles. Note that
attempting to gracefully close these handles may be
asking for trouble; the close can fail if the
environment is already in need of recovery. For
this reason, it is best and easiest to simply discard the handle.
</p>
</li>
<li>
<p>
Open new handles, running recovery as you open
them.
See <a class="xref" href="recovery.html#normalrecovery" title="Normal Recovery">Normal Recovery</a> for more information.
</p>
</li>
<li>
<p>
Restart all your database threads.
</p>
</li>
</ol>
</div>
<p>
A traditional way to handle this activity is to spawn a watcher thread that is responsible for making
sure all is well with your threads, and performing the above actions if not.
</p>
<p>
However, in the case where each worker thread opens and maintains its own environment handle, recovery
is complicated for two reasons:
</p>
<div class="orderedlist">
<ol type="1">
<li>
<p>
For some applications and workloads, it might be
worthwhile to give your database threads the
ability to gracefully finalize any on-going
transactions. If this is the case, your
code must be capable of signaling each thread
to halt DB activities and close its
environment. If you simply run recovery against the
environment, your database threads will
detect this and fail in the midst of performing their
database operations.
</p>
</li>
<li>
<p>
Your code must be capable of ensuring only one
thread runs recovery before allowing all other
threads to open their respective environment
handles. Recovery should be single threaded because when
recovery is run against an environment, it is
deleted and then recreated. This will cause all
other processes and threads to "fail" when they
attempt operations against the newly recovered
environment. If all threads run recovery
when they start up, then it is likely that some
threads will fail because the environment that they
are using has been recovered. This will cause the thread to have to re-execute its own recovery
path. At best, this is inefficient and at worst it could cause your application to fall into an
endless recovery pattern.
</p>
</li>
</ol>
</div>
</div>
<div class="sect2" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h3 class="title"><a id="multiprocessrecovery"></a>Recovery in Multi-Process Applications</h3>
</div>
</div>
</div>
<p>
Frequently, DB applications use multiple processes to interact with the databases. For example, you may
have a long-running process, such as some kind of server, and then a series of administrative tools that
you use to inspect and administer the underlying databases. Or, in some web-based architectures, different
services are run as independent processes that are managed by the server.
</p>
<p>
In any case, recovery for a multi-process environment is complicated for two reasons:
</p>
<div class="orderedlist">
<ol type="1">
<li>
<p>
In the event that recovery must be run, you might
want to notify processes interacting with the environment
that recovery is about to occur and give them a
chance to gracefully terminate. Whether it is
worthwhile for you to do this is entirely dependent
upon the nature of your application. Some
long-running applications with multiple processes
performing meaningful work might want to do this.
Other applications with processes performing database
operations that are likely to be harmed by error conditions in other
processes will likely find it to be not worth the
effort. For this latter group, the chances of
performing a graceful shutdown may be low anyway.
</p>
</li>
<li>
<p>
Unlike single process scenarios, it can quickly become wasteful for every process interacting
with the databases to run recovery when it starts up. This is partly because recovery
<span class="emphasis"><em>does</em></span> take some amount of time to run, but mostly you want to
avoid a situation where your server must
reopen all its environment handles just because you fire up a command line database
administrative utility that always runs recovery.
</p>
</li>
</ol>
</div>
<p>
The following sections describe a mechanism that you can use to determine if and when you should run
recovery in a multi-process application.
</p>
<div class="sect3" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h4 class="title"><a id="mp_recover_effects"></a>Effects of Multi-Process Recovery</h4>
</div>
</div>
</div>
<p>
Before continuing, it is worth noting that the following sections describe recovery processes than
can result in one process running recovery while other processes are currently actively performing
database operations.
</p>
<p>
When this happens, the current database operation will
abnormally fail, indicating a DB_RUNRECOVERY condition.
This means that your application should immediately abandon any database operations that it may have
on-going, discard any environment handles it has opened, and obtain and open new handles.
</p>
<p>
The net effect of this is that any writes performed by unresolved transactions will be lost.
For persistent applications (servers, for example), the services it provides will also be
unavailable for the amount of time that it takes to complete a recovery and for all participating
processes to reopen their environment handles.
</p>
</div>
<div class="sect3" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h4 class="title"><a id="db_register"></a>Process Registration</h4>
</div>
</div>
</div>
<p>
One way to handle multi-process recovery is for every process to "register" its environment. In
doing so, the process gains the ability to see if any other applications are using the
environment and, if so, whether they have suffered an abnormal termination. If an abnormal
termination is detected, the process runs recovery; otherwise, it does not.
</p>
<p>
Note that using process registration also ensures that
recovery is serialized across applications. That is,
only one process at a time has a chance to run
recovery. Generally this means that the first process
to start up will run recovery, and all other processes
will silently not run recovery because it is not
needed.
</p>
<p>
To cause your application to register its environment, you specify
<span>
<code class="literal">true</code> to the <code class="methodname">EnvironmentConfig.setRegister()</code>
method when you open your environment. You may also specify
<code class="literal">true</code> to <code class="methodname">EnvironmentConfig.setRunRecovery()</code>.
However, it is an error to specify <code class="literal">true</code> to
<code class="methodname">EnvironmentConfig.setRunFatalRecovery()</code> when
you are also registering your environment with the <code class="methodname">setRegister()</code>
method.
</span>
If during the open, DB determines that recovery must be run, it will automatically run the correct
type of recovery for you, so long as you specify normal recovery
on your environment open. If you do not specify normal recovery, and you register your environment,
then no recovery is run if the registration process identifies a need for it. In this case,
the environment open simply fails by
<span>throwing <code class="classname">RunRecoveryException</code>.</span>
</p>
<div class="note" style="margin-left: 0.5in; margin-right: 0.5in;">
<h3 class="title">Note</h3>
<p>
If you do not specify normal recovery when you open your first registered environment
in the application, then that application will fail the environment open by
<span>throwing <code class="classname">RunRecoveryException</code>.</span>
This is because the first process to register must create an internal
registration file, and recovery is forced when that file is created. To
avoid an abnormal termination of the environment open, specify recovery on
the environment open for at least the first process starting in your
application.
</p>
</div>
<p>
Be aware that there are some limitations/requirements if you want your various processes to
coordinate recovery using registration:
</p>
<div class="orderedlist">
<ol type="1">
<li>
<p>
There can be only one environment handle per
environment per process. In the case of multi-threaded
processes, the environment handle must be shared across threads.
</p>
</li>
<li>
<p>
All processes sharing the environment must use registration. If registration is
not uniformly used across all participating processes, then you can see inconsistent results
in terms of your application's ability to recognize that recovery must be run.
</p>
</li>
</ol>
</div>
</div>
</div>
</div>
<div class="navfooter">
<hr />
<table width="100%" summary="Navigation footer">
<tr>
<td width="40%" align="left"><a accesskey="p" href="recovery.html">Prev</a> </td>
<td width="20%" align="center">
<a accesskey="u" href="filemanagement.html">Up</a>
</td>
<td width="40%" align="right"> <a accesskey="n" href="hotfailover.html">Next</a></td>
</tr>
<tr>
<td width="40%" align="left" valign="top">Recovery Procedures </td>
<td width="20%" align="center">
<a accesskey="h" href="index.html">Home</a>
</td>
<td width="40%" align="right" valign="top"> Using Hot Failovers</td>
</tr>
</table>
</div>
</body>
</html>