libdb/docs/programmer_reference/lock_deaddbg.html
2012-11-14 16:35:20 -05:00

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Deadlock debugging</title>
<link rel="stylesheet" href="gettingStarted.css" type="text/css" />
<meta name="generator" content="DocBook XSL Stylesheets V1.73.2" />
<link rel="start" href="index.html" title="Berkeley DB Programmer's Reference Guide" />
<link rel="up" href="lock.html" title="Chapter 16.  The Locking Subsystem" />
<link rel="prev" href="lock_timeout.html" title="Deadlock detection using timers" />
<link rel="next" href="lock_page.html" title="Locking granularity" />
</head>
<body>
<div xmlns="" class="navheader">
<div class="libver">
<p>Library Version 11.2.5.3</p>
</div>
<table width="100%" summary="Navigation header">
<tr>
<th colspan="3" align="center">Deadlock debugging</th>
</tr>
<tr>
<td width="20%" align="left"><a accesskey="p" href="lock_timeout.html">Prev</a> </td>
<th width="60%" align="center">Chapter 16. 
The Locking Subsystem
</th>
<td width="20%" align="right"> <a accesskey="n" href="lock_page.html">Next</a></td>
</tr>
</table>
<hr />
</div>
<div class="sect1" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h2 class="title" style="clear: both"><a id="lock_deaddbg"></a>Deadlock debugging</h2>
</div>
</div>
</div>
<p>An occasional debugging problem in Berkeley DB applications is unresolvable
deadlock. The output of the <a href="../api_reference/C/db_stat.html" class="olink">db_stat</a> utility, run with the <span class="bold"><strong>-Co</strong></span> flags, can be used to detect and debug these problems. The following
is a typical example of the output of this utility:</p>
<pre class="programlisting">Locks grouped by object
Locker   Mode      Count Status  ----------- Object ----------
       1 READ          1 HELD    a.db                handle   0
80000004 WRITE         1 HELD    a.db                page     3</pre>
<p>In this example, we have opened a database and stored a single key/data
pair in it. Because we have a database handle open, we hold a read lock
on that database handle. The database handle lock is the read lock
labeled <span class="emphasis"><em>handle</em></span>. (We can normally ignore handle locks for
the purposes of database debugging, because they only conflict with
other handle operations. For example, an attempt to remove the database
will block because we are holding the handle locked, but reading and
writing the database will not conflict with the handle lock.)</p>
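<p>As a rough illustration of ignoring handle locks when reading this output, the following Python sketch parses the sample listing and keeps only the non-handle entries. The names <code>LockEntry</code> and <code>parse_locks</code> are hypothetical, and the parser assumes the simple seven-field line layout and the HELD and WAIT statuses shown in this section; real db_stat output may contain other statuses or object formats that this sketch does not handle.</p>

```python
from typing import NamedTuple


class LockEntry(NamedTuple):
    """One lock line from the listing (hypothetical helper type)."""
    locker: str
    mode: str
    ref_count: int
    status: str
    file: str
    obj_type: str
    obj_id: str


def parse_locks(text: str) -> list[LockEntry]:
    """Parse lock lines, skipping the two header lines and anything unrecognized."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Assumption: a lock line has seven fields and a HELD or WAIT status.
        if len(fields) != 7 or fields[3] not in ("HELD", "WAIT"):
            continue
        locker, mode, count, status, file, obj_type, obj_id = fields
        entries.append(
            LockEntry(locker, mode, int(count), status, file, obj_type, obj_id)
        )
    return entries


SAMPLE = """\
Locks grouped by object
Locker Mode Count Status ----------- Object ----------
1 READ 1 HELD a.db handle 0
80000004 WRITE 1 HELD a.db page 3
"""

# Handle locks rarely matter for deadlock debugging, so drop them.
page_locks = [e for e in parse_locks(SAMPLE) if e.obj_type != "handle"]
for e in page_locks:
    print(e.locker, e.mode, e.ref_count, e.status, e.file, e.obj_type, e.obj_id)
```

<p>Splitting on whitespace means the sketch works whether or not the columns are padded for alignment.</p>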
<p>It is important to note that locker IDs are 32-bit unsigned integers,
shown here in hexadecimal, and are divided into two name spaces. Locker IDs with the high bit set
(that is, hexadecimal values 80000000 or higher) are associated with
transactions. Locker IDs without the high bit set are
not associated with a transaction. Locker IDs associated with
transactions map one-to-one with the transaction; that is, a transaction
never has more than a single locker ID, and all of the locks acquired
by the transaction are acquired on behalf of the same locker ID.</p>
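<p>The high-bit rule can be checked mechanically. The following Python sketch is illustrative only: the names <code>TXN_LOCKER_BIT</code> and <code>is_transactional</code> are hypothetical, and it assumes the Locker column is a hexadecimal string, as the values in this section suggest.</p>

```python
TXN_LOCKER_BIT = 0x80000000  # high bit of a 32-bit unsigned locker ID


def is_transactional(locker_hex: str) -> bool:
    """Return True if the locker ID (a hex string) has the high bit set."""
    locker_id = int(locker_hex, 16)
    return locker_id >= TXN_LOCKER_BIT


for locker in ("1", "3", "80000004"):
    kind = "transactional" if is_transactional(locker) else "non-transactional"
    print(locker, kind)
```

<p>For a 32-bit unsigned value, comparing against 80000000 (hexadecimal) is equivalent to testing the high bit.</p>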
<p>We also hold a write lock on the database page where we stored the new
key/data pair. The page lock is labeled <span class="emphasis"><em>page</em></span> and is on page
number 3. If we were to put an additional key/data pair in the
database, we would see the following output:</p>
<pre class="programlisting">Locks grouped by object
Locker   Mode      Count Status  ----------- Object ----------
80000004 WRITE         2 HELD    a.db                page     3
       1 READ          1 HELD    a.db                handle   0</pre>
<p>That is, the reference count on our existing lock on page number 3 has risen to 2, but
we have not acquired any new locks. If we add an entry to a different page
in the database, we acquire an additional lock:</p>
<pre class="programlisting">Locks grouped by object
Locker   Mode      Count Status  ----------- Object ----------
       1 READ          1 HELD    a.db                handle   0
80000004 WRITE         2 HELD    a.db                page     3
80000004 WRITE         1 HELD    a.db                page     2</pre>
<p>Here's a simple example of one lock blocking another one:</p>
<pre class="programlisting">Locks grouped by object
Locker   Mode      Count Status  ----------- Object ----------
80000004 WRITE         1 HELD    a.db                page     2
80000005 WRITE         1 WAIT    a.db                page     2
       1 READ          1 HELD    a.db                handle   0
80000004 READ          1 HELD    a.db                page     1</pre>
<p>In this example, there are two different transactional lockers (80000004 and
80000005). Locker 80000004 is holding a write lock on page 2, and
locker 80000005 is waiting for a write lock on page 2. This is not a
deadlock, because locker 80000004 is not blocked on anything.
Presumably, the thread of control using locker 80000004 will proceed
and eventually release its write lock on page 2, at which point the thread
of control using locker 80000005 can also proceed, acquiring a write
lock on page 2.</p>
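<p>The reasoning above (find each waiter, find who holds the object it wants, then check whether the holder is itself waiting) can be sketched in Python. This is a simplified illustration, not how Berkeley DB itself tracks waiters: the function name <code>waits_for</code> and the tuple layout are hypothetical, and every holder of an object is treated as a blocker without considering lock-mode compatibility.</p>

```python
# Each tuple is (locker, mode, status, object), copied from the listing above.
LOCKS = [
    ("80000004", "WRITE", "HELD", "a.db page 2"),
    ("80000005", "WRITE", "WAIT", "a.db page 2"),
    ("1", "READ", "HELD", "a.db handle 0"),
    ("80000004", "READ", "HELD", "a.db page 1"),
]


def waits_for(locks):
    """Map each waiting locker to the set of lockers holding the object it wants."""
    holders = {}
    for locker, _mode, status, obj in locks:
        if status == "HELD":
            holders.setdefault(obj, set()).add(locker)
    graph = {}
    for locker, _mode, status, obj in locks:
        if status == "WAIT":
            # Simplification: every other holder of the object counts as a blocker.
            blockers = holders.get(obj, set()) - {locker}
            graph.setdefault(locker, set()).update(blockers)
    return graph


graph = waits_for(LOCKS)
for waiter, blockers in graph.items():
    for blocker in sorted(blockers):
        state = "is also waiting" if blocker in graph else "is not waiting"
        print(waiter, "waits for", blocker, "which", state)
```

<p>Because the only blocker is not itself waiting on anything, there is no cycle and therefore no deadlock in this listing.</p>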
<p>If lockers 80000004 and 80000005 were in the same thread of
control, the result would be <span class="emphasis"><em>self deadlock</em></span>. Self deadlock
is not a true deadlock, and won't be detected by the Berkeley DB deadlock
detector. It's not a true deadlock because, if work could continue to
be done on behalf of locker 80000004, then the lock would eventually be
released, and locker 80000005 could acquire the lock and itself proceed.
The key element is that the thread of control holding the lock
cannot proceed because it is the same thread that is blocked waiting on the
lock.</p>
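<p>If an application keeps its own record of which thread of control uses which locker, a self deadlock can be spotted by comparing the two sides of each wait. The Python sketch below assumes such a record exists; the mapping, the thread names, and the function <code>self_deadlocks</code> are all hypothetical and are not taken from the lock listing.</p>

```python
def self_deadlocks(wait_edges, thread_of):
    """Return (waiter, holder) pairs whose lockers run in the same thread of control."""
    found = []
    for waiter, holder in wait_edges:
        waiter_thread = thread_of.get(waiter)
        # Lockers with no recorded thread are skipped rather than guessed at.
        if waiter_thread is not None and waiter_thread == thread_of.get(holder):
            found.append((waiter, holder))
    return found


# Locker 80000005 waits for locker 80000004, as in the listing above.
WAIT_EDGES = [("80000005", "80000004")]

# Hypothetical application-side records of which thread uses which locker.
SAME_THREAD = {"80000004": "worker-1", "80000005": "worker-1"}
DIFFERENT_THREADS = {"80000004": "worker-1", "80000005": "worker-2"}

print(self_deadlocks(WAIT_EDGES, SAME_THREAD))
print(self_deadlocks(WAIT_EDGES, DIFFERENT_THREADS))
```

<p>Only the first mapping yields a pair, because only there is the waiting thread also the thread that would have to release the lock.</p>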
<p>Here's an example of three transactions reaching true deadlock. First,
three different threads of control opened the database, acquiring three
database handle read locks.</p>
<pre class="programlisting">Locks grouped by object
Locker   Mode      Count Status  ----------- Object ----------
       1 READ          1 HELD    a.db                handle   0
       3 READ          1 HELD    a.db                handle   0
       5 READ          1 HELD    a.db                handle   0</pre>
<p>The three threads then each began a transaction, and put a key/data pair
on a different page:</p>
<pre class="programlisting">Locks grouped by object
Locker   Mode      Count Status  ----------- Object ----------
80000008 WRITE         1 HELD    a.db                page     4
       1 READ          1 HELD    a.db                handle   0
       3 READ          1 HELD    a.db                handle   0
       5 READ          1 HELD    a.db                handle   0
80000006 READ          1 HELD    a.db                page     1
80000007 READ          1 HELD    a.db                page     1
80000008 READ          1 HELD    a.db                page     1
80000006 WRITE         1 HELD    a.db                page     2
80000007 WRITE         1 HELD    a.db                page     3</pre>
<p>The thread using locker 80000006 put a new key/data pair on page 2, the
thread using locker 80000007 on page 3, and the thread using locker
80000008 on page 4. Because the database is a 2-level Btree, the tree
was searched, and so each transaction acquired a read lock on the Btree
root page (page 1) as part of this operation.</p>
<p>The three threads then each attempted to put a second key/data pair on
a page currently locked by another thread. The thread using locker
80000006 tried to put a key/data pair on page 3, the thread using locker
80000007 on page 4, and the thread using locker 80000008 on page 2:</p>
<pre class="programlisting">Locks grouped by object
Locker   Mode      Count Status  ----------- Object ----------
80000008 WRITE         1 HELD    a.db                page     4
80000007 WRITE         1 WAIT    a.db                page     4
       1 READ          1 HELD    a.db                handle   0
       3 READ          1 HELD    a.db                handle   0
       5 READ          1 HELD    a.db                handle   0
80000006 READ          2 HELD    a.db                page     1
80000007 READ          2 HELD    a.db                page     1
80000008 READ          2 HELD    a.db                page     1
80000006 WRITE         1 HELD    a.db                page     2
80000008 WRITE         1 WAIT    a.db                page     2
80000007 WRITE         1 HELD    a.db                page     3
80000006 WRITE         1 WAIT    a.db                page     3</pre>
<p>Now, each of the threads of control is blocked, waiting on a different
thread of control. The thread using locker 80000007 is blocked by
the thread using locker 80000008, due to the lock on page 4.
The thread using locker 80000008 is blocked by
the thread using locker 80000006, due to the lock on page 2.
And the thread using locker 80000006 is blocked by
the thread using locker 80000007, due to the lock on page 3.
Since none of the threads of control can make
progress, one of them will have to give up its locks (in practice, by
aborting its transaction) in order to resolve the deadlock.</p>
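<p>The circular wait described above can be confirmed by following the "is blocked by" relationships until a locker repeats. The Python sketch below is a minimal illustration under the assumption that each locker waits on exactly one other locker, as in this example; it is not the algorithm used by the Berkeley DB deadlock detector, and the names <code>WAITS_FOR</code> and <code>find_cycle</code> are hypothetical.</p>

```python
# waiter -> holder, transcribed from the explanation above.
WAITS_FOR = {
    "80000007": "80000008",  # blocked on page 4
    "80000008": "80000006",  # blocked on page 2
    "80000006": "80000007",  # blocked on page 3
}


def find_cycle(waits_for):
    """Return the lockers forming a cycle, in waits-for order, or None if there is none."""
    for start in waits_for:
        path = []
        seen = set()
        current = start
        # Follow the chain until it ends or revisits a locker.
        while current in waits_for and current not in seen:
            seen.add(current)
            path.append(current)
            current = waits_for[current]
        if current in seen:
            # The cycle begins where the revisited locker first appeared.
            return path[path.index(current):]
    return None


print(find_cycle(WAITS_FOR))
print(find_cycle({"80000005": "80000004"}))
```

<p>The first call reports all three lockers as members of one cycle, which is the true deadlock; the second, taken from the earlier two-locker example, finds no cycle because locker 80000004 is not waiting on anything.</p>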
</div>
<div class="navfooter">
<hr />
<table width="100%" summary="Navigation footer">
<tr>
<td width="40%" align="left"><a accesskey="p" href="lock_timeout.html">Prev</a> </td>
<td width="20%" align="center">
<a accesskey="u" href="lock.html">Up</a>
</td>
<td width="40%" align="right"> <a accesskey="n" href="lock_page.html">Next</a></td>
</tr>
<tr>
<td width="40%" align="left" valign="top">Deadlock detection using timers </td>
<td width="20%" align="center">
<a accesskey="h" href="index.html">Home</a>
</td>
<td width="40%" align="right" valign="top"> Locking granularity</td>
</tr>
</table>
</div>
</body>
</html>