Commit fbb9ce95 authored by Ingo Molnar's avatar Ingo Molnar Committed by Linus Torvalds
Browse files

[PATCH] lockdep: core

Do 'make oldconfig' and accept all the defaults for new config options -
reboot into the kernel and if everything goes well it should boot up fine and
you should have /proc/lockdep and /proc/lockdep_stats files.

Typically if the lock validator finds some problem it will print out
voluminous debug output that begins with "BUG: ..." and which syslog output
can be used by kernel developers to figure out the precise locking scenario.

What does the lock validator do?  It "observes" and maps all locking rules as
they occur dynamically (as triggered by the kernel's natural use of spinlocks,
rwlocks, mutexes and rwsems).  Whenever the lock validator subsystem detects a
new locking scenario, it validates this new rule against the existing set of
rules.  If this new rule is consistent with the existing set of rules then the
new rule is added transparently and the kernel continues as normal.  If the
new rule could create a deadlock scenario then this condition is printed out.

When determining validity of locking, all possible "deadlock scenarios" are
considered: assuming arbitrary number of CPUs, arbitrary irq context and task
context constellations, running arbitrary combinations of all the existing
locking scenarios.  In a typical system this means millions of separate
scenarios.  This is why we call it a "locking correctness" validator - for all
rules that are observed the lock validator proves it with mathematical
certainty that a deadlock could not occur (assuming that the lock validator
implementation itself is correct and its internal data structures are not
corrupted by some other kernel subsystem).  [see more details and conditionals
of this statement in include/linux/lockdep.h and
Documentation/lockdep-design.txt]

Furthermore, this "all possible scenarios" property of the validator also
enables the finding of complex, highly unlikely multi-CPU multi-context races
via single single-context rules, increasing the likelyhood of finding bugs
drastically.  In practical terms: the lock validator already found a bug in
the upstream kernel that could only occur on systems with 3 or more CPUs, and
which needed 3 very unlikely code sequences to occur at once on the 3 CPUs.
That bug was found and reported on a single-CPU system (!).  So in essence a
race will be found "piecemail-wise", triggering all the necessary components
for the race, without having to reproduce the race scenario itself!  In its
short existence the lock validator found and reported many bugs before they
actually caused a real deadlock.

To further increase the efficiency of the validator, the mapping is not per
"lock instance", but per "lock-class".  For example, all struct inode objects
in the kernel have inode->inotify_mutex.  If there are 10,000 inodes cached,
then there are 10,000 lock objects.  But ->inotify_mutex is a single "lock
type", and all locking activities that occur against ->inotify_mutex are
"unified" into this single lock-class.  The advantage of the lock-class
approach is that all historical ->inotify_mutex uses are mapped into a single
(and as narrow as possible) set of locking rules - regardless of how many
different tasks or inode structures it took to build this set of rules.  The
set of rules persist during the lifetime of the kernel.

To see the rough magnitude of checking that the lock validator does, here's a
portion of /proc/lockdep_stats, fresh after bootup:

 lock-classes:                            694 [max: 2048]
 direct dependencies:                  1598 [max: 8192]
 indirect dependencies:               17896
 all direct dependencies:             16206
 dependency chains:                    1910 [max: 8192]
 in-hardirq chains:                      17
 in-softirq chains:                     105
 in-process chains:                    1065
 stack-trace entries:                 38761 [max: 131072]
 combined max dependencies:         2033928
 hardirq-safe locks:                     24
 hardirq-unsafe locks:                  176
 softirq-safe locks:                     53
 softirq-unsafe locks:                  137
 irq-safe locks:                         59
 irq-unsafe locks:                      176

The lock validator has observed 1598 actual single-thread locking patterns,
and has validated all possible 2033928 distinct locking scenarios.

More details about the design of the lock validator can be found in
Documentation/lockdep-design.txt, which can also found at:

   http://redhat.com/~mingo/lockdep-patches/lockdep-design.txt



[bunk@stusta.de: cleanups]
Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
Signed-off-by: default avatarArjan van de Ven <arjan@linux.intel.com>
Signed-off-by: default avatarAdrian Bunk <bunk@stusta.de>
Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
parent cae2ed9a
......@@ -3,6 +3,7 @@
#include <linux/preempt.h>
#include <linux/smp_lock.h>
#include <linux/lockdep.h>
#include <asm/hardirq.h>
#include <asm/system.h>
......@@ -122,7 +123,7 @@ static inline void account_system_vtime(struct task_struct *tsk)
*/
extern void irq_exit(void);
#define nmi_enter() irq_enter()
#define nmi_exit() __irq_exit()
#define nmi_enter() do { lockdep_off(); irq_enter(); } while (0)
#define nmi_exit() do { __irq_exit(); lockdep_on(); } while (0)
#endif /* LINUX_HARDIRQ_H */
......@@ -4,6 +4,7 @@
#include <linux/file.h>
#include <linux/rcupdate.h>
#include <linux/irqflags.h>
#include <linux/lockdep.h>
#define INIT_FDTABLE \
{ \
......@@ -126,6 +127,7 @@ extern struct group_info init_groups;
.fs_excl = ATOMIC_INIT(0), \
.pi_lock = SPIN_LOCK_UNLOCKED, \
INIT_TRACE_IRQFLAGS \
INIT_LOCKDEP \
}
......
/*
* Runtime locking correctness validator
*
* Copyright (C) 2006 Red Hat, Inc., Ingo Molnar <mingo@redhat.com>
*
* see Documentation/lockdep-design.txt for more details.
*/
#ifndef __LINUX_LOCKDEP_H
#define __LINUX_LOCKDEP_H
#include <linux/linkage.h>
#include <linux/list.h>
#include <linux/debug_locks.h>
#include <linux/stacktrace.h>
#ifdef CONFIG_LOCKDEP
/*
* Lock-class usage-state bits:
*/
enum lock_usage_bit
{
LOCK_USED = 0,
LOCK_USED_IN_HARDIRQ,
LOCK_USED_IN_SOFTIRQ,
LOCK_ENABLED_SOFTIRQS,
LOCK_ENABLED_HARDIRQS,
LOCK_USED_IN_HARDIRQ_READ,
LOCK_USED_IN_SOFTIRQ_READ,
LOCK_ENABLED_SOFTIRQS_READ,
LOCK_ENABLED_HARDIRQS_READ,
LOCK_USAGE_STATES
};
/*
* Usage-state bitmasks:
*/
#define LOCKF_USED (1 << LOCK_USED)
#define LOCKF_USED_IN_HARDIRQ (1 << LOCK_USED_IN_HARDIRQ)
#define LOCKF_USED_IN_SOFTIRQ (1 << LOCK_USED_IN_SOFTIRQ)
#define LOCKF_ENABLED_HARDIRQS (1 << LOCK_ENABLED_HARDIRQS)
#define LOCKF_ENABLED_SOFTIRQS (1 << LOCK_ENABLED_SOFTIRQS)
#define LOCKF_ENABLED_IRQS (LOCKF_ENABLED_HARDIRQS | LOCKF_ENABLED_SOFTIRQS)
#define LOCKF_USED_IN_IRQ (LOCKF_USED_IN_HARDIRQ | LOCKF_USED_IN_SOFTIRQ)
#define LOCKF_USED_IN_HARDIRQ_READ (1 << LOCK_USED_IN_HARDIRQ_READ)
#define LOCKF_USED_IN_SOFTIRQ_READ (1 << LOCK_USED_IN_SOFTIRQ_READ)
#define LOCKF_ENABLED_HARDIRQS_READ (1 << LOCK_ENABLED_HARDIRQS_READ)
#define LOCKF_ENABLED_SOFTIRQS_READ (1 << LOCK_ENABLED_SOFTIRQS_READ)
#define LOCKF_ENABLED_IRQS_READ \
(LOCKF_ENABLED_HARDIRQS_READ | LOCKF_ENABLED_SOFTIRQS_READ)
#define LOCKF_USED_IN_IRQ_READ \
(LOCKF_USED_IN_HARDIRQ_READ | LOCKF_USED_IN_SOFTIRQ_READ)
#define MAX_LOCKDEP_SUBCLASSES 8UL
/*
* Lock-classes are keyed via unique addresses, by embedding the
* lockclass-key into the kernel (or module) .data section. (For
* static locks we use the lock address itself as the key.)
*/
struct lockdep_subclass_key {
char __one_byte;
} __attribute__ ((__packed__));
struct lock_class_key {
struct lockdep_subclass_key subkeys[MAX_LOCKDEP_SUBCLASSES];
};
/*
* The lock-class itself:
*/
struct lock_class {
/*
* class-hash:
*/
struct list_head hash_entry;
/*
* global list of all lock-classes:
*/
struct list_head lock_entry;
struct lockdep_subclass_key *key;
unsigned int subclass;
/*
* IRQ/softirq usage tracking bits:
*/
unsigned long usage_mask;
struct stack_trace usage_traces[LOCK_USAGE_STATES];
/*
* These fields represent a directed graph of lock dependencies,
* to every node we attach a list of "forward" and a list of
* "backward" graph nodes.
*/
struct list_head locks_after, locks_before;
/*
* Generation counter, when doing certain classes of graph walking,
* to ensure that we check one node only once:
*/
unsigned int version;
/*
* Statistics counter:
*/
unsigned long ops;
const char *name;
int name_version;
};
/*
* Map the lock object (the lock instance) to the lock-class object.
* This is embedded into specific lock instances:
*/
struct lockdep_map {
struct lock_class_key *key;
struct lock_class *class[MAX_LOCKDEP_SUBCLASSES];
const char *name;
};
/*
* Every lock has a list of other locks that were taken after it.
* We only grow the list, never remove from it:
*/
struct lock_list {
struct list_head entry;
struct lock_class *class;
struct stack_trace trace;
};
/*
* We record lock dependency chains, so that we can cache them:
*/
struct lock_chain {
struct list_head entry;
u64 chain_key;
};
struct held_lock {
/*
* One-way hash of the dependency chain up to this point. We
* hash the hashes step by step as the dependency chain grows.
*
* We use it for dependency-caching and we skip detection
* passes and dependency-updates if there is a cache-hit, so
* it is absolutely critical for 100% coverage of the validator
* to have a unique key value for every unique dependency path
* that can occur in the system, to make a unique hash value
* as likely as possible - hence the 64-bit width.
*
* The task struct holds the current hash value (initialized
* with zero), here we store the previous hash value:
*/
u64 prev_chain_key;
struct lock_class *class;
unsigned long acquire_ip;
struct lockdep_map *instance;
/*
* The lock-stack is unified in that the lock chains of interrupt
* contexts nest ontop of process context chains, but we 'separate'
* the hashes by starting with 0 if we cross into an interrupt
* context, and we also keep do not add cross-context lock
* dependencies - the lock usage graph walking covers that area
* anyway, and we'd just unnecessarily increase the number of
* dependencies otherwise. [Note: hardirq and softirq contexts
* are separated from each other too.]
*
* The following field is used to detect when we cross into an
* interrupt context:
*/
int irq_context;
int trylock;
int read;
int check;
int hardirqs_off;
};
/*
* Initialization, self-test and debugging-output methods:
*/
extern void lockdep_init(void);
extern void lockdep_info(void);
extern void lockdep_reset(void);
extern void lockdep_reset_lock(struct lockdep_map *lock);
extern void lockdep_free_key_range(void *start, unsigned long size);
extern void lockdep_off(void);
extern void lockdep_on(void);
extern int lockdep_internal(void);
/*
* These methods are used by specific locking variants (spinlocks,
* rwlocks, mutexes and rwsems) to pass init/acquire/release events
* to lockdep:
*/
extern void lockdep_init_map(struct lockdep_map *lock, const char *name,
struct lock_class_key *key);
/*
* Reinitialize a lock key - for cases where there is special locking or
* special initialization of locks so that the validator gets the scope
* of dependencies wrong: they are either too broad (they need a class-split)
* or they are too narrow (they suffer from a false class-split):
*/
#define lockdep_set_class(lock, key) \
lockdep_init_map(&(lock)->dep_map, #key, key)
#define lockdep_set_class_and_name(lock, key, name) \
lockdep_init_map(&(lock)->dep_map, name, key)
/*
* Acquire a lock.
*
* Values for "read":
*
* 0: exclusive (write) acquire
* 1: read-acquire (no recursion allowed)
* 2: read-acquire with same-instance recursion allowed
*
* Values for check:
*
* 0: disabled
* 1: simple checks (freeing, held-at-exit-time, etc.)
* 2: full validation
*/
extern void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
int trylock, int read, int check, unsigned long ip);
extern void lock_release(struct lockdep_map *lock, int nested,
unsigned long ip);
# define INIT_LOCKDEP .lockdep_recursion = 0,
#else /* !LOCKDEP */
static inline void lockdep_off(void)
{
}
static inline void lockdep_on(void)
{
}
static inline int lockdep_internal(void)
{
return 0;
}
# define lock_acquire(l, s, t, r, c, i) do { } while (0)
# define lock_release(l, n, i) do { } while (0)
# define lockdep_init() do { } while (0)
# define lockdep_info() do { } while (0)
# define lockdep_init_map(lock, name, key) do { (void)(key); } while (0)
# define lockdep_set_class(lock, key) do { (void)(key); } while (0)
# define lockdep_set_class_and_name(lock, key, name) \
do { (void)(key); } while (0)
# define INIT_LOCKDEP
# define lockdep_reset() do { debug_locks = 1; } while (0)
# define lockdep_free_key_range(start, size) do { } while (0)
/*
* The class key takes no space if lockdep is disabled:
*/
struct lock_class_key { };
#endif /* !LOCKDEP */
#ifdef CONFIG_TRACE_IRQFLAGS
extern void early_boot_irqs_off(void);
extern void early_boot_irqs_on(void);
#else
# define early_boot_irqs_off() do { } while (0)
# define early_boot_irqs_on() do { } while (0)
#endif
/*
* For trivial one-depth nesting of a lock-class, the following
* global define can be used. (Subsystems with multiple levels
* of nesting should define their own lock-nesting subclasses.)
*/
#define SINGLE_DEPTH_NESTING 1
/*
* Map the dependency ops to NOP or to real lockdep ops, depending
* on the per lock-class debug mode:
*/
#ifdef CONFIG_DEBUG_LOCK_ALLOC
# ifdef CONFIG_PROVE_LOCKING
# define spin_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 2, i)
# else
# define spin_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 1, i)
# endif
# define spin_release(l, n, i) lock_release(l, n, i)
#else
# define spin_acquire(l, s, t, i) do { } while (0)
# define spin_release(l, n, i) do { } while (0)
#endif
#ifdef CONFIG_DEBUG_LOCK_ALLOC
# ifdef CONFIG_PROVE_LOCKING
# define rwlock_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 2, i)
# define rwlock_acquire_read(l, s, t, i) lock_acquire(l, s, t, 2, 2, i)
# else
# define rwlock_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 1, i)
# define rwlock_acquire_read(l, s, t, i) lock_acquire(l, s, t, 2, 1, i)
# endif
# define rwlock_release(l, n, i) lock_release(l, n, i)
#else
# define rwlock_acquire(l, s, t, i) do { } while (0)
# define rwlock_acquire_read(l, s, t, i) do { } while (0)
# define rwlock_release(l, n, i) do { } while (0)
#endif
#ifdef CONFIG_DEBUG_LOCK_ALLOC
# ifdef CONFIG_PROVE_LOCKING
# define mutex_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 2, i)
# else
# define mutex_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 1, i)
# endif
# define mutex_release(l, n, i) lock_release(l, n, i)
#else
# define mutex_acquire(l, s, t, i) do { } while (0)
# define mutex_release(l, n, i) do { } while (0)
#endif
#ifdef CONFIG_DEBUG_LOCK_ALLOC
# ifdef CONFIG_PROVE_LOCKING
# define rwsem_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 2, i)
# define rwsem_acquire_read(l, s, t, i) lock_acquire(l, s, t, 1, 2, i)
# else
# define rwsem_acquire(l, s, t, i) lock_acquire(l, s, t, 0, 1, i)
# define rwsem_acquire_read(l, s, t, i) lock_acquire(l, s, t, 1, 1, i)
# endif
# define rwsem_release(l, n, i) lock_release(l, n, i)
#else
# define rwsem_acquire(l, s, t, i) do { } while (0)
# define rwsem_acquire_read(l, s, t, i) do { } while (0)
# define rwsem_release(l, n, i) do { } while (0)
#endif
#endif /* __LINUX_LOCKDEP_H */
......@@ -886,6 +886,13 @@ struct task_struct {
int hardirq_context;
int softirq_context;
#endif
#ifdef CONFIG_LOCKDEP
# define MAX_LOCK_DEPTH 30UL
u64 curr_chain_key;
int lockdep_depth;
struct held_lock held_locks[MAX_LOCK_DEPTH];
unsigned int lockdep_recursion;
#endif
/* journalling filesystem info */
void *journal_info;
......
......@@ -48,6 +48,7 @@
#include <linux/unwind.h>
#include <linux/buffer_head.h>
#include <linux/debug_locks.h>
#include <linux/lockdep.h>
#include <asm/io.h>
#include <asm/bugs.h>
......@@ -457,6 +458,15 @@ asmlinkage void __init start_kernel(void)
smp_setup_processor_id();
/*
* Need to run as early as possible, to initialize the
* lockdep hash:
*/
lockdep_init();
local_irq_disable();
early_boot_irqs_off();
/*
* Interrupts are still disabled. Do necessary setups, then
* enable them
......@@ -502,6 +512,7 @@ asmlinkage void __init start_kernel(void)
profile_init();
if (!irqs_disabled())
printk("start_kernel(): bug: interrupts were enabled early\n");
early_boot_irqs_on();
local_irq_enable();
/*
......@@ -512,6 +523,9 @@ asmlinkage void __init start_kernel(void)
console_init();
if (panic_later)
panic(panic_later, panic_param);
lockdep_info();
/*
* Need to run this when irqs are enabled, because it wants
* to self-test [hard/soft]-irqs on/off lock inversion bugs
......
......@@ -13,6 +13,7 @@ obj-y = sched.o fork.o exec_domain.o panic.o printk.o profile.o \
obj-$(CONFIG_STACKTRACE) += stacktrace.o
obj-y += time/
obj-$(CONFIG_DEBUG_MUTEXES) += mutex-debug.o
obj-$(CONFIG_LOCKDEP) += lockdep.o
obj-$(CONFIG_FUTEX) += futex.o
ifeq ($(CONFIG_COMPAT),y)
obj-$(CONFIG_FUTEX) += futex_compat.o
......
......@@ -1061,6 +1061,11 @@ static task_t *copy_process(unsigned long clone_flags,
p->hardirq_context = 0;
p->softirq_context = 0;
#endif
#ifdef CONFIG_LOCKDEP
p->lockdep_depth = 0; /* no locks held yet */
p->curr_chain_key = 0;
p->lockdep_recursion = 0;
#endif
rt_mutex_init_task(p);
......
......@@ -410,6 +410,12 @@ int request_irq(unsigned int irq,
struct irqaction *action;
int retval;
#ifdef CONFIG_LOCKDEP
/*
* Lockdep wants atomic interrupt handlers:
*/
irqflags |= SA_INTERRUPT;
#endif
/*
* Sanity-check: shared interrupts must pass in a real dev-ID,
* otherwise we'll have trouble later trying to figure out
......
This diff is collapsed.
/*
* kernel/lockdep_internals.h
*
* Runtime locking correctness validator
*
* lockdep subsystem internal functions and variables.
*/
/*
* MAX_LOCKDEP_ENTRIES is the maximum number of lock dependencies
* we track.
*
* We use the per-lock dependency maps in two ways: we grow it by adding
* every to-be-taken lock to all currently held lock's own dependency
* table (if it's not there yet), and we check it for lock order
* conflicts and deadlocks.
*/
#define MAX_LOCKDEP_ENTRIES 8192UL
#define MAX_LOCKDEP_KEYS_BITS 11
#define MAX_LOCKDEP_KEYS (1UL << MAX_LOCKDEP_KEYS_BITS)
#define MAX_LOCKDEP_CHAINS_BITS 13
#define MAX_LOCKDEP_CHAINS (1UL << MAX_LOCKDEP_CHAINS_BITS)
/*
* Stack-trace: tightly packed array of stack backtrace
* addresses. Protected by the hash_lock.
*/
#define MAX_STACK_TRACE_ENTRIES 131072UL
extern struct list_head all_lock_classes;
extern void
get_usage_chars(struct lock_class *class, char *c1, char *c2, char *c3, char *c4);
extern const char * __get_key_name(struct lockdep_subclass_key *key, char *str);
extern unsigned long nr_lock_classes;
extern unsigned long nr_list_entries;
extern unsigned long nr_lock_chains;
extern unsigned long nr_stack_trace_entries;
extern unsigned int nr_hardirq_chains;
extern unsigned int nr_softirq_chains;
extern unsigned int nr_process_chains;
extern unsigned int max_lockdep_depth;
extern unsigned int max_recursion_depth;
#ifdef CONFIG_DEBUG_LOCKDEP
/*
* Various lockdep statistics:
*/
extern atomic_t chain_lookup_hits;
extern atomic_t chain_lookup_misses;
extern atomic_t hardirqs_on_events;
extern atomic_t hardirqs_off_events;
extern atomic_t redundant_hardirqs_on;
extern atomic_t redundant_hardirqs_off;
extern atomic_t softirqs_on_events;
extern atomic_t softirqs_off_events;
extern atomic_t redundant_softirqs_on;
extern atomic_t redundant_softirqs_off;
extern atomic_t nr_unused_locks;
extern atomic_t nr_cyclic_checks;
extern atomic_t nr_cyclic_check_recursions;
extern atomic_t nr_find_usage_forwards_checks;
extern atomic_t nr_find_usage_forwards_recursions;
extern atomic_t nr_find_usage_backwards_checks;
extern atomic_t nr_find_usage_backwards_recursions;
# define debug_atomic_inc(ptr) atomic_inc(ptr)
# define debug_atomic_dec(ptr) atomic_dec(ptr)
# define debug_atomic_read(ptr) atomic_read(ptr)
#else
# define debug_atomic_inc(ptr) do { } while (0)
# define debug_atomic_dec(ptr) do { } while (0)
# define debug_atomic_read(ptr) 0
#endif
......@@ -1121,6 +1121,9 @@ static void free_module(struct module *mod)
if (mod->percpu)
percpu_modfree(mod->percpu);
/* Free lock-classes: */
lockdep_free_key_range(mod->module_core, mod->core_size);
/* Finally, free the core (containing the module structure) */
module_free(mod, mod->module_core);
}
......
......@@ -48,7 +48,7 @@ config DEBUG_KERNEL
config LOG_BUF_SHIFT
int "Kernel log buffer size (16 => 64KB, 17 => 128KB)" if DEBUG_KERNEL
range 12 21
default 17 if S390
default 17 if S390 || LOCKDEP
default 16 if X86_NUMAQ || IA64
default 15 if SMP
default 14
......
......@@ -15,6 +15,7 @@
#include <linux/sched.h>
#include <linux/delay.h>
#include <linux/module.h>
#include <linux/lockdep.h>
#include <linux/spinlock.h>
#include <linux/kallsyms.h>
#include <linux/interrupt.h>
......@@ -889,9 +890,6 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
#include "locking-selftest-softirq.h"
// GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion2_soft)
#define lockdep_reset()
#define lockdep_reset_lock(x)
#ifdef CONFIG_DEBUG_LOCK_ALLOC
# define I_SPINLOCK(x) lockdep_reset_lock(&lock_##x.dep_map)
# define I_RWLOCK(x) lockdep_reset_lock(&rwlock_##x.dep_map)
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment