Commit 9f244e9c authored by Seiji Aguchi, committed by Tony Luck

pstore: Avoid deadlock in panic and emergency-restart path

[Issue]

In the panic and emergency-restart paths, pstore can deadlock because it
takes psinfo->buf_lock with a plain, blocking spin_lock.

An example scenario in which pstore may hang in the panic path:

 - cpuA grabs psinfo->buf_lock
 - cpuB panics and calls smp_send_stop
 - smp_send_stop sends IRQ to cpuA
 - after 1 second, cpuB gives up on cpuA and sends an NMI instead
 - cpuA is now in an NMI handler while still holding buf_lock
 - cpuB is deadlocked

This can happen if firmware has a bug and cpuA is stuck
talking to it for more than one second.

A similar scenario exists in the emergency-restart path:

 - cpuA grabs psinfo->buf_lock and gets stuck in firmware
 - cpuB triggers an emergency restart via either sysrq-b or the hangcheck
   timer, then deadlocks trying to take psinfo->buf_lock
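
As a rough userspace illustration of both scenarios (not kernel code), the sketch below models psinfo->buf_lock with a C11 atomic flag. The names buf_lock, spin_lock_bounded(), cpu_a_takes_lock_and_stalls() and cpu_b_panic_path() are hypothetical, and the retry budget exists only so the demonstration terminates; a real spin_lock() has no such budget and would spin forever.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of psinfo->buf_lock: flag set means "held". */
static atomic_flag buf_lock = ATOMIC_FLAG_INIT;

/*
 * Model of a blocking spin_lock() with an artificial retry budget so the
 * demonstration terminates. A real spin_lock() would keep spinning.
 */
static bool spin_lock_bounded(atomic_flag *lock, unsigned long budget)
{
	while (budget--) {
		/* test_and_set returns the old value: false means acquired. */
		if (!atomic_flag_test_and_set(lock))
			return true;
	}
	return false;	/* owner never released the lock */
}

/* cpuA takes the lock and then never runs again (stopped or stuck). */
static void cpu_a_takes_lock_and_stalls(void)
{
	(void)atomic_flag_test_and_set(&buf_lock);
}

/* cpuB, in the panic or emergency-restart path, wants the same lock. */
static bool cpu_b_panic_path(void)
{
	return spin_lock_bounded(&buf_lock, 1000000UL);
}
```

Calling cpu_a_takes_lock_and_stalls() and then cpu_b_panic_path() exhausts the budget without acquiring the lock, which corresponds to cpuB hanging in the scenarios above.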

[Solution]

This patch avoids the deadlocks in both the panic and emergency_restart
paths by introducing a function, pstore_cannot_block_path(), that checks
whether a cpu is allowed to block in the current path.

With this patch, pstore no longer blocks in those paths even if another
cpu holds the lock, because they now use spin_trylock_irqsave instead of
spin_lock_irqsave.
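
The locking policy described here can be sketched in userspace roughly as follows. This is a minimal model under stated assumptions, not the kernel implementation: struct toy_lock, the dump_reason values, cannot_block(), dump_lock() and dump_unlock() are hypothetical stand-ins for spinlock_t, enum kmsg_dump_reason and the pstore code, and the in_nmi argument replaces the kernel's in_nmi() check.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's kmsg_dump_reason values. */
enum dump_reason { DUMP_OOPS, DUMP_PANIC, DUMP_EMERG };

/* Toy spinlock built on a C11 atomic flag; not the kernel's spinlock_t. */
struct toy_lock { atomic_flag held; };

static bool toy_trylock(struct toy_lock *l)
{
	/* test_and_set returns the old value: false means we acquired it. */
	return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void toy_lock_acquire(struct toy_lock *l)
{
	while (!toy_trylock(l))
		;	/* spin until the owner releases the lock */
}

static void toy_unlock(struct toy_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* True when the caller must not wait for the lock. */
static bool cannot_block(enum dump_reason reason, bool in_nmi)
{
	if (in_nmi)
		return true;
	switch (reason) {
	case DUMP_PANIC:
	case DUMP_EMERG:
		return true;
	default:
		return false;
	}
}

/* Returns whether the lock was taken; never spins on non-blocking paths. */
static bool dump_lock(struct toy_lock *l, enum dump_reason reason, bool in_nmi)
{
	if (cannot_block(reason, in_nmi))
		return toy_trylock(l);
	toy_lock_acquire(l);
	return true;
}

/* Release only what dump_lock() actually acquired. */
static void dump_unlock(struct toy_lock *l, bool is_locked)
{
	if (is_locked)
		toy_unlock(l);
}
```

In this model a caller in a non-blocking path proceeds with the dump even when dump_lock() returns false, which is why the record may be corrupted in that case, and it must pass the returned value to dump_unlock() so that it never releases a lock held by another cpu.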

In addition, according to the comment on emergency_restart() in kernel/sys.c,
spin_lock shouldn't be taken in the emergency_restart path, to avoid
deadlock. This patch conforms to that comment, quoted below.

<snip>
/**
 *      emergency_restart - reboot the system
 *
 *      Without shutting down any hardware or taking any locks
 *      reboot the system.  This is called when we know we are in
 *      trouble so this is our best effort to reboot.  This is
 *      safe to call in interrupt context.
 */
void emergency_restart(void)
<snip>
Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
parent d1c3ed66
@@ -96,6 +96,27 @@ static const char *get_reason_str(enum kmsg_dump_reason reason)
 	}
 }
 
+bool pstore_cannot_block_path(enum kmsg_dump_reason reason)
+{
+	/*
+	 * In case of NMI path, pstore shouldn't be blocked
+	 * regardless of reason.
+	 */
+	if (in_nmi())
+		return true;
+
+	switch (reason) {
+	/* In panic case, other cpus are stopped by smp_send_stop(). */
+	case KMSG_DUMP_PANIC:
+	/* Emergency restart shouldn't be blocked by spin lock. */
+	case KMSG_DUMP_EMERG:
+		return true;
+	default:
+		return false;
+	}
+}
+EXPORT_SYMBOL_GPL(pstore_cannot_block_path);
+
 /*
  * callback from kmsg_dump. (s2,l2) has the most recently
  * written bytes, older bytes are in (s1,l1). Save as much
@@ -114,10 +135,12 @@ static void pstore_dump(struct kmsg_dumper *dumper,
 
 	why = get_reason_str(reason);
 
-	if (in_nmi()) {
-		is_locked = spin_trylock(&psinfo->buf_lock);
-		if (!is_locked)
-			pr_err("pstore dump routine blocked in NMI, may corrupt error record\n");
+	if (pstore_cannot_block_path(reason)) {
+		is_locked = spin_trylock_irqsave(&psinfo->buf_lock, flags);
+		if (!is_locked) {
+			pr_err("pstore dump routine blocked in %s path, may corrupt error record\n"
+				, in_nmi() ? "NMI" : why);
+		}
 	} else
 		spin_lock_irqsave(&psinfo->buf_lock, flags);
 	oopscount++;
@@ -143,9 +166,9 @@ static void pstore_dump(struct kmsg_dumper *dumper,
 		total += hsize + len;
 		part++;
 	}
-	if (in_nmi()) {
+	if (pstore_cannot_block_path(reason)) {
 		if (is_locked)
-			spin_unlock(&psinfo->buf_lock);
+			spin_unlock_irqrestore(&psinfo->buf_lock, flags);
 	} else
 		spin_unlock_irqrestore(&psinfo->buf_lock, flags);
 }
......
@@ -68,12 +68,18 @@ struct pstore_info {
 
 #ifdef CONFIG_PSTORE
 extern int pstore_register(struct pstore_info *);
+extern bool pstore_cannot_block_path(enum kmsg_dump_reason reason);
 #else
 static inline int
 pstore_register(struct pstore_info *psi)
 {
 	return -ENODEV;
 }
+static inline bool
+pstore_cannot_block_path(enum kmsg_dump_reason reason)
+{
+	return false;
+}
 #endif
 
 #endif /*_LINUX_PSTORE_H*/