    sched: Fix cpu_clock() in NMIs, on !CONFIG_HAVE_UNSTABLE_SCHED_CLOCK · b9f8fcd5
    David Miller authored
    Relax stable-sched-clock architectures to not save/disable/restore
    hardirqs in cpu_clock().
    The background is that I was trying to resolve a sparc64 perf
    issue when I discovered this problem.
    On sparc64 I implement pseudo NMIs by simply running the kernel
    at IRQ level 14 when local_irq_disable() is called; this allows
    performance counter events to still come in at IRQ level 15.
    This doesn't work if any code in an NMI handler does
    local_irq_save() or local_irq_disable() since the "disable" will
    kick us back to cpu IRQ level 14 thus letting NMIs back in and
    we recurse.
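    The recursion can be illustrated with a small userspace model.  This
    is only a sketch, not sparc64 kernel code: every name below is
    hypothetical, and it assumes the NMI handler runs with the CPU at IRQ
    level 15, as the description above implies.

```c
#include <assert.h>

/* Hypothetical model of the IRQ levels described above; not kernel code. */
enum { LVL_IRQS_ON = 0, LVL_IRQS_OFF = 14, LVL_NMI = 15 };

static int model_cpu_level = LVL_IRQS_ON;   /* current CPU IRQ level */

/* An interrupt is delivered only if its level is above the CPU's level. */
static int model_irq_can_fire(int level)
{
    assert(level >= 1 && level <= 15);
    return level > model_cpu_level;
}

/* Stand-in for local_irq_disable(): always sets the level to 14. */
static void model_local_irq_disable(void)
{
    model_cpu_level = LVL_IRQS_OFF;
}

/* Assumption: while the pseudo-NMI handler runs, the level is 15. */
static void model_enter_nmi(void)
{
    model_cpu_level = LVL_NMI;
}

/*
 * Returns 1 if another level-15 event could interrupt the handler,
 * 0 otherwise.
 */
static int model_nmi_handler_reentrant(int handler_disables_irqs)
{
    model_enter_nmi();
    if (handler_disables_irqs)
        model_local_irq_disable();      /* drops the level from 15 to 14 */
    return model_irq_can_fire(LVL_NMI);
}
```

    In this model, model_nmi_handler_reentrant(0) returns 0, while
    model_nmi_handler_reentrant(1) returns 1: the "disable" lowers the
    level to 14 and level-15 events can come in again.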
    The only code that does that in the perf event IRQ handling
    path is the code supporting frequency based events.  It
    uses cpu_clock().
    cpu_clock() simply invokes sched_clock() with IRQs disabled.
    And that's a fundamental bug all on its own, particularly for
    the HAVE_UNSTABLE_SCHED_CLOCK case.  NMIs can thus get into the
    sched_clock() code interrupting the local IRQ disable code
    sections of it.
    Furthermore, for the not-HAVE_UNSTABLE_SCHED_CLOCK case, the IRQ
    disabling done by cpu_clock() is just pure overhead and
    completely unnecessary.
    So the core problem is that sched_clock() is not NMI safe, but
    we are invoking it from NMI contexts in the perf events code
    (via cpu_clock()).
    A less important issue is the overhead of IRQ disabling when it
    isn't necessary in cpu_clock().
    CONFIG_HAVE_UNSTABLE_SCHED_CLOCK architectures are not
    affected by this patch.
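    The overhead point for the stable-clock case can be sketched the same
    way.  This is a userspace illustration under stated assumptions, not
    the patch itself: the clock, the counter, and both function names are
    hypothetical stand-ins for sched_clock(), local_irq_save()/
    local_irq_restore(), and the two shapes of cpu_clock().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of this is the kernel's implementation. */
static int model_irq_save_calls;     /* counts simulated save/restore pairs */
static uint64_t model_ns;            /* fake monotonically increasing clock */

/* Stand-in for a stable sched_clock(): advances 1000 ns per call. */
static uint64_t model_sched_clock(void)
{
    model_ns += 1000;
    return model_ns;
}

/* Shape described above: read the clock with IRQs disabled around it. */
static uint64_t model_cpu_clock_irqsave(int cpu)
{
    uint64_t now;

    assert(cpu >= 0);
    model_irq_save_calls++;          /* stands in for local_irq_save() */
    now = model_sched_clock();
    /* local_irq_restore() would go here */
    return now;
}

/* Relaxed shape: with a stable clock, just read it directly. */
static uint64_t model_cpu_clock_relaxed(int cpu)
{
    assert(cpu >= 0);
    return model_sched_clock();
}
```

    Both variants return the same kind of timestamp; the relaxed one never
    touches the IRQ state, so it adds no save/restore cost and cannot
    lower the IRQ level when called from an NMI handler.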
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Mike Galbraith <efault@gmx.de>
    LKML-Reference: <20091213.182502.215092085.davem@davemloft.net>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>