1. 21 May, 2019 1 commit
  2. 06 Mar, 2019 1 commit
    • x86/unwind: Handle NULL pointer calls better in frame unwinder · f4f34e1b
      Jann Horn authored
      
      
      When the frame unwinder is invoked for an oops caused by a call to NULL, it
      currently skips the parent function because BP still points to the parent's
      stack frame; the (nonexistent) current function only has the first half of
      a stack frame, and BP doesn't point to it yet.
      
      Add a special case for IP==0 that calculates a fake BP from SP, then uses
      the real BP for the next frame.
      
      Note that this handles first_frame specially: Return information about the
      parent function as long as the saved IP is >=first_frame, even if the fake
      BP points below it.
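
      The fake-BP idea can be illustrated with a small user-space sketch. It is
      not the kernel code: the struct and function names are hypothetical, and
      the stack is an ordinary array in which higher indices stand for higher
      addresses.

```c
#include <stddef.h>

/* Hypothetical, simplified state; not the kernel's struct unwind_state. */
struct sketch_null_state {
	unsigned long *bp;      /* frame pointer used for the current frame */
	unsigned long *next_bp; /* real BP to use for the following frame, or NULL */
};

/*
 * Start an unwind. If ip == 0, the crash was a call to NULL: only the
 * return address has been pushed, so *sp is the return address and the
 * slot just below sp is where the callee would have saved BP.
 */
static void sketch_null_call_start(struct sketch_null_state *st,
				   unsigned long ip, unsigned long *sp,
				   unsigned long *real_bp)
{
	st->bp = real_bp;
	st->next_bp = NULL;
	if (ip == 0) {
		st->next_bp = real_bp; /* keep the parent's frame for later */
		st->bp = sp - 1;       /* fake BP: bp[1] is the return address */
	}
}

/* With frame pointers, the return address sits one word above BP. */
static unsigned long sketch_return_address(const struct sketch_null_state *st)
{
	return st->bp[1];
}
```

      Without the IP==0 branch, the first return address read through the real
      BP belongs to the grandparent, which is how the parent gets skipped.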
      
      With an artificially-added NULL call in prctl_set_seccomp(), before this
      patch, the trace is:
      
      Call Trace:
       ? prctl_set_seccomp+0x3a/0x50
       __x64_sys_prctl+0x457/0x6f0
       ? __ia32_sys_prctl+0x750/0x750
       do_syscall_64+0x72/0x160
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      After this patch, the trace is:
      
      Call Trace:
       prctl_set_seccomp+0x3a/0x50
       __x64_sys_prctl+0x457/0x6f0
       ? __ia32_sys_prctl+0x750/0x750
       do_syscall_64+0x72/0x160
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: syzbot <syzbot+ca95b2b7aef9e7cbd6ab@syzkaller.appspotmail.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Michal Marek <michal.lkml@markovi.net>
      Cc: linux-kbuild@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190301031201.7416-1-jannh@google.com
  3. 10 Oct, 2017 4 commits
    • x86/unwind: Disable unwinder warnings on 32-bit · d4a2d031
      Josh Poimboeuf authored
      
      
      x86-32 doesn't have stack validation, so in most cases it doesn't make
      sense to warn about bad frame pointers.
      Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: LKP <lkp@01.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/a69658760800bf281e6353248c23e0fa0acf5230.1507597785.git.jpoimboe@redhat.com
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/unwind: Align stack pointer in unwinder dump · 99bd28a4
      Josh Poimboeuf authored
      
      
      When printing the unwinder dump, the stack pointer could be unaligned,
      for one of two reasons:
      
      - stack corruption; or
      
      - GCC created an unaligned stack.
      
      There's no way for the unwinder to tell the difference between the two,
      so we have to assume one or the other.  GCC unaligned stacks are very
      rare, and have only been spotted before GCC 5.  Presumably, if we're
      doing an unwinder stack dump, stack corruption is more likely than a
      GCC unaligned stack.  So always align the stack before starting the
      dump.
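
      As a rough illustration, aligning a possibly corrupt stack pointer to the
      word size is a single mask operation. The helper name is hypothetical,
      and rounding up rather than down is an assumption of this sketch; the
      message above does not say which direction is used.

```c
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
static uintptr_t sketch_align_up(uintptr_t addr, uintptr_t align)
{
	return (addr + align - 1) & ~(align - 1);
}
```

      For example, with 4-byte words an unaligned value such as 0xbb3adc3f
      becomes 0xbb3adc40, so every word printed by the dump starts on a word
      boundary.
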
      Reported-and-tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: LKP <lkp@01.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/2f540c515946ab09ed267e1a1d6421202a0cce08.1507597785.git.jpoimboe@redhat.com
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/unwind: Use MSB for frame pointer encoding on 32-bit · 5c99b692
      Josh Poimboeuf authored
      
      
      On x86-32, Tetsuo Handa and Fengguang Wu reported unwinder warnings
      like:
      
        WARNING: kernel stack regs at f60bb9c8 in swapper:1 has bad 'bp' value 0ba00000
      
      And also there were some stack dumps with a bunch of unreliable '?'
      symbols after an apic_timer_interrupt symbol, meaning the unwinder got
      confused when it tried to read the regs.
      
      The cause of those issues is that, with GCC 4.8 (and possibly older),
      there are cases where GCC misaligns the stack pointer in a leaf function
      for no apparent reason:
      
        c124a388 <acpi_rs_move_data>:
        c124a388:       55                      push   %ebp
        c124a389:       89 e5                   mov    %esp,%ebp
        c124a38b:       57                      push   %edi
        c124a38c:       56                      push   %esi
        c124a38d:       89 d6                   mov    %edx,%esi
        c124a38f:       53                      push   %ebx
        c124a390:       31 db                   xor    %ebx,%ebx
        c124a392:       83 ec 03                sub    $0x3,%esp
        ...
        c124a3e3:       83 c4 03                add    $0x3,%esp
        c124a3e6:       5b                      pop    %ebx
        c124a3e7:       5e                      pop    %esi
        c124a3e8:       5f                      pop    %edi
        c124a3e9:       5d                      pop    %ebp
        c124a3ea:       c3                      ret
      
      If an interrupt occurs in such a function, the regs on the stack will be
      unaligned, which breaks the frame pointer encoding assumption.  So on
      32-bit, use the MSB instead of the LSB to encode the regs.
      
      This isn't an issue on 64-bit, because interrupts align the stack before
      writing to it.
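
      The difference between the two encodings can be sketched in user space.
      The helpers below are hypothetical and operate on plain 32-bit values;
      the MSB variant assumes that every kernel stack address has its top bit
      set, which is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* LSB scheme: mark a regs pointer by setting bit 0. */
static uint32_t sketch_encode_lsb(uint32_t regs)
{
	return regs | 1u;
}

static bool sketch_decode_lsb(uint32_t bp, uint32_t *regs)
{
	if (!(bp & 1u))
		return false; /* ordinary frame pointer */
	*regs = bp & ~1u;     /* loses bit 0 of an unaligned regs address */
	return true;
}

/* MSB scheme: mark a regs pointer by clearing bit 31. */
static uint32_t sketch_encode_msb(uint32_t regs)
{
	return regs & ~0x80000000u;
}

static bool sketch_decode_msb(uint32_t bp, uint32_t *regs)
{
	if (bp & 0x80000000u)
		return false; /* ordinary frame pointer */
	*regs = bp | 0x80000000u;
	return true;
}
```

      With an odd regs address, the LSB round trip returns a different
      address, while the MSB round trip preserves all of the low bits.
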
      Reported-and-tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: LKP <lkp@01.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/279a26996a482ca716605c7dbc7f2db9d8d91e81.1507597785.git.jpoimboe@redhat.com
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/unwind: Fix dereference of untrusted pointer · 62dd86ac
      Josh Poimboeuf authored
      Tetsuo Handa and Fengguang Wu reported a panic in the unwinder:
      
        BUG: unable to handle kernel NULL pointer dereference at 000001f2
        IP: update_stack_state+0xd4/0x340
        *pde = 00000000
      
        Oops: 0000 [#1] PREEMPT SMP
        CPU: 0 PID: 18728 Comm: 01-cpu-hotplug Not tainted 4.13.0-rc4-00170-gb09be676 #592
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
        task: bb0b53c0 task.stack: bb3ac000
        EIP: update_stack_state+0xd4/0x340
        EFLAGS: 00010002 CPU: 0
        EAX: 0000a570 EBX: bb3adccb ECX: 0000f401 EDX: 0000a570
        ESI: 00000001 EDI: 000001ba EBP: bb3adc6b ESP: bb3adc3f
         DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
        CR0: 80050033 CR2: 000001f2 CR3: 0b3a7000 CR4: 00140690
        DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
        DR6: fffe0ff0 DR7: 00000400
        Call Trace:
         ? unwind_next_frame+0xea/0x400
         ? __unwind_start+0xf5/0x180
         ? __save_stack_trace+0x81/0x160
         ? save_stack_trace+0x20/0x30
         ? __lock_acquire+0xfa5/0x12f0
         ? lock_acquire+0x1c2/0x230
         ? tick_periodic+0x3a/0xf0
         ? _raw_spin_lock+0x42/0x50
         ? tick_periodic+0x3a/0xf0
         ? tick_periodic+0x3a/0xf0
         ? debug_smp_processor_id+0x12/0x20
         ? tick_handle_periodic+0x23/0xc0
         ? local_apic_timer_interrupt+0x63/0x70
         ? smp_trace_apic_timer_interrupt+0x235/0x6a0
         ? trace_apic_timer_interrupt+0x37/0x3c
         ? strrchr+0x23/0x50
        Code: 0f 95 c1 89 c7 89 45 e4 0f b6 c1 89 c6 89 45 dc 8b 04 85 98 cb 74 bc 88 4d e3 89 45 f0 83 c0 01 84 c9 89 04 b5 98 cb 74 bc 74 3b <8b> 47 38 8b 57 34 c6 43 1d 01 25 00 00 02 00 83 e2 03 09 d0 83
        EIP: update_stack_state+0xd4/0x340 SS:ESP: 0068:bb3adc3f
        CR2: 00000000000001f2
        ---[ end trace 0d147fd4aba8ff50 ]---
        Kernel panic - not syncing: Fatal exception in interrupt
      
      On x86-32, after decoding a frame pointer to get a regs address,
      regs_size() dereferences the regs pointer when it checks regs->cs to see
      if the regs are user mode.  This is dangerous because it's possible that
      what looks like a decoded frame pointer is actually a corrupt value, and
      we don't want the unwinder to make things worse.
      
      Instead of calling regs_size() on an unsafe pointer, just assume they're
      kernel regs to start with.  Later, once it's safe to access the regs, we
      can do the user mode check and corresponding safety check for the
      remaining two regs.
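
      The general pattern of validating a candidate pointer before reading
      through it can be sketched as a bounds check against the known stack
      range. The struct and function names here are hypothetical and only
      loosely modeled on the idea; they are not the kernel's helpers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one stack: the half-open range [begin, end). */
struct sketch_stack_info {
	uintptr_t begin;
	uintptr_t end;
};

/*
 * Return true only if the whole object [addr, addr + len) lies on the
 * stack. The caller must not dereference addr unless this succeeds.
 */
static bool sketch_on_stack(const struct sketch_stack_info *info,
			    uintptr_t addr, size_t len)
{
	return addr >= info->begin && addr < info->end &&
	       len <= info->end - addr; /* written this way to avoid overflow */
}
```

      A corrupt value such as 0x1ba fails the check, so a field like regs->cs
      is never read through it.
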
      Reported-and-tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: LKP <lkp@01.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 5ed8d8bb ("x86/unwind: Move common code into update_stack_state()")
      Link: http://lkml.kernel.org/r/7f95b9a6993dec7674b3f3ab3dcd3294f7b9644d.1507597785.git.jpoimboe@redhat.com
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  4. 10 Aug, 2017 1 commit
  5. 26 Jul, 2017 1 commit
    • x86/unwind: Add the ORC unwinder · ee9f8fce
      Josh Poimboeuf authored
      
      
      Add the new ORC unwinder which is enabled by CONFIG_ORC_UNWINDER=y.
      It plugs into the existing x86 unwinder framework.
      
      It relies on objtool to generate the needed .orc_unwind and
      .orc_unwind_ip sections.
      
      For more details on why ORC is used instead of DWARF, see
      Documentation/x86/orc-unwinder.txt - but the short version is
      that it's a simplified, fundamentally more robust debuginfo
      data structure, which also allows up to two orders of magnitude
      faster lookups than the DWARF unwinder - which matters to
      profiling workloads like perf.
      
      Thanks to Andy Lutomirski for the performance improvement ideas:
      splitting the ORC unwind table into two parallel arrays and creating a
      fast lookup table to search a subset of the unwind table.
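
      The parallel-array layout can be sketched as a sorted array of
      instruction addresses next to an array of unwind entries, searched with
      a binary search. The struct layout and names below are hypothetical
      simplifications, and the fast lookup table that narrows the search range
      is left out.

```c
#include <stddef.h>

/* Hypothetical, reduced unwind entry; the real entry carries more fields. */
struct sketch_orc_entry {
	short sp_offset;
	short bp_offset;
};

/*
 * ips[] is sorted ascending and entries[i] describes all addresses in
 * [ips[i], ips[i + 1]). Return the entry covering ip, or NULL if ip lies
 * below the first address.
 */
static const struct sketch_orc_entry *
sketch_orc_find(const unsigned long *ips,
		const struct sketch_orc_entry *entries,
		size_t n, unsigned long ip)
{
	size_t lo = 0, hi = n;

	if (n == 0 || ip < ips[0])
		return NULL;

	/* Find the first index whose address is greater than ip. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ips[mid] <= ip)
			lo = mid + 1;
		else
			hi = mid;
	}
	return &entries[lo - 1];
}
```

      Keeping the addresses in their own array means the search only touches
      address data; the matching entry is read once, at the end.
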
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: live-patching@vger.kernel.org
      Link: http://lkml.kernel.org/r/0a6cbfb40f8da99b7a45a1a8302dc6aef16ec812.1500938583.git.jpoimboe@redhat.com
      
      
      [ Extended the changelog. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  6. 24 May, 2017 1 commit
    • x86/unwind: Add end-of-stack check for ftrace handlers · 519fb5c3
      Josh Poimboeuf authored
      
      
      Dave Jones and Steven Rostedt reported unwinder warnings like the
      following:
      
        WARNING: kernel stack frame pointer at ffff8800bda0ff30 in sshd:1090 has bad value 000055b32abf1fa8
      
      In both cases, the unwinder was attempting to unwind from an ftrace
      handler into entry code.  The callchain was something like:
      
        syscall entry code
          C function
            ftrace handler
              save_stack_trace()
      
      The problem is that the unwinder's end-of-stack logic gets confused by
      the way ftrace lays out the stack frame (with fentry enabled).
      
      I was able to recreate this warning with:
      
        echo call_usermodehelper_exec_async:stacktrace > /sys/kernel/debug/tracing/set_ftrace_filter
        (exit login session)
      
      I considered fixing this by changing the ftrace code to rewrite the
      stack to make the unwinder happy.  But that seemed too intrusive after I
      implemented it.  Instead, just add another check to the unwinder's
      end-of-stack logic to detect this special case.
      
      Side note: We could probably get rid of these end-of-stack checks by
      encoding the frame pointer for syscall entry just like we do for
      interrupt entry.  That would be simpler, but it would also be a lot more
      intrusive since it would slightly affect the performance of every
      syscall.
      Reported-by: Dave Jones <davej@codemonkey.org.uk>
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: live-patching@vger.kernel.org
      Fixes: c32c47c6 ("x86/unwind: Warn on bad frame pointer")
      Link: http://lkml.kernel.org/r/671ba22fbc0156b8f7e0cfa5ab2a795e08bc37e1.1495553739.git.jpoimboe@redhat.com
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  7. 26 Apr, 2017 2 commits
    • x86/unwind: Dump all stacks in unwind_dump() · 262fa734
      Josh Poimboeuf authored
      
      
      Currently unwind_dump() dumps only the most recently accessed stack.
      But it has a few issues.
      
      In some cases, 'first_sp' can get out of sync with 'stack_info', causing
      unwind_dump() to start from the wrong address, flood the printk buffer,
      and eventually read a bad address.
      
      In other cases, dumping only the most recently accessed stack doesn't
      give enough data to diagnose the error.
      
      Fix both issues by dumping *all* stacks involved in the trace, not just
      the last one.
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 8b5e99f0 ("x86/unwind: Dump stack data on warnings")
      Link: http://lkml.kernel.org/r/016d6a9810d7d1bfc87ef8c0e6ee041c6744c909.1493171120.git.jpoimboe@redhat.com
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/unwind: Silence more entry-code related warnings · b0d50c7b
      Josh Poimboeuf authored
      
      
      Borislav Petkov reported the following unwinder warning:
      
        WARNING: kernel stack regs at ffffc9000024fea8 in udevadm:92 has bad 'bp' value 00007fffc4614d30
        unwind stack type:0 next_sp:          (null) mask:0x6 graph_idx:0
        ffffc9000024fea8: 000055a6100e9b38 (0x55a6100e9b38)
        ffffc9000024feb0: 000055a6100e9b35 (0x55a6100e9b35)
        ffffc9000024feb8: 000055a6100e9f68 (0x55a6100e9f68)
        ffffc9000024fec0: 000055a6100e9f50 (0x55a6100e9f50)
        ffffc9000024fec8: 00007fffc4614d30 (0x7fffc4614d30)
        ffffc9000024fed0: 000055a6100eaf50 (0x55a6100eaf50)
        ffffc9000024fed8: 0000000000000000 ...
        ffffc9000024fee0: 0000000000000100 (0x100)
        ffffc9000024fee8: ffff8801187df488 (0xffff8801187df488)
        ffffc9000024fef0: 00007ffffffff000 (0x7ffffffff000)
        ffffc9000024fef8: 0000000000000000 ...
        ffffc9000024ff10: ffffc9000024fe98 (0xffffc9000024fe98)
        ffffc9000024ff18: 00007fffc4614d00 (0x7fffc4614d00)
        ffffc9000024ff20: ffffffffffffff10 (0xffffffffffffff10)
        ffffc9000024ff28: ffffffff811c6c1f (SyS_newlstat+0xf/0x10)
        ffffc9000024ff30: 0000000000000010 (0x10)
        ffffc9000024ff38: 0000000000000296 (0x296)
        ffffc9000024ff40: ffffc9000024ff50 (0xffffc9000024ff50)
        ffffc9000024ff48: 0000000000000018 (0x18)
        ffffc9000024ff50: ffffffff816b2e6a (entry_SYSCALL_64_fastpath+0x18/0xa8)
        ...
      
      It unwound from an interrupt which came in right after entry code
      called into a C syscall handler, before it had a chance to set up the
      frame pointer, so regs->bp still had its user space value.
      
      Add a check to silence warnings in such a case, where an interrupt
      has occurred and regs->sp is almost at the end of the stack.
      Reported-by: Borislav Petkov <bp@suse.de>
      Tested-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: c32c47c6 ("x86/unwind: Warn on bad frame pointer")
      Link: http://lkml.kernel.org/r/c695f0d0d4c2cfe6542b90e2d0520e11eb901eb5.1493171120.git.jpoimboe@redhat.com
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  8. 19 Apr, 2017 3 commits
  9. 14 Apr, 2017 3 commits
  10. 14 Mar, 2017 1 commit
  11. 08 Mar, 2017 1 commit
    • stacktrace/x86: add function for detecting reliable stack traces · af085d90
      Josh Poimboeuf authored
      
      
      For live patching and possibly other use cases, a stack trace is only
      useful if it can be assured that it's completely reliable.  Add a new
      save_stack_trace_tsk_reliable() function to achieve that.
      
      Note that if the target task isn't the current task, and the target task
      is allowed to run, then it could be writing the stack while the unwinder
      is reading it, resulting in possible corruption.  So the caller of
      save_stack_trace_tsk_reliable() must ensure that the task is either
      'current' or inactive.
      
      save_stack_trace_tsk_reliable() relies on the x86 unwinder's detection
      of pt_regs on the stack.  If the pt_regs are not user-mode registers
      from a syscall, then they indicate an in-kernel interrupt or exception
      (e.g. preemption or a page fault), in which case the stack is considered
      unreliable due to the nature of frame pointers.
      
      It also relies on the x86 unwinder's detection of other issues, such as:
      
      - corrupted stack data
      - stack grows the wrong way
      - stack walk doesn't reach the bottom
      - user didn't provide a large enough entries array
      
      Such issues are reported by checking unwind_error() and !unwind_done().
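
      The "all or nothing" contract can be sketched in user space as a walk
      that fails unless it reaches the bottom of the stack without any error.
      All names are hypothetical: the corrupt flag stands in for whatever makes
      unwind_error() true, running out of room stands in for !unwind_done(),
      and the -1 return value is a placeholder for an error code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical frame record for the sketch. */
struct sketch_rel_frame {
	const struct sketch_rel_frame *next; /* caller's frame, NULL at the bottom */
	unsigned long ret;                   /* return address for this frame */
	bool corrupt;                        /* stands in for an unwind error */
};

/*
 * Store the full trace in entries[] and return 0, or return -1 if the
 * trace cannot be trusted. A partial trace is never reported as success.
 */
static int sketch_save_reliable(const struct sketch_rel_frame *frame,
				unsigned long *entries, size_t max,
				size_t *nr)
{
	*nr = 0;
	for (; frame; frame = frame->next) {
		if (frame->corrupt)
			return -1; /* the unwinder reported an error */
		if (*nr >= max)
			return -1; /* array too small, walk not finished */
		entries[(*nr)++] = frame->ret;
	}
	return 0; /* reached the bottom of the stack */
}
```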
      
      Also add CONFIG_HAVE_RELIABLE_STACKTRACE so arch-independent code can
      determine at build time whether the function is implemented.
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Reviewed-by: Miroslav Benes <mbenes@suse.cz>
      Acked-by: Ingo Molnar <mingo@kernel.org>	# for the x86 changes
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
  12. 02 Mar, 2017 2 commits
  13. 12 Jan, 2017 2 commits
    • x86/unwind: Disable KASAN checks for non-current tasks · 84936118
      Josh Poimboeuf authored
      
      
      There are a handful of callers to save_stack_trace_tsk() and
      show_stack() which try to unwind the stack of a task other than current.
      In such cases, it's remotely possible that the task is running on one
      CPU while the unwinder is reading its stack from another CPU, causing
      the unwinder to see stack corruption.
      
      These cases seem to be mostly harmless.  The unwinder has checks which
      prevent it from following bad pointers beyond the bounds of the stack.
      So it's not really a bug as long as the caller understands that
      unwinding another task will not always succeed.
      
      In such cases, it's possible that the unwinder may read a KASAN-poisoned
      region of the stack.  Account for that by using READ_ONCE_NOCHECK() when
      reading the stack of another task.
      
      Use READ_ONCE() when reading the stack of the current task, since KASAN
      warnings can still be useful for finding bugs in that case.
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Jones <davej@codemonkey.org.uk>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Miroslav Benes <mbenes@suse.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/4c575eb288ba9f73d498dfe0acde2f58674598f1.1483978430.git.jpoimboe@redhat.com
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/unwind: Silence warnings for non-current tasks · 900742d8
      Josh Poimboeuf authored
      
      
      There are a handful of callers to save_stack_trace_tsk() and
      show_stack() which try to unwind the stack of a task other than current.
      In such cases, it's remotely possible that the task is running on one
      CPU while the unwinder is reading its stack from another CPU, causing
      the unwinder to see stack corruption.
      
      These cases seem to be mostly harmless.  The unwinder has checks which
      prevent it from following bad pointers beyond the bounds of the stack.
      So it's not really a bug as long as the caller understands that
      unwinding another task will not always succeed.
      
      Since stack "corruption" on another task's stack isn't necessarily a
      bug, silence the warnings when unwinding tasks other than current.
      Reported-by: Dave Jones <davej@codemonkey.org.uk>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Miroslav Benes <mbenes@suse.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/00d8c50eea3446c1524a2a755397a3966629354c.1483978430.git.jpoimboe@redhat.com
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  14. 23 Dec, 2016 1 commit
    • Revert "x86/unwind: Detect bad stack return address" · c280f773
      Josh Poimboeuf authored
      Revert the following commit:
      
        b6959a36 ("x86/unwind: Detect bad stack return address")
      
      ... because Andrey Konovalov reported an unwinder warning:
      
        WARNING: unrecognized kernel stack return address ffffffffa0000001 at ffff88006377fa18 in a.out:4467
      
      The unwind was initiated from an interrupt which occurred while running in the
      generated code for a kprobe.  The unwinder printed the warning because it
      expected regs->ip to point to a valid text address, but instead it pointed to
      the generated code.
      
      Eventually we may want to come up with a way to identify generated kprobe
      code so the unwinder can know that it's a valid return address.  Until
      then, just remove the warning.
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/02f296848fbf49fb72dfeea706413ecbd9d4caf6.1482418739.git.jpoimboe@redhat.com
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  15. 19 Dec, 2016 2 commits
  16. 28 Oct, 2016 1 commit
  17. 27 Oct, 2016 2 commits
  18. 21 Oct, 2016 2 commits
    • x86/unwind: Create stack frames for saved syscall registers · acb4608a
      Josh Poimboeuf authored
      
      
      The entry code doesn't encode the pt_regs pointer for syscalls.  But the
      pt_regs are always at the same location, so we can add a manual check
      for them.
      
      A later patch prints them as part of the oops stack dump.  They could be
      useful, for example, to determine the arguments to a system call.
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/e176aa9272930cd3f51fda0b94e2eae356677da4.1476973742.git.jpoimboe@redhat.com
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/entry/unwind: Create stack frames for saved interrupt registers · 946c1911
      Josh Poimboeuf authored
      
      
      With frame pointers, when a task is interrupted, its stack is no longer
      completely reliable because the function could have been interrupted
      before it had a chance to save the previous frame pointer on the stack.
      So the caller of the interrupted function could get skipped by a stack
      trace.
      
      This is problematic for live patching, which needs to know whether a
      stack trace of a sleeping task can be relied upon.  There's currently no
      way to detect if a sleeping task was interrupted by a page fault
      exception or preemption before it went to sleep.
      
      Another issue is that when dumping the stack of an interrupted task, the
      unwinder has no way of knowing where the saved pt_regs registers are, so
      it can't print them.
      
      This solves those issues by encoding the pt_regs pointer in the frame
      pointer on entry from an interrupt or an exception.
      
      This patch also updates the unwinder to be able to decode it, because
      otherwise the unwinder would be broken by this change.
      
      Note that this causes a change in the behavior of the unwinder: each
      instance of a pt_regs on the stack is now considered a "frame".  So
      callers of unwind_get_return_address() will now get an occasional
      'regs->ip' address that would have previously been skipped over.
      Suggested-by: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/8b9f84a21e39d249049e0547b559ff8da0df0988.1476973742.git.jpoimboe@redhat.com
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  19. 20 Sep, 2016 1 commit
    • x86/unwind: Add new unwind interface and implementations · 7c7900f8
      Josh Poimboeuf authored
      The x86 stack dump code is a bit of a mess.  dump_trace() uses
      callbacks, and each user of it seems to have slightly different
      requirements, so there are several slightly different callbacks floating
      around.
      
      Also there are some upcoming features which will need more changes to
      the stack dump code, including the printing of stack pt_regs, reliable
      stack detection for live patching, and a DWARF unwinder.  Each of those
      features would at least need more callbacks and/or callback interfaces,
      resulting in a much bigger mess than what we have today.
      
      Before doing all that, we should try to clean things up and replace
      dump_trace() with something cleaner and more flexible.
      
      The new unwinder is a simple state machine which was heavily inspired by
      a suggestion from Andy Lutomirski:
      
        https://lkml.kernel.org/r/CALCETrUbNTqaM2LRyXGRx=kVLRPeY5A3Pc6k4TtQxF320rUT=w@mail.gmail.com
      
      It's also similar to the libunwind API:
      
        http://www.nongnu.org/libunwind/man/libunwind(3).html
      
      Some of its advantages:
      
      - Simplicity: no more callback sprawl and less code duplication.
      
      - Flexibility: it allows the caller to stop and inspect the stack state
        at each step in the unwinding process.
      
      - Modularity: the unwinder code, console stack dump code, and stack
        metadata analysis code are all better separated so that changing one
        of them shouldn't have much of an impact on any of the others.
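
      A callback-free, step-by-step interface of this kind can be sketched in
      user space as follows. The sketch_ names only mirror the shape of a
      start/done/next/get-return-address state machine over a hypothetical
      chain of frame records; they are not the kernel's definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical frame record: link to the caller's frame plus return address. */
struct sketch_frame {
	const struct sketch_frame *next;
	unsigned long ret;
};

struct sketch_unwind_state {
	const struct sketch_frame *frame; /* NULL once the walk is finished */
};

static void sketch_unwind_start(struct sketch_unwind_state *s,
				const struct sketch_frame *first)
{
	s->frame = first;
}

static bool sketch_unwind_done(const struct sketch_unwind_state *s)
{
	return s->frame == NULL;
}

static unsigned long
sketch_unwind_get_return_address(const struct sketch_unwind_state *s)
{
	return s->frame ? s->frame->ret : 0;
}

/* Advance one frame; returns false when there is nothing left to visit. */
static bool sketch_unwind_next_frame(struct sketch_unwind_state *s)
{
	if (!s->frame)
		return false;
	s->frame = s->frame->next;
	return s->frame != NULL;
}

/* Example caller: the loop owns the control flow, so no callbacks are needed. */
static size_t sketch_collect(const struct sketch_frame *first,
			     unsigned long *out, size_t max)
{
	struct sketch_unwind_state s;
	size_t n = 0;

	for (sketch_unwind_start(&s, first);
	     !sketch_unwind_done(&s) && n < max;
	     sketch_unwind_next_frame(&s))
		out[n++] = sketch_unwind_get_return_address(&s);
	return n;
}
```

      Because the caller drives the loop, it can stop early or inspect the
      state between steps.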
      
      Two implementations are added which conform to the new unwind interface:
      
      - The frame pointer unwinder which is used for CONFIG_FRAME_POINTER=y.
      
      - The "guess" unwinder which is used for CONFIG_FRAME_POINTER=n.  This
        isn't an "unwinder" per se.  All it does is scan the stack for kernel
        text addresses.  But with no frame pointers, guesses are better than
        nothing in most cases.
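
      The "guess" approach can be sketched as a linear scan that keeps every
      stack word falling inside the kernel text range. The function names and
      the explicit text_start/text_end parameters are assumptions of this
      sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if v looks like an address inside [text_start, text_end). */
static bool sketch_is_text(unsigned long v, unsigned long text_start,
			   unsigned long text_end)
{
	return v >= text_start && v < text_end;
}

/*
 * Scan every word of the stack and record the ones that look like text
 * addresses. Some hits may be stale values rather than real return
 * addresses, which is why the result is only a guess.
 */
static size_t sketch_guess_scan(const unsigned long *stack, size_t nwords,
				unsigned long text_start,
				unsigned long text_end,
				unsigned long *out, size_t max)
{
	size_t n = 0;

	for (size_t i = 0; i < nwords && n < max; i++) {
		if (sketch_is_text(stack[i], text_start, text_end))
			out[n++] = stack[i];
	}
	return n;
}
```
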
      Suggested-by: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/6dc2f909c47533d213d0505f0a113e64585bec82.1474045023.git.jpoimboe@redhat.com
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>