  1. Dec 15, 2020
  2. Dec 11, 2020
  3. Dec 02, 2020
  4. Nov 30, 2020
  5. Nov 27, 2020
  6. Nov 24, 2020
  7. Nov 23, 2020
  8. Nov 19, 2020
  9. Nov 17, 2020
    • seccomp: Set PF_SUPERPRIV when checking capability · fb14528e
      Mickaël Salaün authored
      
      Replace the use of security_capable(current_cred(), ...) with
      ns_capable_noaudit(), which sets PF_SUPERPRIV.
      
      Since commit 98f368e9 ("kernel: Add noaudit variant of
      ns_capable()"), a new ns_capable_noaudit() helper is available.  Let's
      use it!
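
The difference between a bare capability check and a helper that also records privilege use can be sketched in userspace C. This is only a model under stated assumptions: `demo_cap_task`, `DEMO_PF_SUPERPRIV`, `demo_bare_check()` and `demo_helper_check()` are hypothetical names, and the flag value is arbitrary; none of this is the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_PF_SUPERPRIV 0x1u /* hypothetical bit, not the kernel's value */

struct demo_cap_task {
    unsigned int flags; /* per-task flags */
    bool has_cap;       /* whether the capability is granted */
};

/* Bare check: answers the question but leaves no trace in the task flags. */
static bool demo_bare_check(const struct demo_cap_task *t)
{
    return t->has_cap;
}

/* Helper-style check: on success, also records that privilege was used. */
static bool demo_helper_check(struct demo_cap_task *t)
{
    if (!t->has_cap)
        return false;
    t->flags |= DEMO_PF_SUPERPRIV;
    return true;
}

/* Small self-check showing the observable difference. */
static void demo_cap_selfcheck(void)
{
    struct demo_cap_task t = { 0u, true };

    assert(demo_bare_check(&t) && t.flags == 0u);
    assert(demo_helper_check(&t) && (t.flags & DEMO_PF_SUPERPRIV));
}
```

In this model both checks return the same answer; only the helper leaves the flag set, which is the behaviour the commit message wants from the capability check.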
      
      Cc: Jann Horn <jannh@google.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Tyler Hicks <tyhicks@linux.microsoft.com>
      Cc: Will Drewry <wad@chromium.org>
      Cc: stable@vger.kernel.org
      Fixes: e2cfabdf ("seccomp: add system call filtering using BPF")
      Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
      Reviewed-by: Jann Horn <jannh@google.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20201030123849.770769-3-mic@digikod.net
    • ptrace: Set PF_SUPERPRIV when checking capability · cf237052
      Mickaël Salaün authored
      
      Commit 69f594a3 ("ptrace: do not audit capability check when outputing
      /proc/pid/stat") replaced the use of ns_capable() with
      has_ns_capability{,_noaudit}(), which do not set PF_SUPERPRIV.
      
      Commit 6b3ad664 ("ptrace: reintroduce usage of subjective credentials in
      ptrace_has_cap()") replaced has_ns_capability{,_noaudit}() with
      security_capable(), which doesn't set PF_SUPERPRIV either.
      
      Since commit 98f368e9 ("kernel: Add noaudit variant of ns_capable()"), a
      new ns_capable_noaudit() helper is available.  Let's use it!
      
      As a result, the signature of ptrace_has_cap() is restored to its original one.
      
      Cc: Christian Brauner <christian.brauner@ubuntu.com>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Serge E. Hallyn <serge@hallyn.com>
      Cc: Tyler Hicks <tyhicks@linux.microsoft.com>
      Cc: stable@vger.kernel.org
      Fixes: 6b3ad664 ("ptrace: reintroduce usage of subjective credentials in ptrace_has_cap()")
      Fixes: 69f594a3 ("ptrace: do not audit capability check when outputing /proc/pid/stat")
      Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
      Reviewed-by: Jann Horn <jannh@google.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20201030123849.770769-2-mic@digikod.net
    • lockdep: Put graph lock/unlock under lock_recursion protection · 43be4388
      Boqun Feng authored
      
      A warning was hit when running xfstests/generic/068 in a Hyper-V guest:
      
      [...] ------------[ cut here ]------------
      [...] DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled())
      [...] WARNING: CPU: 2 PID: 1350 at kernel/locking/lockdep.c:5280 check_flags.part.0+0x165/0x170
      [...] ...
      [...] Workqueue: events pwq_unbound_release_workfn
      [...] RIP: 0010:check_flags.part.0+0x165/0x170
      [...] ...
      [...] Call Trace:
      [...]  lock_is_held_type+0x72/0x150
      [...]  ? lock_acquire+0x16e/0x4a0
      [...]  rcu_read_lock_sched_held+0x3f/0x80
      [...]  __send_ipi_one+0x14d/0x1b0
      [...]  hv_send_ipi+0x12/0x30
      [...]  __pv_queued_spin_unlock_slowpath+0xd1/0x110
      [...]  __raw_callee_save___pv_queued_spin_unlock_slowpath+0x11/0x20
      [...]  .slowpath+0x9/0xe
      [...]  lockdep_unregister_key+0x128/0x180
      [...]  pwq_unbound_release_workfn+0xbb/0xf0
      [...]  process_one_work+0x227/0x5c0
      [...]  worker_thread+0x55/0x3c0
      [...]  ? process_one_work+0x5c0/0x5c0
      [...]  kthread+0x153/0x170
      [...]  ? __kthread_bind_mask+0x60/0x60
      [...]  ret_from_fork+0x1f/0x30
      
      The cause of the problem is the call chain lockdep_unregister_key()
      -> <irq disabled by raw_local_irq_save()> lockdep_unlock() ->
      arch_spin_unlock() -> __pv_queued_spin_unlock_slowpath() -> pv_kick() ->
      __send_ipi_one() -> trace_hyperv_send_ipi_one().

      This particular warning is triggered because Hyper-V has a trace point
      in IPI sending, but in general arch_spin_unlock() may call another
      function that has a trace point in it. Put the arch_spin_lock() and
      arch_spin_unlock() calls under lock_recursion protection to fix this
      problem and avoid similar ones.
      
      Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20201113110512.1056501-1-boqun.feng@gmail.com
    • sched/deadline: Fix priority inheritance with multiple scheduling classes · 2279f540
      Juri Lelli authored
      
      Glenn reported that "an application [he developed produces] a BUG in
      deadline.c when a SCHED_DEADLINE task contends with CFS tasks on nested
      PTHREAD_PRIO_INHERIT mutexes.  I believe the bug is triggered when a CFS
      task that was boosted by a SCHED_DEADLINE task boosts another CFS task
      (nested priority inheritance)."
      
       ------------[ cut here ]------------
       kernel BUG at kernel/sched/deadline.c:1462!
       invalid opcode: 0000 [#1] PREEMPT SMP
       CPU: 12 PID: 19171 Comm: dl_boost_bug Tainted: ...
       Hardware name: ...
       RIP: 0010:enqueue_task_dl+0x335/0x910
       Code: ...
       RSP: 0018:ffffc9000c2bbc68 EFLAGS: 00010002
       RAX: 0000000000000009 RBX: ffff888c0af94c00 RCX: ffffffff81e12500
       RDX: 000000000000002e RSI: ffff888c0af94c00 RDI: ffff888c10b22600
       RBP: ffffc9000c2bbd08 R08: 0000000000000009 R09: 0000000000000078
       R10: ffffffff81e12440 R11: ffffffff81e1236c R12: ffff888bc8932600
       R13: ffff888c0af94eb8 R14: ffff888c10b22600 R15: ffff888bc8932600
       FS:  00007fa58ac55700(0000) GS:ffff888c10b00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00007fa58b523230 CR3: 0000000bf44ab003 CR4: 00000000007606e0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       PKRU: 55555554
       Call Trace:
        ? intel_pstate_update_util_hwp+0x13/0x170
        rt_mutex_setprio+0x1cc/0x4b0
        task_blocks_on_rt_mutex+0x225/0x260
        rt_spin_lock_slowlock_locked+0xab/0x2d0
        rt_spin_lock_slowlock+0x50/0x80
        hrtimer_grab_expiry_lock+0x20/0x30
        hrtimer_cancel+0x13/0x30
        do_nanosleep+0xa0/0x150
        hrtimer_nanosleep+0xe1/0x230
        ? __hrtimer_init_sleeper+0x60/0x60
        __x64_sys_nanosleep+0x8d/0xa0
        do_syscall_64+0x4a/0x100
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
       RIP: 0033:0x7fa58b52330d
       ...
       ---[ end trace 0000000000000002 ]---
      
      He also provided a simple reproducer creating the situation below:
      
       So the execution order of the locking steps is as follows
       (N1 and N2 are non-deadline tasks. D1 is a deadline task. M1 and M2
       are mutexes that are enabled with priority inheritance.)
      
       Time moves forward as this timeline goes down:
      
       N1              N2               D1
       |               |                |
       |               |                |
       Lock(M1)        |                |
       |               |                |
       |             Lock(M2)           |
       |               |                |
       |               |              Lock(M2)
       |               |                |
       |             Lock(M1)           |
       |             (!!bug triggered!) |
      
      Daniel reported a similar situation as well, by just letting ksoftirqd
      run with DEADLINE (and eventually block on a mutex).
      
      The problem is that boosted entities (Priority Inheritance) use the static
      DEADLINE parameters of the top priority waiter. However, there might be
      cases where the top waiter is a non-DEADLINE entity that is currently
      boosted by a DEADLINE entity from a different lock chain (i.e., nested
      priority chains involving entities of non-DEADLINE classes). In this
      case, the top waiter's static DEADLINE parameters could be null (initialized
      to 0 at fork()) and replenish_dl_entity() would hit a BUG().
      
      Fix this by keeping track of the original donor and using its parameters
      when a task is boosted.
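
The donor-tracking idea can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the scheduler's code: `demo_dl_entity`, `demo_pi_task`, `demo_boost()` and the field layout are hypothetical, and the model only tracks which entity's parameters a boosted task would use.

```c
#include <assert.h>
#include <stddef.h>

struct demo_dl_entity {
    unsigned long long dl_runtime; /* static parameter; 0 for non-deadline tasks */
    struct demo_dl_entity *pi_se;  /* original donor; points to itself if not boosted */
};

struct demo_pi_task {
    int is_deadline;
    struct demo_dl_entity dl;
};

static void demo_init_task(struct demo_pi_task *t, int is_deadline,
                           unsigned long long runtime)
{
    t->is_deadline = is_deadline;
    t->dl.dl_runtime = runtime;
    t->dl.pi_se = &t->dl;
}

/* True if the task runs with deadline parameters, its own or inherited. */
static int demo_has_dl_params(const struct demo_pi_task *t)
{
    return t->is_deadline || t->dl.pi_se != &t->dl;
}

/* Boost owner on behalf of top_waiter, propagating the original donor. */
static void demo_boost(struct demo_pi_task *owner,
                       const struct demo_pi_task *top_waiter)
{
    if (top_waiter && demo_has_dl_params(top_waiter))
        owner->dl.pi_se = top_waiter->dl.pi_se;
    else
        owner->dl.pi_se = &owner->dl;
}

/* Runtime used when replenishing: always taken from the donor. */
static unsigned long long demo_replenish_runtime(const struct demo_pi_task *t)
{
    return t->dl.pi_se->dl_runtime;
}

/* Nested chain from the report: D1 boosts N2, then N2 boosts N1. */
static void demo_pi_selfcheck(void)
{
    struct demo_pi_task d1, n1, n2;

    demo_init_task(&d1, 1, 1000);
    demo_init_task(&n1, 0, 0);
    demo_init_task(&n2, 0, 0);
    demo_boost(&n2, &d1);
    demo_boost(&n1, &n2);
    assert(n2.dl.dl_runtime == 0);               /* top waiter's own params are null */
    assert(demo_replenish_runtime(&n1) == 1000); /* donor's params are used instead */
}
```

In this model, reading the top waiter's own static parameters would yield a runtime of 0 for N1, while following the donor pointer yields D1's runtime.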
      
      Reported-by: Glenn Elliott <glenn@aurora.tech>
      Reported-by: Daniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Tested-by: Daniel Bristot de Oliveira <bristot@redhat.com>
      Link: https://lkml.kernel.org/r/20201117061432.517340-1-juri.lelli@redhat.com
    • sched: Fix rq->nr_iowait ordering · ec618b84
      Peter Zijlstra authored
      
        schedule()				ttwu()
          deactivate_task();			  if (p->on_rq && ...) // false
      					    atomic_dec(&task_rq(p)->nr_iowait);
          if (prev->in_iowait)
            atomic_inc(&rq->nr_iowait);
      
      The interleaving above allows nr_iowait to be decremented before it
      gets incremented, resulting in more dodgy IO-wait numbers than usual.
      
      Note that because we can now do ttwu_queue_wakelist() before
      p->on_cpu==0, we lose the natural ordering and have to further delay
      the decrement.
      
      Fixes: c6e7bd7a ("sched/core: Optimize ttwu() spinning on p->on_cpu")
      Reported-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Link: https://lkml.kernel.org/r/20201117093829.GD3121429@hirez.programming.kicks-ass.net
    • sched/fair: Fix overutilized update in enqueue_task_fair() · 8e1ac429
      Quentin Perret authored
      
      enqueue_task_fair() attempts to skip the overutilized update for new
      tasks as their util_avg is not accurate yet. However, the flag we check
      to do so is overwritten earlier on in the function, which makes the
      condition pretty much a nop.
      
      Fix this by saving the flag early on.
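
The bug pattern described above (testing a flag after it has been overwritten) and the fix (saving it first) can be sketched as follows. This is a minimal userspace sketch; `DEMO_ENQUEUE_WAKEUP` and both function names are hypothetical stand-ins, and the functions only report whether the overutilized update would run.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_ENQUEUE_WAKEUP 0x01 /* hypothetical flag value */

/* Buggy pattern: the flag is tested only after it has been overwritten. */
static bool demo_updates_overutilized_buggy(int flags)
{
    flags = DEMO_ENQUEUE_WAKEUP; /* overwritten earlier in the function */
    return (flags & DEMO_ENQUEUE_WAKEUP) != 0; /* always true */
}

/* Fixed pattern: remember whether the task is new before the overwrite. */
static bool demo_updates_overutilized_fixed(int flags)
{
    bool task_new = !(flags & DEMO_ENQUEUE_WAKEUP);

    flags = DEMO_ENQUEUE_WAKEUP; /* same overwrite as above */
    (void)flags;
    return !task_new;
}

/* A new task (no wakeup flag) skips the update only in the fixed version. */
static void demo_overutilized_selfcheck(void)
{
    assert(demo_updates_overutilized_buggy(0));
    assert(!demo_updates_overutilized_fixed(0));
    assert(demo_updates_overutilized_fixed(DEMO_ENQUEUE_WAKEUP));
}
```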
      
      Fixes: 2802bf3c ("sched/fair: Add over-utilization/tipping point indicator")
      Reported-by: Rick Yiu <rickyiu@google.com>
      Signed-off-by: Quentin Perret <qperret@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
      Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
      Link: https://lkml.kernel.org/r/20201112111201.2081902-1-qperret@google.com
  10. Nov 16, 2020
  11. Nov 15, 2020
  12. Nov 14, 2020
    • genirq: Remove GENERIC_IRQ_LEGACY_ALLOC_HWIRQ · f296dcd6
      Thomas Gleixner authored
      
      Commit bb9d8126 ("arch: remove tile port") removed the last user of
      this cruft two years ago...
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lore.kernel.org/r/87eekvac06.fsf@nanos.tec.linutronix.de
    • panic: don't dump stack twice on warn · 2f31ad64
      Christophe Leroy authored
      
      Before commit 3f388f28 ("panic: dump registers on panic_on_warn"),
      __warn() was calling show_regs() when regs was not NULL, and show_stack()
      otherwise.
      
      After that commit, show_stack() is called regardless of whether
      show_regs() has been called or not, leading to duplicated Call Trace:
      
        ------------[ cut here ]------------
        WARNING: CPU: 0 PID: 1 at arch/powerpc/mm/nohash/8xx.c:186 mmu_mark_initmem_nx+0x24/0x94
        CPU: 0 PID: 1 Comm: swapper Not tainted 5.10.0-rc2-s3k-dev-01375-gf46ec0d3ecbd-dirty #4092
        NIP:  c00128b4 LR: c0010228 CTR: 00000000
        REGS: c9023e40 TRAP: 0700   Not tainted  (5.10.0-rc2-s3k-dev-01375-gf46ec0d3ecbd-dirty)
        MSR:  00029032 <EE,ME,IR,DR,RI>  CR: 24000424  XER: 00000000
      
        GPR00: c0010228 c9023ef8 c2100000 0074c000 ffffffff 00000000 c2151000 c07b3880
        GPR08: ff000900 0074c000 c8000000 c33b53a8 24000822 00000000 c0003a20 00000000
        GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
        GPR24: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00800000
        NIP [c00128b4] mmu_mark_initmem_nx+0x24/0x94
        LR [c0010228] free_initmem+0x20/0x58
        Call Trace:
          free_initmem+0x20/0x58
          kernel_init+0x1c/0x114
          ret_from_kernel_thread+0x14/0x1c
        Instruction dump:
        7d291850 7d234b78 4e800020 9421ffe0 7c0802a6 bfc10018 3fe0c060 3bff0000
        3fff4080 3bffffff 90010024 57ff0010 <0fe00000> 392001cd 7c3e0b78 953e0008
        CPU: 0 PID: 1 Comm: swapper Not tainted 5.10.0-rc2-s3k-dev-01375-gf46ec0d3ecbd-dirty #4092
        Call Trace:
          __warn+0x8c/0xd8 (unreliable)
          report_bug+0x11c/0x154
          program_check_exception+0x1dc/0x6e0
          ret_from_except_full+0x0/0x4
        --- interrupt: 700 at mmu_mark_initmem_nx+0x24/0x94
            LR = free_initmem+0x20/0x58
          free_initmem+0x20/0x58
          kernel_init+0x1c/0x114
          ret_from_kernel_thread+0x14/0x1c
        ---[ end trace 31702cd2a9570752 ]---
      
      Only call show_stack() when regs is NULL.
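
The intended control flow can be sketched in userspace C with counters standing in for the dump routines. All names here (`demo_regs`, `demo_show_regs()`, `demo_show_stack()`, `demo_warn()`) are hypothetical; the sketch assumes, as the trace above suggests, that the register dump already includes a call trace.

```c
#include <assert.h>
#include <stddef.h>

struct demo_regs {
    unsigned long ip;
};

static int demo_regs_dumps;  /* number of register dumps emitted */
static int demo_stack_dumps; /* number of standalone stack dumps emitted */

/* Stand-in for a register dump, assumed to include a call trace. */
static void demo_show_regs(const struct demo_regs *regs)
{
    (void)regs;
    demo_regs_dumps++;
}

/* Stand-in for a standalone stack dump. */
static void demo_show_stack(void)
{
    demo_stack_dumps++;
}

/* Dump registers when available; fall back to a stack dump otherwise. */
static void demo_warn(const struct demo_regs *regs)
{
    if (regs)
        demo_show_regs(regs);
    else
        demo_show_stack();
}

/* With regs present, exactly one dump is emitted and no extra stack dump. */
static void demo_warn_selfcheck(void)
{
    struct demo_regs regs = { 0 };

    demo_regs_dumps = demo_stack_dumps = 0;
    demo_warn(&regs);
    assert(demo_regs_dumps == 1 && demo_stack_dumps == 0);
    demo_warn(NULL);
    assert(demo_regs_dumps == 1 && demo_stack_dumps == 1);
}
```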
      
      Fixes: 3f388f28 ("panic: dump registers on panic_on_warn")
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
      Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
      Link: https://lkml.kernel.org/r/e8c055458b080707f1bc1a98ff8bea79d0cec445.1604748361.git.christophe.leroy@csgroup.eu

      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kernel/watchdog: fix watchdog_allowed_mask not used warning · e7e04615
      Santosh Sivaraj authored
      
      Define watchdog_allowed_mask only when SOFTLOCKUP_DETECTOR is enabled.
      
      Fixes: 7feeb9cd ("watchdog/sysctl: Clean up sysctl variable name space")
      Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Petr Mladek <pmladek@suse.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/20201106015025.1281561-1-santosh@fossix.org

      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • reboot: fix overflow parsing reboot cpu number · df5b0ab3
      Matteo Croce authored
      
      Limit the CPU number to num_possible_cpus(), because setting it to a
      value lower than INT_MAX but higher than NR_CPUS produces the following
      error on reboot and shutdown:
      
          BUG: unable to handle page fault for address: ffffffff90ab1bb0
          #PF: supervisor read access in kernel mode
          #PF: error_code(0x0000) - not-present page
          PGD 1c09067 P4D 1c09067 PUD 1c0a063 PMD 0
          Oops: 0000 [#1] SMP
          CPU: 1 PID: 1 Comm: systemd-shutdow Not tainted 5.9.0-rc8-kvm #110
          Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
          RIP: 0010:migrate_to_reboot_cpu+0xe/0x60
          Code: ea ea 00 48 89 fa 48 c7 c7 30 57 f1 81 e9 fa ef ff ff 66 2e 0f 1f 84 00 00 00 00 00 53 8b 1d d5 ea ea 00 e8 14 33 fe ff 89 da <48> 0f a3 15 ea fc bd 00 48 89 d0 73 29 89 c2 c1 e8 06 65 48 8b 3c
          RSP: 0018:ffffc90000013e08 EFLAGS: 00010246
          RAX: ffff88801f0a0000 RBX: 0000000077359400 RCX: 0000000000000000
          RDX: 0000000077359400 RSI: 0000000000000002 RDI: ffffffff81c199e0
          RBP: ffffffff81c1e3c0 R08: ffff88801f41f000 R09: ffffffff81c1e348
          R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
          R13: 00007f32bedf8830 R14: 00000000fee1dead R15: 0000000000000000
          FS:  00007f32bedf8980(0000) GS:ffff88801f480000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: ffffffff90ab1bb0 CR3: 000000001d057000 CR4: 00000000000006a0
          DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
          Call Trace:
            __do_sys_reboot.cold+0x34/0x5b
            do_syscall_64+0x2d/0x40
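
The bound check described above can be sketched in userspace C. This is an illustration under stated assumptions: `DEMO_POSSIBLE_CPUS` is a hypothetical stand-in for num_possible_cpus(), and `demo_set_reboot_cpu()` and `demo_get_reboot_cpu()` are invented names. The value 2000000000 matches RBX (0x77359400) in the trace above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's num_possible_cpus(). */
#define DEMO_POSSIBLE_CPUS 8

static int demo_reboot_cpu; /* defaults to CPU 0 */

/* Accept the requested CPU only if it is a valid index; otherwise keep the old value. */
static bool demo_set_reboot_cpu(long requested)
{
    if (requested < 0 || requested >= DEMO_POSSIBLE_CPUS)
        return false;
    demo_reboot_cpu = (int)requested;
    return true;
}

static int demo_get_reboot_cpu(void)
{
    /* Invariant maintained by demo_set_reboot_cpu(). */
    assert(demo_reboot_cpu >= 0 && demo_reboot_cpu < DEMO_POSSIBLE_CPUS);
    return demo_reboot_cpu;
}

/* An out-of-range request is rejected and the previous value survives. */
static void demo_reboot_cpu_selfcheck(void)
{
    assert(demo_set_reboot_cpu(3));
    assert(!demo_set_reboot_cpu(2000000000L));
    assert(demo_get_reboot_cpu() == 3);
}
```

Rejecting the value at parse time keeps later users of the stored CPU number from indexing past the valid range.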
      
      Fixes: 1b3a5d02 ("reboot: move arch/x86 reboot= handling to generic kernel")
      Signed-off-by: Matteo Croce <mcroce@microsoft.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mike Rapoport <rppt@kernel.org>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Robin Holt <robinmholt@gmail.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20201103214025.116799-3-mcroce@linux.microsoft.com
      
      
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint" · 8b92c4ff
      Matteo Croce authored
      
      Patch series "fix parsing of reboot= cmdline", v3.
      
      The parsing of the reboot= cmdline has two major errors:
      
       - a missing bound check can crash the system on reboot
      
       - parsing of the cpu number only works if specified last
      
      Fix both.
      
      This patch (of 2):
      
      This reverts commit 616feab7.
      
      kstrtoint() and simple_strtoul() have a subtle difference which makes
      them non-interchangeable: if a non-digit character is found during
      parsing, the former will return an error, while the latter will just
      stop parsing, e.g. simple_strtoul("123xyx") = 123.
      
      The kernel cmdline reboot= argument allows specifying the CPU used for
      rebooting, with the syntax `s####` among the other flags, e.g.
      "reboot=warm,s31,force", so if this flag is not the last given, it's
      silently ignored as well as the subsequent ones.
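
The difference between the two parsing styles can be reproduced in userspace with the C standard library. This is a sketch only: `demo_parse_lenient()` and `demo_parse_strict()` are hypothetical helpers built on strtoul() and strtol(), written in the spirit of simple_strtoul() and kstrtoint() respectively, and they do not replicate every detail of the kernel functions.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Lenient: consume leading digits and report where parsing stopped. */
static unsigned long demo_parse_lenient(const char *s, const char **rest)
{
    char *stop;
    unsigned long v;

    assert(s != NULL);
    v = strtoul(s, &stop, 10);
    if (rest)
        *rest = stop;
    return v;
}

/* Strict: any trailing non-digit character makes the whole parse fail. */
static int demo_parse_strict(const char *s, int *out)
{
    char *stop;
    long v;

    assert(s != NULL && out != NULL);
    errno = 0;
    v = strtol(s, &stop, 10);
    if (stop == s || *stop != '\0' || errno == ERANGE ||
        v < INT_MIN || v > INT_MAX)
        return -EINVAL;
    *out = (int)v;
    return 0;
}

/* "31,force" is what follows the 's' in "reboot=warm,s31,force". */
static void demo_parse_selfcheck(void)
{
    const char *rest;
    int v = 0;

    assert(demo_parse_lenient("31,force", &rest) == 31 && *rest == ',');
    assert(demo_parse_strict("31,force", &v) == -EINVAL);
    assert(demo_parse_strict("31", &v) == 0 && v == 31);
}
```

With the lenient parser the CPU number is recovered and the remaining flags can still be processed; with the strict parser the same input is rejected unless the number is the last thing in the string.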
      
      Fixes: 616feab7 ("kernel/reboot.c: convert simple_strtoul to kstrtoint")
      Signed-off-by: Matteo Croce <mcroce@microsoft.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Mike Rapoport <rppt@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Robin Holt <robinmholt@gmail.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20201103214025.116799-2-mcroce@linux.microsoft.com
      
      
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • bpf: Relax return code check for subprograms · f782e2c3
      Dmitrii Banshchikov authored
      
      Currently the verifier enforces return code checks for subprograms in the
      same manner as it does for program entry points. This prevents returning
      arbitrary scalar values from subprograms. The scalar type of returned values
      is checked by btf_prepare_func_args() and hence it should be safe to
      allow only scalars for now. Relax return code checks for subprograms and
      allow any correct scalar values.
      
      Fixes: 51c39bb1 ("bpf: Introduce function-by-function verification")
      Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20201113171756.90594-1-me@ubique.spb.ru
  13. Nov 11, 2020
  14. Nov 10, 2020