1. 06 Mar, 2017 1 commit
  2. 02 Mar, 2017 1 commit
  3. 08 Feb, 2017 1 commit
    • Luis R. Rodriguez's avatar
      kernel/ucount.c: mark user_header with kmemleak_ignore() · ed5bd7dc
      Luis R. Rodriguez authored
      The user_header gets caught by kmemleak with the following splat as
      missing a free:
      
        unreferenced object 0xffff99667a733d80 (size 96):
        comm "swapper/0", pid 1, jiffies 4294892317 (age 62191.468s)
        hex dump (first 32 bytes):
          a0 b6 92 b4 ff ff ff ff 00 00 00 00 01 00 00 00  ................
          01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
           kmemleak_alloc+0x4a/0xa0
           __kmalloc+0x144/0x260
           __register_sysctl_table+0x54/0x5e0
           register_sysctl+0x1b/0x20
           user_namespace_sysctl_init+0x17/0x34
           do_one_initcall+0x52/0x1a0
           kernel_init_freeable+0x173/0x200
           kernel_init+0xe/0x100
           ret_from_fork+0x2c/0x40
      
      The BUG_ON()s are intended to crash so no need to clean up after
      ourselves on error there.  This is also a kernel/ subsys_init() we don't
      need a respective exit call here as this is never modular, so just white
      list it.
      
      Link: http://lkml.kernel.org/r/20170203211404.31458-1-mcgrof@kernel.orgSigned-off-by: default avatarLuis R. Rodriguez <mcgrof@kernel.org>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Nikolay Borisov <n.borisov.lkml@gmail.com>
      Cc: Serge Hallyn <serge@hallyn.com>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ed5bd7dc
  4. 23 Jan, 2017 2 commits
    • Nikolay Borisov's avatar
      inotify: Convert to using per-namespace limits · 1cce1eea
      Nikolay Borisov authored
      This patchset converts inotify to using the newly introduced
      per-userns sysctl infrastructure.
      
      Currently the inotify instances/watches are being accounted in the
      user_struct structure. This means that in setups where multiple
      users in unprivileged containers map to the same underlying
      real user (i.e. pointing to the same user_struct) the inotify limits
      are going to be shared as well, allowing one user(or application) to exhaust
      all others limits.
      
      Fix this by switching the inotify sysctls to using the
      per-namespace/per-user limits. This will allow the server admin to
      set sensible global limits, which can further be tuned inside every
      individual user namespace. Additionally, in order to preserve the
      sysctl ABI make the existing inotify instances/watches sysctls
      modify the values of the initial user namespace.
      Signed-off-by: default avatarNikolay Borisov <n.borisov.lkml@gmail.com>
      Acked-by: default avatarJan Kara <jack@suse.cz>
      Acked-by: default avatarSerge Hallyn <serge@hallyn.com>
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      1cce1eea
    • Nikolay Borisov's avatar
      userns: Make ucounts lock irq-safe · 880a3854
      Nikolay Borisov authored
      The ucounts_lock is being used to protect various ucounts lifecycle
      management functionalities. However, those services can also be invoked
      when a pidns is being freed in an RCU callback (e.g. softirq context).
      This can lead to deadlocks. There were already efforts trying to
      prevent similar deadlocks in add7c65c ("pid: fix lockdep deadlock
      warning due to ucount_lock"), however they just moved the context
      from hardirq to softrq. Fix this issue once and for all by explictly
      making the lock disable irqs altogether.
      
      Dmitry Vyukov <dvyukov@google.com> reported:
      
      > I've got the following deadlock report while running syzkaller fuzzer
      > on eec0d3d065bfcdf9cd5f56dd2a36b94d12d32297 of linux-next (on odroid
      > device if it matters):
      >
      > =================================
      > [ INFO: inconsistent lock state ]
      > 4.10.0-rc3-next-20170112-xc2-dirty #6 Not tainted
      > ---------------------------------
      > inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
      > swapper/2/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
      >  (ucounts_lock){+.?...}, at: [<     inline     >] spin_lock
      > ./include/linux/spinlock.h:302
      >  (ucounts_lock){+.?...}, at: [<ffff2000081678c8>]
      > put_ucounts+0x60/0x138 kernel/ucount.c:162
      > {SOFTIRQ-ON-W} state was registered at:
      > [<ffff2000081c82d8>] mark_lock+0x220/0xb60 kernel/locking/lockdep.c:3054
      > [<     inline     >] mark_irqflags kernel/locking/lockdep.c:2941
      > [<ffff2000081c97a8>] __lock_acquire+0x388/0x3260 kernel/locking/lockdep.c:3295
      > [<ffff2000081cce24>] lock_acquire+0xa4/0x138 kernel/locking/lockdep.c:3753
      > [<     inline     >] __raw_spin_lock ./include/linux/spinlock_api_smp.h:144
      > [<ffff200009798128>] _raw_spin_lock+0x90/0xd0 kernel/locking/spinlock.c:151
      > [<     inline     >] spin_lock ./include/linux/spinlock.h:302
      > [<     inline     >] get_ucounts kernel/ucount.c:131
      > [<ffff200008167c28>] inc_ucount+0x80/0x6c8 kernel/ucount.c:189
      > [<     inline     >] inc_mnt_namespaces fs/namespace.c:2818
      > [<ffff200008481850>] alloc_mnt_ns+0x78/0x3a8 fs/namespace.c:2849
      > [<ffff200008487298>] create_mnt_ns+0x28/0x200 fs/namespace.c:2959
      > [<     inline     >] init_mount_tree fs/namespace.c:3199
      > [<ffff200009bd6674>] mnt_init+0x258/0x384 fs/namespace.c:3251
      > [<ffff200009bd60bc>] vfs_caches_init+0x6c/0x80 fs/dcache.c:3626
      > [<ffff200009bb1114>] start_kernel+0x414/0x460 init/main.c:648
      > [<ffff200009bb01e8>] __primary_switched+0x6c/0x70 arch/arm64/kernel/head.S:456
      > irq event stamp: 2316924
      > hardirqs last  enabled at (2316924): [<     inline     >] rcu_do_batch
      > kernel/rcu/tree.c:2911
      > hardirqs last  enabled at (2316924): [<     inline     >]
      > invoke_rcu_callbacks kernel/rcu/tree.c:3182
      > hardirqs last  enabled at (2316924): [<     inline     >]
      > __rcu_process_callbacks kernel/rcu/tree.c:3149
      > hardirqs last  enabled at (2316924): [<ffff200008210414>]
      > rcu_process_callbacks+0x7a4/0xc28 kernel/rcu/tree.c:3166
      > hardirqs last disabled at (2316923): [<     inline     >] rcu_do_batch
      > kernel/rcu/tree.c:2900
      > hardirqs last disabled at (2316923): [<     inline     >]
      > invoke_rcu_callbacks kernel/rcu/tree.c:3182
      > hardirqs last disabled at (2316923): [<     inline     >]
      > __rcu_process_callbacks kernel/rcu/tree.c:3149
      > hardirqs last disabled at (2316923): [<ffff20000820fe80>]
      > rcu_process_callbacks+0x210/0xc28 kernel/rcu/tree.c:3166
      > softirqs last  enabled at (2316912): [<ffff20000811b4c4>]
      > _local_bh_enable+0x4c/0x80 kernel/softirq.c:155
      > softirqs last disabled at (2316913): [<     inline     >]
      > do_softirq_own_stack ./include/linux/interrupt.h:488
      > softirqs last disabled at (2316913): [<     inline     >]
      > invoke_softirq kernel/softirq.c:371
      > softirqs last disabled at (2316913): [<ffff20000811c994>]
      > irq_exit+0x264/0x308 kernel/softirq.c:405
      >
      > other info that might help us debug this:
      >  Possible unsafe locking scenario:
      >
      >        CPU0
      >        ----
      >   lock(ucounts_lock);
      >   <Interrupt>
      >     lock(ucounts_lock);
      >
      >  *** DEADLOCK ***
      >
      > 1 lock held by swapper/2/0:
      >  #0:  (rcu_callback){......}, at: [<     inline     >] __rcu_reclaim
      > kernel/rcu/rcu.h:108
      >  #0:  (rcu_callback){......}, at: [<     inline     >] rcu_do_batch
      > kernel/rcu/tree.c:2919
      >  #0:  (rcu_callback){......}, at: [<     inline     >]
      > invoke_rcu_callbacks kernel/rcu/tree.c:3182
      >  #0:  (rcu_callback){......}, at: [<     inline     >]
      > __rcu_process_callbacks kernel/rcu/tree.c:3149
      >  #0:  (rcu_callback){......}, at: [<ffff200008210390>]
      > rcu_process_callbacks+0x720/0xc28 kernel/rcu/tree.c:3166
      >
      > stack backtrace:
      > CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.10.0-rc3-next-20170112-xc2-dirty #6
      > Hardware name: Hardkernel ODROID-C2 (DT)
      > Call trace:
      > [<ffff20000808fa60>] dump_backtrace+0x0/0x440 arch/arm64/kernel/traps.c:500
      > [<ffff20000808fec0>] show_stack+0x20/0x30 arch/arm64/kernel/traps.c:225
      > [<ffff2000088a99e0>] dump_stack+0x110/0x168
      > [<ffff2000082fa2b4>] print_usage_bug.part.27+0x49c/0x4bc
      > kernel/locking/lockdep.c:2387
      > [<     inline     >] print_usage_bug kernel/locking/lockdep.c:2357
      > [<     inline     >] valid_state kernel/locking/lockdep.c:2400
      > [<     inline     >] mark_lock_irq kernel/locking/lockdep.c:2617
      > [<ffff2000081c89ec>] mark_lock+0x934/0xb60 kernel/locking/lockdep.c:3065
      > [<     inline     >] mark_irqflags kernel/locking/lockdep.c:2923
      > [<ffff2000081c9a60>] __lock_acquire+0x640/0x3260 kernel/locking/lockdep.c:3295
      > [<ffff2000081cce24>] lock_acquire+0xa4/0x138 kernel/locking/lockdep.c:3753
      > [<     inline     >] __raw_spin_lock ./include/linux/spinlock_api_smp.h:144
      > [<ffff200009798128>] _raw_spin_lock+0x90/0xd0 kernel/locking/spinlock.c:151
      > [<     inline     >] spin_lock ./include/linux/spinlock.h:302
      > [<ffff2000081678c8>] put_ucounts+0x60/0x138 kernel/ucount.c:162
      > [<ffff200008168364>] dec_ucount+0xf4/0x158 kernel/ucount.c:214
      > [<     inline     >] dec_pid_namespaces kernel/pid_namespace.c:89
      > [<ffff200008293dc8>] delayed_free_pidns+0x40/0xe0 kernel/pid_namespace.c:156
      > [<     inline     >] __rcu_reclaim kernel/rcu/rcu.h:118
      > [<     inline     >] rcu_do_batch kernel/rcu/tree.c:2919
      > [<     inline     >] invoke_rcu_callbacks kernel/rcu/tree.c:3182
      > [<     inline     >] __rcu_process_callbacks kernel/rcu/tree.c:3149
      > [<ffff2000082103d8>] rcu_process_callbacks+0x768/0xc28 kernel/rcu/tree.c:3166
      > [<ffff2000080821dc>] __do_softirq+0x324/0x6e0 kernel/softirq.c:284
      > [<     inline     >] do_softirq_own_stack ./include/linux/interrupt.h:488
      > [<     inline     >] invoke_softirq kernel/softirq.c:371
      > [<ffff20000811c994>] irq_exit+0x264/0x308 kernel/softirq.c:405
      > [<ffff2000081ecc28>] __handle_domain_irq+0xc0/0x150 kernel/irq/irqdesc.c:636
      > [<ffff200008081c80>] gic_handle_irq+0x68/0xd8
      > Exception stack(0xffff8000648e7dd0 to 0xffff8000648e7f00)
      > 7dc0:                                   ffff8000648d4b3c 0000000000000007
      > 7de0: 0000000000000000 1ffff0000c91a967 1ffff0000c91a967 1ffff0000c91a967
      > 7e00: ffff20000a4b6b68 0000000000000001 0000000000000007 0000000000000001
      > 7e20: 1fffe4000149ae90 ffff200009d35000 0000000000000000 0000000000000002
      > 7e40: 0000000000000000 0000000000000000 0000000002624a1a 0000000000000000
      > 7e60: 0000000000000000 ffff200009cbcd88 000060006d2ed000 0000000000000140
      > 7e80: ffff200009cff000 ffff200009cb6000 ffff200009cc2020 ffff200009d2159d
      > 7ea0: 0000000000000000 ffff8000648d4380 0000000000000000 ffff8000648e7f00
      > 7ec0: ffff20000820a478 ffff8000648e7f00 ffff20000820a47c 0000000010000145
      > 7ee0: 0000000000000140 dfff200000000000 ffffffffffffffff ffff20000820a478
      > [<ffff2000080837f8>] el1_irq+0xb8/0x130 arch/arm64/kernel/entry.S:486
      > [<     inline     >] arch_local_irq_restore
      > ./arch/arm64/include/asm/irqflags.h:81
      > [<ffff20000820a47c>] rcu_idle_exit+0x64/0xa8 kernel/rcu/tree.c:1030
      > [<     inline     >] cpuidle_idle_call kernel/sched/idle.c:200
      > [<ffff2000081bcbfc>] do_idle+0x1dc/0x2d0 kernel/sched/idle.c:243
      > [<ffff2000081bd1cc>] cpu_startup_entry+0x24/0x28 kernel/sched/idle.c:345
      > [<ffff200008099f8c>] secondary_start_kernel+0x2cc/0x358
      > arch/arm64/kernel/smp.c:276
      > [<000000000279f1a4>] 0x279f1a4
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Tested-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Fixes: add7c65c ("pid: fix lockdep deadlock warning due to ucount_lock")
      Fixes: f333c700 ("pidns: Add a limit on the number of pid namespaces")
      Cc: stable@vger.kernel.org
      Link: https://www.spinics.net/lists/kernel/msg2426637.htmlSigned-off-by: default avatarNikolay Borisov <n.borisov.lkml@gmail.com>
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      880a3854
  5. 31 Aug, 2016 1 commit
  6. 08 Aug, 2016 9 commits