15 Sep, 2018 (8 commits)
    • mm: respect arch_dup_mmap() return value · 7a5d47d5
      Nadav Amit authored
      commit 1ed0cc5a upstream.
      
      Commit d70f2a14 ("include/linux/sched/mm.h: uninline mmdrop_async(),
      etc") ignored the return value of arch_dup_mmap(). As a result, on x86,
      a failure to duplicate the LDT (e.g. due to a memory allocation error)
      would leave the duplicated memory mapping in an inconsistent state.
      
      Fix by using the return value, as it was before the change.
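
      A minimal sketch of the shape of the fix in dup_mmap()
      (approximated, not the verbatim diff):

        /* Propagate the architecture hook's error instead of dropping
         * it; before the fix, retval was unconditionally set to 0 and
         * e.g. an LDT allocation failure on x86 went unnoticed. */
        retval = arch_dup_mmap(oldmm, mm);
        if (retval)
          goto out;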
      
      Link: http://lkml.kernel.org/r/20180823051229.211856-1-namit@vmware.com
      Fixes: d70f2a14 ("include/linux/sched/mm.h: uninline mmdrop_async(), etc")
      Signed-off-by: Nadav Amit <namit@vmware.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • bpf: fix bpffs non-array map seq_show issue · c565269d
      Yonghong Song authored
      [ Upstream commit dc1508a5 ]
      
      In map_seq_next() in kernel/bpf/inode.c, the first key is always
      "0", regardless of the map type. This works for arrays, but for
      hash maps, if key "0" happens to be in the map and is not the
      first element of the first bucket, the bpffs map show will miss
      some items.

      Fix the issue by guaranteeing that we get the first element when
      seq_show has just started, by passing a NULL key to the
      map_get_next_key() callback. This way, no elements are missed in
      the bpffs hash table show even if key "0" is in the map.
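
      A sketch of the resulting iteration logic in map_seq_next()
      (approximated from the upstream change):

        /* Ask the map for its true first element when iteration has
         * just started, instead of assuming key "0" exists and comes
         * first: a NULL prev_key makes map_get_next_key() return the
         * first key of the map. */
        if (unlikely(v == SEQ_START_TOKEN))
          prev_key = NULL; /* fresh start: fetch the first element */
        else
          prev_key = key;  /* continue after the previous key */

        if (map->ops->map_get_next_key(map, prev_key, key))
          return NULL;     /* iteration done */
        return key;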
      
      Fixes: a26ca7c9 ("bpf: btf: Add pretty print support to the basic arraymap")
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bpf, sockmap: fix leakage of smap_psock_map_entry · 2262b26d
      Daniel Borkmann authored
      [ Upstream commit d40b0116 ]
      
      While working on sockmap I noticed that we do not always kfree the
      struct smap_psock_map_entry list elements which track psocks attached
      to maps. In the case of sock_hash_ctx_update_elem(), these map entries
      are allocated outside of __sock_map_ctx_update_elem() with their
      linkage to the socket hash table filled. In the case of sock array,
      the map entries are allocated inside of __sock_map_ctx_update_elem()
      and added with their linkage to psock->maps. Both additions are
      performed under psock->maps_lock.
      
      Now, we drop these elements from their psock->maps list on a few
      occasions: i) in sock array via smap_list_map_remove() when an entry
      is either deleted from the map from user space, or updated via
      user space or a BPF program where we drop the old socket at that map
      slot, or the sock array is freed via sock_map_free() and drops all
      its elements; ii) for sock hash via smap_list_hash_remove() on exactly
      the same occasions as just described for sock array; iii) in
      bpf_tcp_close() where we remove the elements from the list via
      psock_map_pop() and iterate over them, dropping them from either
      sock array or sock hash; and last but not least iv) once again in
      smap_gc_work(), which is a callback for deferring the work once the
      psock refcount hits zero and thus the socket is being destroyed.
      
      The problem is that the only case where we kfree() the list entry
      is case iv), and at that point the list should normally already be
      empty. So in cases i) to iii) we unlink the elements without
      freeing them, and they become unreachable to us. Hence the fix is
      to properly kfree() them there as well to stop the leakage. Given
      these paths are all handled under psock->maps_lock, there is no
      need for deferred RCU freeing.
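
      A sketch of the pattern the fix applies in the removal helpers
      (names approximated from the sockmap code of that time):

        /* smap_list_map_remove()-style helper: the tracking entry must
         * be freed when unlinked, not merely dropped from the list,
         * since nothing else holds a reference to it afterwards. */
        spin_lock_bh(&psock->maps_lock);
        list_for_each_entry_safe(e, tmp, &psock->maps, list) {
          if (e->entry == entry) {
            list_del(&e->list);
            kfree(e); /* previously missing: the entry leaked here */
          }
        }
        spin_unlock_bh(&psock->maps_lock);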
      
      I later also ran with the kmemleak detector, which confirmed the
      finding: before the fix the objects go unreferenced, while after
      the patch no BPF-related kmemleak report showed up.
      
        [...]
        unreferenced object 0xffff880378eadae0 (size 64):
          comm "test_sockmap", pid 2225, jiffies 4294720701 (age 43.504s)
          hex dump (first 32 bytes):
            00 01 00 00 00 00 ad de 00 02 00 00 00 00 ad de  ................
            50 4d 75 5d 03 88 ff ff 00 00 00 00 00 00 00 00  PMu]............
          backtrace:
            [<000000005225ac3c>] sock_map_ctx_update_elem.isra.21+0xd8/0x210
            [<0000000045dd6d3c>] bpf_sock_map_update+0x29/0x60
            [<00000000877723aa>] ___bpf_prog_run+0x1e1f/0x4960
            [<000000002ef89e83>] 0xffffffffffffffff
        unreferenced object 0xffff880378ead240 (size 64):
          comm "test_sockmap", pid 2225, jiffies 4294720701 (age 43.504s)
          hex dump (first 32 bytes):
            00 01 00 00 00 00 ad de 00 02 00 00 00 00 ad de  ................
            00 44 75 5d 03 88 ff ff 00 00 00 00 00 00 00 00  .Du]............
          backtrace:
            [<000000005225ac3c>] sock_map_ctx_update_elem.isra.21+0xd8/0x210
            [<0000000030e37a3a>] sock_map_update_elem+0x125/0x240
            [<000000002e5ce36e>] map_update_elem+0x4eb/0x7b0
            [<00000000db453cc9>] __x64_sys_bpf+0x1f9/0x360
            [<0000000000763660>] do_syscall_64+0x9a/0x300
            [<00000000422a2bb2>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
            [<000000002ef89e83>] 0xffffffffffffffff
        [...]
      
      Fixes: e9db4ef6 ("bpf: sockhash fix omitted bucket lock in sock_close")
      Fixes: 54fedb42 ("bpf: sockmap, fix smap_list_map_remove when psock is in many maps")
      Fixes: 2f857d04 ("bpf: sockmap, remove STRPARSER map_flags and add multi-map support")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bpf, sockmap: fix sock_map_ctx_update_elem race with exist/noexist · 7fb58bc7
      Daniel Borkmann authored
      [ Upstream commit 585f5a62 ]
      
      The current code in sock_map_ctx_update_elem() allows the
      BPF_EXIST and BPF_NOEXIST map update flags. This is rather
      uncommon for array-like maps: bpf_fd_array_map_update_elem() and
      others enforce BPF_ANY as the update flag so that xchg() can be
      used directly. More importantly, the current sock map
      implementation does not guarantee that such an operation with
      BPF_EXIST / BPF_NOEXIST is atomic.
      
      The initial test does a READ_ONCE(stab->sock_map[i]) to fetch the
      socket from the slot, which is then tested for NULL / non-NULL.
      However, later, after __sock_map_ctx_update_elem(), the actual
      update is done through osock = xchg(&stab->sock_map[i], sock).
      The problem is that in the meantime a different CPU could have
      updated or deleted the socket in that specific slot, and thus the
      flag constraints no longer hold.
      
      I've been wondering whether it would be best to just break UAPI
      and enforce BPF_ANY to see if anyone actually complains. The
      trouble is that the BPF kselftests already use BPF_NOEXIST for
      the map update, so it may well have been copied into applications
      already. The fix that keeps the current behavior intact is to add
      a map lock, similar to the sock hash bucket lock, covering the
      whole map.
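
      A sketch of the shape of the fix (names approximated): the flag
      test and the slot update now happen under one map-wide lock, so
      no other CPU can change the slot in between.

        raw_spin_lock_bh(&stab->lock);
        osock = stab->sock_map[i];
        if ((map_flags == BPF_NOEXIST && osock) ||
            (map_flags == BPF_EXIST && !osock)) {
          err = osock ? -EEXIST : -ENOENT;
          goto out_unlock; /* flag constraint checked atomically */
        }
        ...
        stab->sock_map[i] = sock; /* update in the same critical section */
        raw_spin_unlock_bh(&stab->lock);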
      
      Fixes: 174a79ff ("bpf: sockmap with sk redirect support")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bpf, sockmap: fix map elem deletion race with smap_stop_sock · 98d30c6a
      Daniel Borkmann authored
      [ Upstream commit 166ab6f0 ]
      
      smap_start_sock() and smap_stop_sock() are each protected under
      the sock->sk_callback_lock from their call sites, except in the
      case of sock_map_delete_elem(), where we drop the old socket from
      the map slot. This is racy because the same sock could be part of
      multiple sock maps, so we can run smap_stop_sock() in parallel;
      given that at that point psock->strp_enabled might be true on
      both CPUs, we might, for example, wrongly restore the
      sk->sk_data_ready / sk->sk_write_space callbacks. Therefore, hold
      the sock->sk_callback_lock on delete as well. It looks like
      2f857d04 ("bpf: sockmap, remove STRPARSER map_flags and add
      multi-map support") had this right, but later e9db4ef6 ("bpf:
      sockhash fix omitted bucket lock in sock_close") removed it again
      from the delete path, leaving this smap_stop_sock() instance
      unprotected.
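
      A sketch of the shape of the fix in the delete path (approximated):

        /* sock_map_delete_elem(): stop the psock under the socket's
         * callback lock, matching every other smap_stop_sock() call
         * site, so two concurrent deletes cannot both observe
         * strp_enabled and restore the sk callbacks twice. */
        write_lock_bh(&sock->sk_callback_lock);
        smap_stop_sock(psock, sock);
        write_unlock_bh(&sock->sk_callback_lock);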
      
      Fixes: e9db4ef6 ("bpf: sockhash fix omitted bucket lock in sock_close")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • workqueue: re-add lockdep dependencies for flushing · 13892551
      Johannes Berg authored
      [ Upstream commit 87915adc ]
      
      In flush_work(), we need to create a lockdep dependency so that
      the following scenario is appropriately tagged as a problem:
      
        work_function()
        {
          mutex_lock(&mutex);
          ...
        }
      
        other_function()
        {
          mutex_lock(&mutex);
          flush_work(&work); // or cancel_work_sync(&work);
        }
      
      This is a problem since the work might be running and blocked on
      trying to acquire the mutex.

      The same applies to flush_workqueue().

      These annotations were removed after cross-release partially
      caught such problems, but cross-release has since been reverted
      anyway. IMHO the removal was erroneous in any case, since lockdep
      should be able to catch potential problems, not just actual ones,
      and cross-release would only have caught the problem when
      wait_for_completion() was actually invoked.
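
      The dependency is recorded with a dummy acquire/release pair on
      the work item's lockdep map at flush time; a rough sketch of the
      re-added annotation (approximated, not the verbatim diff):

        /* flush_work(): tell lockdep that the caller waits for this
         * work, creating a held-locks -> work dependency that lockdep
         * cross-checks against the locks the work function takes. */
        lock_map_acquire(&work->lockdep_map);
        lock_map_release(&work->lockdep_map);

      flush_workqueue() gets the equivalent annotation on the
      workqueue's own wq->lockdep_map.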
      
      Fixes: fd1a5b04 ("workqueue: Remove now redundant lock acquisitions wrt. workqueue flushes")
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • workqueue: skip lockdep wq dependency in cancel_work_sync() · b7a3d36d
      Johannes Berg authored
      [ Upstream commit d6e89786 ]
      
      In cancel_work_sync(), we can only have one of two cases, even
      with an ordered workqueue:
       * the work isn't running, just cancelled before it started
       * the work is running, but then nothing else can be on the
         workqueue before it
      
      Thus, we need to skip the lockdep workqueue dependency handling,
      otherwise we get false positive reports from lockdep saying that
      we have a potential deadlock when the workqueue also has other
      work items with locking, e.g.
      
        work1_function() { mutex_lock(&mutex); ... }
        work2_function() { /* nothing */ }
      
        other_function() {
          queue_work(ordered_wq, &work1);
          queue_work(ordered_wq, &work2);
          mutex_lock(&mutex);
          cancel_work_sync(&work2);
        }
      
      As described above, this isn't a problem, but lockdep will
      currently flag it as if cancel_work_sync() were flush_work(),
      which *is* a problem.
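
      A rough sketch of the shape of the fix (approximated): the cancel
      path tells the flush machinery where it came from, and the
      workqueue-level lockdep dependency is only recorded for plain
      flushes.

        /* The flush path gains a from_cancel flag; cancel_work_sync()
         * passes true, flush_work() passes false. A cancelled work
         * item cannot be waiting behind other items on its workqueue,
         * so the wq-level dependency is skipped in that case. */
        if (!from_cancel) {
          lock_map_acquire(&wq->lockdep_map);
          lock_map_release(&wq->lockdep_map);
        }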

      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • fork: don't copy inconsistent signal handler state to child · 0de1a998
      Jann Horn authored
      [ Upstream commit 06e62a46 ]
      
      Before this change, if a multithreaded process forks while one of its
      threads is changing a signal handler using sigaction(), the memcpy() in
      copy_sighand() can race with the struct assignment in do_sigaction().  It
      isn't clear whether this can cause corruption of the userspace signal
      handler pointer, but it definitely can cause inconsistency between
      different fields of struct sigaction.
      
      Take the appropriate spinlock to avoid this.
      
      I have tested that this patch prevents inconsistency between
      sa_sigaction and sa_flags, which was possible before this patch.
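
      A sketch of the shape of the fix in copy_sighand() (approximated,
      not the verbatim diff): take the parent's siglock around the copy
      so it cannot race with the struct assignment in do_sigaction().

        spin_lock_irq(&current->sighand->siglock);
        memcpy(sig->action, current->sighand->action,
               sizeof(sig->action)); /* now atomic vs. sigaction() */
        spin_unlock_irq(&current->sighand->siglock);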
      
      Link: http://lkml.kernel.org/r/20180702145108.73189-1-jannh@google.com
      Signed-off-by: Jann Horn <jannh@google.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>