1. 02 Mar, 2019 2 commits
    • net: sit: fix memory leak in sit_init_net() · 07f12b26
      Mao Wenan authored
      If register_netdev() fails to register sitn->fb_tunnel_dev, sit_init_net()
      jumps to err_reg_dev without freeing the netdev (sitn->fb_tunnel_dev).
      
      BUG: memory leak
      unreferenced object 0xffff888378daad00 (size 512):
        comm "syz-executor.1", pid 4006, jiffies 4295121142 (age 16.115s)
        hex dump (first 32 bytes):
          00 e6 ed c0 83 88 ff ff 00 00 00 00 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      backtrace:
          [<00000000d6dcb63e>] kvmalloc include/linux/mm.h:577 [inline]
          [<00000000d6dcb63e>] kvzalloc include/linux/mm.h:585 [inline]
          [<00000000d6dcb63e>] netif_alloc_netdev_queues net/core/dev.c:8380 [inline]
          [<00000000d6dcb63e>] alloc_netdev_mqs+0x600/0xcc0 net/core/dev.c:8970
          [<00000000867e172f>] sit_init_net+0x295/0xa40 net/ipv6/sit.c:1848
          [<00000000871019fa>] ops_init+0xad/0x3e0 net/core/net_namespace.c:129
          [<00000000319507f6>] setup_net+0x2ba/0x690 net/core/net_namespace.c:314
          [<0000000087db4f96>] copy_net_ns+0x1dc/0x330 net/core/net_namespace.c:437
          [<0000000057efc651>] create_new_namespaces+0x382/0x730 kernel/nsproxy.c:107
          [<00000000676f83de>] copy_namespaces+0x2ed/0x3d0 kernel/nsproxy.c:165
          [<0000000030b74bac>] copy_process.part.27+0x231e/0x6db0 kernel/fork.c:1919
          [<00000000fff78746>] copy_process kernel/fork.c:1713 [inline]
          [<00000000fff78746>] _do_fork+0x1bc/0xe90 kernel/fork.c:2224
          [<000000001c2e0d1c>] do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290
          [<00000000ec48bd44>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
          [<0000000039acff8a>] 0xffffffffffffffff
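
The error path described above can be sketched in userspace C. This is only an illustration of the goto-cleanup pattern, not the kernel patch: fake_dev, fake_alloc(), fake_register() and fake_free() are hypothetical stand-ins for the net device, its allocation, register_netdev() and the free step.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a net device; not a kernel structure. */
struct fake_dev { int registered; };

static int live_devs; /* allocations that have not been freed yet */

static struct fake_dev *fake_alloc(void)
{
    struct fake_dev *dev = calloc(1, sizeof(*dev));

    if (dev)
        live_devs++;
    return dev;
}

static void fake_free(struct fake_dev *dev)
{
    if (dev) {
        free(dev);
        live_devs--;
    }
}

/* Fails on request, standing in for a failed registration. */
static int fake_register(struct fake_dev *dev, int should_fail)
{
    if (should_fail)
        return -1;
    dev->registered = 1;
    return 0;
}

/* Returns 0 and stores the device in *out, or a negative value on failure. */
static int init_with_cleanup(int should_fail, struct fake_dev **out)
{
    struct fake_dev *dev = fake_alloc();
    int err;

    if (!dev)
        return -1;

    err = fake_register(dev, should_fail);
    if (err)
        goto err_reg_dev;

    *out = dev;
    return 0;

err_reg_dev:
    fake_free(dev); /* the step a leaky error path would skip */
    return err;
}
```

With the free in place, a failed registration leaves live_devs unchanged, so nothing is left for a leak detector to report.
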
      Signed-off-by: Mao Wenan <maowenan@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Add ICMPv6 support when parse route ipproto · 5e1a99ea
      Hangbin Liu authored
      For ip rules, we need to use 'ipproto ipv6-icmp' to match ICMPv6 headers.
      But for ip -6 route, currently we only support tcp, udp and icmp.
      
      Add ICMPv6 support so we can match ipv6-icmp rules for route lookup.
      
      v2: As David Ahern and Sabrina Dubroca suggested, add an argument to
      rtm_getroute_parse_ip_proto() to handle ICMP/ICMPv6 for the different families.
      Reported-by: Jianlin Shi <jishi@redhat.com>
      Fixes: eacb9384 ("ipv6: support sport, dport and ip_proto in RTM_GETROUTE")
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 28 Feb, 2019 4 commits
    • sctp: chunk.c: correct format string for size_t in printk · ac510505
      Matthias Maennich authored
      According to Documentation/core-api/printk-formats.rst, size_t should be
      printed with %zu, rather than %Zu.
      
      In addition, using %Zu triggers a warning on clang (-Wformat-extra-args):
      
      net/sctp/chunk.c:196:25: warning: data argument not used by format string [-Wformat-extra-args]
                                          __func__, asoc, max_data);
                                          ~~~~~~~~~~~~~~~~^~~~~~~~~
      ./include/linux/printk.h:440:49: note: expanded from macro 'pr_warn_ratelimited'
              printk_ratelimited(KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)
              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
      ./include/linux/printk.h:424:17: note: expanded from macro 'printk_ratelimited'
                      printk(fmt, ##__VA_ARGS__);                             \
                             ~~~    ^
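
For comparison, the same conversion exists in standard C99 printf-style functions. The sketch below is a userspace example, not the sctp code; format_size() and the "max_data=" label are names chosen here for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Formats a size_t with the standard %zu conversion.
 * Returns the number of characters snprintf() would have written.
 */
static int format_size(char *buf, size_t buflen, size_t value)
{
    return snprintf(buf, buflen, "max_data=%zu", value);
}
```
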
      
      Fixes: 5b5e0928 ("lib/vsprintf.c: remove %Z support")
      Link: https://github.com/ClangBuiltLinux/linux/issues/378
      Signed-off-by: Matthias Maennich <maennich@google.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: netem: fix skb length BUG_ON in __skb_to_sgvec · 5845f706
      Sheng Lan authored
      It can be reproduced by following steps:
      1. virtio_net NIC is configured with gso/tso on
      2. configure nginx as http server with an index file bigger than 1M bytes
      3. use tc netem to produce duplicate packets and delay:
         tc qdisc add dev eth0 root netem delay 100ms 10ms 30% duplicate 90%
      4. continually curl the nginx http server to get index file on client
      5. BUG_ON is seen quickly
      
      [10258690.371129] kernel BUG at net/core/skbuff.c:4028!
      [10258690.371748] invalid opcode: 0000 [#1] SMP PTI
      [10258690.372094] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G        W         5.0.0-rc6 #2
      [10258690.372094] RSP: 0018:ffffa05797b43da0 EFLAGS: 00010202
      [10258690.372094] RBP: 00000000000005ea R08: 0000000000000000 R09: 00000000000005ea
      [10258690.372094] R10: ffffa0579334d800 R11: 00000000000002c0 R12: 0000000000000002
      [10258690.372094] R13: 0000000000000000 R14: ffffa05793122900 R15: ffffa0578f7cb028
      [10258690.372094] FS:  0000000000000000(0000) GS:ffffa05797b40000(0000) knlGS:0000000000000000
      [10258690.372094] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [10258690.372094] CR2: 00007f1a6dc00868 CR3: 000000001000e000 CR4: 00000000000006e0
      [10258690.372094] Call Trace:
      [10258690.372094]  <IRQ>
      [10258690.372094]  skb_to_sgvec+0x11/0x40
      [10258690.372094]  start_xmit+0x38c/0x520 [virtio_net]
      [10258690.372094]  dev_hard_start_xmit+0x9b/0x200
      [10258690.372094]  sch_direct_xmit+0xff/0x260
      [10258690.372094]  __qdisc_run+0x15e/0x4e0
      [10258690.372094]  net_tx_action+0x137/0x210
      [10258690.372094]  __do_softirq+0xd6/0x2a9
      [10258690.372094]  irq_exit+0xde/0xf0
      [10258690.372094]  smp_apic_timer_interrupt+0x74/0x140
      [10258690.372094]  apic_timer_interrupt+0xf/0x20
      [10258690.372094]  </IRQ>
      
      In __skb_to_sgvec(), skb->len is not equal to the sum of the skb's
      linear data size and nonlinear data size, so the BUG_ON is triggered.
      This happens because the skb is cloned and part of its nonlinear data
      is split off.
      
      A duplicate packet is cloned in netem_enqueue() and may be delayed
      for some time in the qdisc. When the qdisc length reaches the limit and
      netem returns NET_XMIT_DROP, the skb is retransmitted later from the
      write queue. The skb is then fragmented by tso_fragment() when the size
      limit, which depends on cwnd and mss, decreases, and the skb's nonlinear
      data is split off. The length of the skb cloned by netem is not
      updated. When a virtio_net NIC is used and skb_to_sgvec() is invoked,
      the BUG_ON triggers.
      
      To fix it, netem returns NET_XMIT_SUCCESS to the upper stack
      when it clones a duplicate packet.
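
The length invariant behind the BUG_ON can be modelled in a few lines. This is a simplified sketch: struct fake_skb and fake_skb_len_consistent() are hypothetical and do not mirror the kernel's sk_buff layout.

```c
#include <stddef.h>

#define MAX_FRAGS 4

/* Hypothetical, simplified stand-in for an skb. */
struct fake_skb {
    size_t len;                 /* total length the buffer claims */
    size_t linear_len;          /* bytes in the linear area */
    size_t frag_len[MAX_FRAGS]; /* bytes in each nonlinear fragment */
    size_t nr_frags;            /* number of fragments in use */
};

/* Returns 1 when len equals linear plus nonlinear bytes, 0 otherwise. */
static int fake_skb_len_consistent(const struct fake_skb *skb)
{
    size_t total = skb->linear_len;

    for (size_t i = 0; i < skb->nr_frags && i < MAX_FRAGS; i++)
        total += skb->frag_len[i];
    return total == skb->len;
}
```

If a fragment is split off while len keeps its old value, the check fails, which is the condition the commit message describes for the cloned skb.
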
      
      Fixes: 35d889d1 ("sch_netem: fix skb leak in netem_enqueue()")
      Signed-off-by: Sheng Lan <lansheng@huawei.com>
      Reported-by: Qin Ji <jiqin.ji@huawei.com>
      Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netlabel: fix out-of-bounds memory accesses · 5578de48
      Paul Moore authored
      There are two array out-of-bounds memory accesses, one in
      cipso_v4_map_lvl_valid(), the other in netlbl_bitmap_walk().  Both
      errors are embarrassingly simple, and the fixes are straightforward.
      
      As an FYI for anyone backporting this patch to kernels prior to v4.8,
      you'll want to apply the netlbl_bitmap_walk() patch to
      cipso_v4_bitmap_walk() as netlbl_bitmap_walk() doesn't exist before
      Linux v4.8.
      Reported-by: Jann Horn <jannh@google.com>
      Fixes: 446fda4f ("[NetLabel]: CIPSOv4 engine")
      Fixes: 3faa8f98 ("netlabel: Move bitmap manipulation functions to the NetLabel core.")
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Pass original device to ip_rcv_finish_core · a1fd1ad2
      David Ahern authored
      ip_route_input_rcu expects the original ingress device (e.g., for
      proper multicast handling). The skb->dev can be changed by l3mdev_ip_rcv,
      so dev needs to be saved prior to calling it. This was the behavior prior
      to the listify changes.
      
      Fixes: 5fa12739 ("net: ipv4: listify ip_rcv_finish")
      Cc: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 27 Feb, 2019 1 commit
    • net: nfc: Fix NULL dereference on nfc_llcp_build_tlv fails · 58bdd544
      YueHaibing authored
      KASAN reports this:
      
      BUG: KASAN: null-ptr-deref in nfc_llcp_build_gb+0x37f/0x540 [nfc]
      Read of size 3 at addr 0000000000000000 by task syz-executor.0/5401
      
      CPU: 0 PID: 5401 Comm: syz-executor.0 Not tainted 5.0.0-rc7+ #45
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0xfa/0x1ce lib/dump_stack.c:113
       kasan_report+0x171/0x18d mm/kasan/report.c:321
       memcpy+0x1f/0x50 mm/kasan/common.c:130
       nfc_llcp_build_gb+0x37f/0x540 [nfc]
       nfc_llcp_register_device+0x6eb/0xb50 [nfc]
       nfc_register_device+0x50/0x1d0 [nfc]
       nfcsim_device_new+0x394/0x67d [nfcsim]
       ? 0xffffffffc1080000
       nfcsim_init+0x6b/0x1000 [nfcsim]
       do_one_initcall+0xfa/0x5ca init/main.c:887
       do_init_module+0x204/0x5f6 kernel/module.c:3460
       load_module+0x66b2/0x8570 kernel/module.c:3808
       __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902
       do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x462e99
      Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f9cb79dcc58 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
      RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000462e99
      RDX: 0000000000000000 RSI: 0000000020000280 RDI: 0000000000000003
      RBP: 00007f9cb79dcc70 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f9cb79dd6bc
      R13: 00000000004bcefb R14: 00000000006f7030 R15: 0000000000000004
      
      nfc_llcp_build_tlv() returns NULL on failure. Callers must check the
      return value; otherwise they trigger a NULL dereference.
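
A minimal sketch of the caller-side check, assuming a builder that returns NULL on failure. build_tlv() and append_tlv() are hypothetical userspace helpers written for this example; they are not the NFC LLCP functions or their signatures.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical TLV builder: returns a heap buffer laid out as
 * [type, length, value...] or NULL on failure.
 */
static uint8_t *build_tlv(uint8_t type, const uint8_t *value,
                          uint8_t value_len, size_t *tlv_len)
{
    uint8_t *tlv;

    if (!value || value_len == 0)
        return NULL;

    tlv = malloc((size_t)value_len + 2);
    if (!tlv)
        return NULL;

    tlv[0] = type;
    tlv[1] = value_len;
    memcpy(tlv + 2, value, value_len);
    *tlv_len = (size_t)value_len + 2;
    return tlv;
}

/* Appends one TLV to dst; returns 0 on success, -1 on any failure. */
static int append_tlv(uint8_t *dst, size_t dst_cap, size_t *used,
                      uint8_t type, const uint8_t *value, uint8_t value_len)
{
    size_t tlv_len = 0;
    uint8_t *tlv = build_tlv(type, value, value_len, &tlv_len);

    if (!tlv)
        return -1; /* check before the memcpy() that would dereference it */

    if (*used + tlv_len > dst_cap) {
        free(tlv);
        return -1;
    }

    memcpy(dst + *used, tlv, tlv_len);
    *used += tlv_len;
    free(tlv);
    return 0;
}
```
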
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Fixes: eda21f16 ("NFC: Set MIU and RW values from CONNECT and CC LLCP frames")
      Fixes: d646960f ("NFC: Initial LLCP support")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 26 Feb, 2019 4 commits
    • tipc: fix race condition causing hung sendto · bfd07f3d
      Tung Nguyen authored
      When sending multicast messages via a blocking socket,
      if the sending link is congested (tsk->cong_link_cnt is set to 1),
      the sending thread is put to sleep. However,
      tipc_sk_filter_rcv() is called under the socket spin lock but
      tipc_wait_for_cond() is not. So there is no guarantee that
      the setting of tsk->cong_link_cnt to 0 in tipc_sk_proto_rcv() on
      CPU-1 will be perceived by CPU-0. If that is the case, the sending
      thread on CPU-0, after being woken up, will continue to see
      tsk->cong_link_cnt as 1 and go back to sleep.
      The sending thread will then sleep forever.
      
      CPU-0                                | CPU-1
      tipc_wait_for_cond()                 |
      {                                    |
       // condition_ = !tsk->cong_link_cnt |
       while ((rc_ = !(condition_))) {     |
        ...                                |
        release_sock(sk_);                 |
        wait_woken();                      |
                                           | if (!sock_owned_by_user(sk))
                                           |  tipc_sk_filter_rcv()
                                           |  {
                                           |   ...
                                           |   tipc_sk_proto_rcv()
                                           |   {
                                           |    ...
                                           |    tsk->cong_link_cnt--;
                                           |    ...
                                           |    sk->sk_write_space(sk);
                                           |    ...
                                           |   }
                                           |   ...
                                           |  }
        sched_annotate_sleep();            |
        lock_sock(sk_);                    |
        remove_wait_queue();               |
       }                                   |
      }                                    |
      
      This commit fixes it by adding memory barriers to tipc_sk_proto_rcv()
      and tipc_wait_for_cond().
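
As a rough userspace analogy only, C11 atomics can express a paired ordering between the side that clears a congestion counter and the side that re-checks it. The kernel uses its own barrier primitives, and the commit message does not say which ones; the names below are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical counter standing in for a per-socket congestion count. */
static atomic_int cong_link_cnt = 1;

/* Writer side: the decrement is a release operation. */
static void link_became_uncongested(void)
{
    atomic_fetch_sub_explicit(&cong_link_cnt, 1, memory_order_release);
}

/* Waiter side: re-reads the counter with acquire ordering after waking. */
static int link_is_congested(void)
{
    return atomic_load_explicit(&cong_link_cnt, memory_order_acquire) != 0;
}
```

The point of the pairing is that a waiter which observes the decremented value also observes everything the writer did before it.
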
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mpls: Return error for RTA_GATEWAY attribute · be48220e
      David Ahern authored
      MPLS does not support nexthops with an MPLS address family.
      Specifically, it does not handle RTA_GATEWAY attribute. Make it
      clear by returning an error.
      
      Fixes: 03c05665 ("mpls: Netlink commands to add, remove, and dump routes")
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Return error for RTA_VIA attribute · e3818541
      David Ahern authored
      IPv6 currently does not support nexthops outside of the AF_INET6 family.
      Specifically, it does not handle RTA_VIA attribute. If it is passed
      in a route add request, the actual route added only uses the device
      which is clearly not what the user intended:
      
        $ ip -6 ro add 2001:db8:2::/64 via inet 172.16.1.1 dev eth0
        $ ip ro ls
        ...
        2001:db8:2::/64 dev eth0 metric 1024 pref medium
      
      Catch this and fail the route add:
        $ ip -6 ro add 2001:db8:2::/64 via inet 172.16.1.1 dev eth0
        Error: IPv6 does not support RTA_VIA attribute.
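
The "reject what you cannot handle" behaviour can be sketched as a small validator. The enum values, validate_route_attrs() and the choice of -EINVAL are assumptions made for this illustration; only the error text is taken from the output above.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical attribute kinds; not the kernel's RTA_* constants. */
enum fake_attr {
    FAKE_ATTR_DST,
    FAKE_ATTR_GATEWAY,
    FAKE_ATTR_VIA,
    FAKE_ATTR_OIF
};

/*
 * Fails the whole request when an unsupported attribute is present,
 * instead of silently ignoring it. The error code is an assumption.
 */
static int validate_route_attrs(const enum fake_attr *attrs, size_t n,
                                const char **errmsg)
{
    for (size_t i = 0; i < n; i++) {
        if (attrs[i] == FAKE_ATTR_VIA) {
            *errmsg = "IPv6 does not support RTA_VIA attribute";
            return -EINVAL;
        }
    }
    *errmsg = NULL;
    return 0;
}
```
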
      
      Fixes: 03c05665 ("mpls: Netlink commands to add, remove, and dump routes")
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Return error for RTA_VIA attribute · b6e9e5df
      David Ahern authored
      IPv4 currently does not support nexthops outside of the AF_INET family.
      Specifically, it does not handle RTA_VIA attribute. If it is passed
      in a route add request, the actual route added only uses the device
      which is clearly not what the user intended:
      
        $ ip ro add 172.16.1.0/24 via inet6 2001:db8:1::1 dev eth0
        $ ip ro ls
        ...
        172.16.1.0/24 dev eth0
      
      Catch this and fail the route add:
        $ ip ro add 172.16.1.0/24 via inet6 2001:db8:1::1 dev eth0
        Error: IPv4 does not support RTA_VIA attribute.
      
      Fixes: 03c05665 ("mpls: Netlink commands to add, remove, and dump routes")
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 25 Feb, 2019 7 commits
    • net: avoid use IPCB in cipso_v4_error · 3da1ed7a
      Nazarov Sergey authored
      Extract IP options in cipso_v4_error and use __icmp_send.
      Signed-off-by: Sergey Nazarov <s-nazarov@yandex.ru>
      Acked-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Add __icmp_send helper. · 9ef6b42a
      Nazarov Sergey authored
      Add an __icmp_send() function that takes an ip_options struct parameter.
      Signed-off-by: Sergey Nazarov <s-nazarov@yandex.ru>
      Reviewed-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: socket: set sock->sk to NULL after calling proto_ops::release() · ff7b11aa
      Eric Biggers authored
      Commit 9060cb71 ("net: crypto set sk to NULL when af_alg_release.")
      fixed a use-after-free in sockfs_setattr() when an AF_ALG socket is
      closed concurrently with fchownat().  However, it ignored that many
      other proto_ops::release() methods don't set sock->sk to NULL and
      therefore allow the same use-after-free:
      
          - base_sock_release
          - bnep_sock_release
          - cmtp_sock_release
          - data_sock_release
          - dn_release
          - hci_sock_release
          - hidp_sock_release
          - iucv_sock_release
          - l2cap_sock_release
          - llcp_sock_release
          - llc_ui_release
          - rawsock_release
          - rfcomm_sock_release
          - sco_sock_release
          - svc_release
          - vcc_release
          - x25_release
      
      Rather than fixing all these and relying on every socket type to get
      this right forever, just make __sock_release() set sock->sk to NULL
      itself after calling proto_ops::release().
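
The design choice, clearing the pointer once in the common wrapper rather than in every release hook, can be sketched as follows. All types and functions here are hypothetical userspace stand-ins, and the sketch is single-threaded: it shows only the pointer clearing, not the concurrent close()/fchownat() race.

```c
#include <stdlib.h>

struct fake_sk { int unused; };
struct fake_sock;

struct fake_ops {
    void (*release)(struct fake_sock *sock);
};

struct fake_sock {
    struct fake_sk *sk;
    const struct fake_ops *ops;
};

/* A per-protocol hook that frees sk but forgets to clear sock->sk. */
static void forgetful_release(struct fake_sock *sock)
{
    free(sock->sk);
}

/* Common wrapper: clears sock->sk itself after the hook has run. */
static void fake_sock_release(struct fake_sock *sock)
{
    if (sock->ops && sock->ops->release)
        sock->ops->release(sock);
    sock->sk = NULL; /* no stale pointer survives the release */
}

/* A later user of the socket can now detect that sk is gone. */
static int fake_setattr(const struct fake_sock *sock)
{
    return sock->sk ? 0 : -1;
}
```
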
      
      Reproducer that produces the KASAN splat when any of these socket types
      are configured into the kernel:
      
          #include <pthread.h>
          #include <stdlib.h>
          #include <sys/socket.h>
          #include <unistd.h>
      
          pthread_t t;
          volatile int fd;
      
          void *close_thread(void *arg)
          {
              for (;;) {
                  usleep(rand() % 100);
                  close(fd);
              }
          }
      
          int main()
          {
              pthread_create(&t, NULL, close_thread, NULL);
              for (;;) {
                  fd = socket(rand() % 50, rand() % 11, 0);
                  fchownat(fd, "", 1000, 1000, 0x1000);
                  close(fd);
              }
          }
      
      Fixes: 86741ec2 ("net: core: Add a UID field to struct sock.")
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: act_tunnel_key: fix NULL pointer dereference during init · a3df633a
      Vlad Buslov authored
      Metadata pointer is only initialized for action TCA_TUNNEL_KEY_ACT_SET, but
      it is unconditionally dereferenced in tunnel_key_init() error handler.
      Verify that metadata pointer is not NULL before dereferencing it in
      tunnel_key_init error handling code.
      
      Fixes: ee28bb56 ("net/sched: fix memory leak in act_tunnel_key_init()")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: fix a leaked reference by adding missing of_node_put · 9919a363
      Wen Yang authored
      The call to of_parse_phandle returns a node pointer with refcount
      incremented thus it must be explicitly decremented after the last
      usage.
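
The get/put balance can be illustrated with a toy reference counter. fake_node, fake_parse_handle() and the other helpers are hypothetical; they only mimic the rule that a lookup returning a counted reference must be matched by a put after the last use.

```c
#include <stddef.h>

/* Hypothetical reference-counted node; not the kernel's device_node. */
struct fake_node { int refcount; };

static struct fake_node *fake_node_get(struct fake_node *n)
{
    if (n)
        n->refcount++;
    return n;
}

static void fake_node_put(struct fake_node *n)
{
    if (n)
        n->refcount--;
}

/* Lookup that hands back the node with its refcount incremented. */
static struct fake_node *fake_parse_handle(struct fake_node *target)
{
    return fake_node_get(target);
}

/* Uses the node, then drops the reference taken by the lookup. */
static int use_node(struct fake_node *target)
{
    struct fake_node *n = fake_parse_handle(target);
    int ret;

    if (!n)
        return -1;

    ret = n->refcount > 0 ? 0 : -1; /* stand-in for real work on the node */
    fake_node_put(n);               /* matching put after the last use */
    return ret;
}
```
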
      
      Detected by coccinelle with the following warnings:
      ./net/dsa/port.c:294:1-7: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 284, but without a corresponding object release within this function.
      ./net/dsa/dsa2.c:627:3-9: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 618, but without a corresponding object release within this function.
      ./net/dsa/dsa2.c:630:3-9: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 618, but without a corresponding object release within this function.
      ./net/dsa/dsa2.c:636:3-9: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 618, but without a corresponding object release within this function.
      ./net/dsa/dsa2.c:639:1-7: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 618, but without a corresponding object release within this function.
      Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
      Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Vivien Didelot <vivien.didelot@gmail.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Vivien Didelot <vivien.didelot@gmail.com>
      Cc: netdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/sched: act_skbedit: fix refcount leak when replace fails · 6191da98
      Davide Caratti authored
      When act_skbedit was converted to use RCU in the data plane, we added an
      error path, but we forgot to drop the action refcount in case of failure
      during a 'replace' operation:
      
       # tc actions add action skbedit ptype otherhost pass index 100
       # tc action show action skbedit
       total acts 1
      
               action order 0: skbedit  ptype otherhost pass
                index 100 ref 1 bind 0
       # tc actions replace action skbedit ptype otherhost drop index 100
       RTNETLINK answers: Cannot allocate memory
       We have an error talking to the kernel
       # tc action show action skbedit
       total acts 1
      
               action order 0: skbedit  ptype otherhost pass
                index 100 ref 2 bind 0
      
      Ensure we call tcf_idr_release(), in case 'params_new' allocation failed,
      also when the action is being replaced.
      
      Fixes: c749cdda ("net/sched: act_skbedit: don't use spinlock in the data path")
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/sched: act_ipt: fix refcount leak when replace fails · 8f67c90e
      Davide Caratti authored
      After commit 4e8ddd7f ("net: sched: don't release reference on action
      overwrite"), the error path of all actions was converted to drop refcount
      also when the action was being overwritten. But we forgot act_ipt_init(),
      in case allocation of 'tname' was not successful:
      
       # tc action add action xt -j LOG --log-prefix hello index 100
       tablename: mangle hook: NF_IP_POST_ROUTING
               target:  LOG level warning prefix "hello" index 100
       # tc action show action xt
       total acts 1
      
               action order 0: tablename: mangle  hook: NF_IP_POST_ROUTING
               target  LOG level warning prefix "hello"
               index 100 ref 1 bind 0
       # tc action replace action xt -j LOG --log-prefix world index 100
       tablename: mangle hook: NF_IP_POST_ROUTING
               target:  LOG level warning prefix "world" index 100
       RTNETLINK answers: Cannot allocate memory
       We have an error talking to the kernel
       # tc action show action xt
       total acts 1
      
               action order 0: tablename: mangle  hook: NF_IP_POST_ROUTING
               target  LOG level warning prefix "hello"
               index 100 ref 2 bind 0
      
      Ensure we call tcf_idr_release(), in case 'tname' allocation failed, also
      when the action is being replaced.
      
      Fixes: 4e8ddd7f ("net: sched: don't release reference on action overwrite")
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 24 Feb, 2019 3 commits
    • tcp: repaired skbs must init their tso_segs · bf50b606
      Eric Dumazet authored
      syzbot reported a WARN_ON(!tcp_skb_pcount(skb))
      in tcp_send_loss_probe() [1].
      
      This was caused by skbs sent under TCP_REPAIR that inadvertently
      missed a call to tcp_init_tso_segs().
      
      [1]
      WARNING: CPU: 1 PID: 0 at net/ipv4/tcp_output.c:2534 tcp_send_loss_probe+0x771/0x8a0 net/ipv4/tcp_output.c:2534
      Kernel panic - not syncing: panic_on_warn set ...
      CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.0.0-rc7+ #77
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       panic+0x2cb/0x65c kernel/panic.c:214
       __warn.cold+0x20/0x45 kernel/panic.c:571
       report_bug+0x263/0x2b0 lib/bug.c:186
       fixup_bug arch/x86/kernel/traps.c:178 [inline]
       fixup_bug arch/x86/kernel/traps.c:173 [inline]
       do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
       do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:290
       invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973
      RIP: 0010:tcp_send_loss_probe+0x771/0x8a0 net/ipv4/tcp_output.c:2534
      Code: 88 fc ff ff 4c 89 ef e8 ed 75 c8 fb e9 c8 fc ff ff e8 43 76 c8 fb e9 63 fd ff ff e8 d9 75 c8 fb e9 94 f9 ff ff e8 bf 03 91 fb <0f> 0b e9 7d fa ff ff e8 b3 03 91 fb 0f b6 1d 37 43 7a 03 31 ff 89
      RSP: 0018:ffff8880ae907c60 EFLAGS: 00010206
      RAX: ffff8880a989c340 RBX: 0000000000000000 RCX: ffffffff85dedbdb
      RDX: 0000000000000100 RSI: ffffffff85dee0b1 RDI: 0000000000000005
      RBP: ffff8880ae907c90 R08: ffff8880a989c340 R09: ffffed10147d1ae1
      R10: ffffed10147d1ae0 R11: ffff8880a3e8d703 R12: ffff888091b90040
      R13: ffff8880a3e8d540 R14: 0000000000008000 R15: ffff888091b90860
       tcp_write_timer_handler+0x5c0/0x8a0 net/ipv4/tcp_timer.c:583
       tcp_write_timer+0x10e/0x1d0 net/ipv4/tcp_timer.c:607
       call_timer_fn+0x190/0x720 kernel/time/timer.c:1325
       expire_timers kernel/time/timer.c:1362 [inline]
       __run_timers kernel/time/timer.c:1681 [inline]
       __run_timers kernel/time/timer.c:1649 [inline]
       run_timer_softirq+0x652/0x1700 kernel/time/timer.c:1694
       __do_softirq+0x266/0x95a kernel/softirq.c:292
       invoke_softirq kernel/softirq.c:373 [inline]
       irq_exit+0x180/0x1d0 kernel/softirq.c:413
       exiting_irq arch/x86/include/asm/apic.h:536 [inline]
       smp_apic_timer_interrupt+0x14a/0x570 arch/x86/kernel/apic/apic.c:1062
       apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:807
       </IRQ>
      RIP: 0010:native_safe_halt+0x2/0x10 arch/x86/include/asm/irqflags.h:58
      Code: ff ff ff 48 89 c7 48 89 45 d8 e8 59 0c a1 fa 48 8b 45 d8 e9 ce fe ff ff 48 89 df e8 48 0c a1 fa eb 82 90 90 90 90 90 90 fb f4 <c3> 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90
      RSP: 0018:ffff8880a98afd78 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
      RAX: 1ffffffff1125061 RBX: ffff8880a989c340 RCX: 0000000000000000
      RDX: dffffc0000000000 RSI: 0000000000000001 RDI: ffff8880a989cbbc
      RBP: ffff8880a98afda8 R08: ffff8880a989c340 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
      R13: ffffffff889282f8 R14: 0000000000000001 R15: 0000000000000000
       arch_cpu_idle+0x10/0x20 arch/x86/kernel/process.c:555
       default_idle_call+0x36/0x90 kernel/sched/idle.c:93
       cpuidle_idle_call kernel/sched/idle.c:153 [inline]
       do_idle+0x386/0x570 kernel/sched/idle.c:262
       cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:353
       start_secondary+0x404/0x5c0 arch/x86/kernel/smpboot.c:271
       secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243
      Kernel Offset: disabled
      Rebooting in 86400 seconds..
      
      Fixes: 79861919 ("tcp: fix TCP_REPAIR xmit queue setup")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Andrey Vagin <avagin@openvz.org>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/x25: fix a race in x25_bind() · 797a22bd
      Eric Dumazet authored
      syzbot was able to trigger another soft lockup [1]
      
      I first thought it was the O(N^2) issue I mentioned in my
      prior fix (f657d22ee1f "net/x25: do not hold the cpu
      too long in x25_new_lci()"), but I eventually found
      that x25_bind() was not checking SOCK_ZAPPED state under
      socket lock protection.
      
      This means that multiple threads can end up calling
      x25_insert_socket() for the same socket, and corrupt x25_list.
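
A minimal sketch of checking and clearing a state flag under the same lock, so that only one caller can perform the insertion. It uses a pthread mutex in userspace; struct fake_x25_sock and fake_bind() are hypothetical and do not reproduce the x25 code.

```c
#include <pthread.h>

/* Hypothetical socket state; not the kernel's struct sock. */
struct fake_x25_sock {
    pthread_mutex_t lock;
    int zapped;       /* 1 until the socket has been bound */
    int insert_count; /* how many times it was added to the list */
};

/* Returns 0 for the first bind, -1 if the socket is already bound. */
static int fake_bind(struct fake_x25_sock *sk)
{
    int rc = 0;

    pthread_mutex_lock(&sk->lock);
    if (sk->zapped) {
        sk->insert_count++; /* stand-in for the list insertion */
        sk->zapped = 0;
    } else {
        rc = -1;
    }
    pthread_mutex_unlock(&sk->lock);
    return rc;
}
```

Because the test and the update happen inside one critical section, a second caller always sees zapped as 0 and cannot insert the socket again.
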
      
      [1]
      watchdog: BUG: soft lockup - CPU#0 stuck for 123s! [syz-executor.2:10492]
      Modules linked in:
      irq event stamp: 27515
      hardirqs last  enabled at (27514): [<ffffffff81006673>] trace_hardirqs_on_thunk+0x1a/0x1c
      hardirqs last disabled at (27515): [<ffffffff8100668f>] trace_hardirqs_off_thunk+0x1a/0x1c
      softirqs last  enabled at (32): [<ffffffff8632ee73>] x25_get_neigh+0xa3/0xd0 net/x25/x25_link.c:336
      softirqs last disabled at (34): [<ffffffff86324bc3>] x25_find_socket+0x23/0x140 net/x25/af_x25.c:341
      CPU: 0 PID: 10492 Comm: syz-executor.2 Not tainted 5.0.0-rc7+ #88
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:__sanitizer_cov_trace_pc+0x4/0x50 kernel/kcov.c:97
      Code: f4 ff ff ff e8 11 9f ea ff 48 c7 05 12 fb e5 08 00 00 00 00 e9 c8 e9 ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 <48> 8b 75 08 65 48 8b 04 25 40 ee 01 00 65 8b 15 38 0c 92 7e 81 e2
      RSP: 0018:ffff88806e94fc48 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
      RAX: 1ffff1100d84dac5 RBX: 0000000000000001 RCX: ffffc90006197000
      RDX: 0000000000040000 RSI: ffffffff86324bf3 RDI: ffff88806c26d628
      RBP: ffff88806e94fc48 R08: ffff88806c1c6500 R09: fffffbfff1282561
      R10: fffffbfff1282560 R11: ffffffff89412b03 R12: ffff88806c26d628
      R13: ffff888090455200 R14: dffffc0000000000 R15: 0000000000000000
      FS:  00007f3a107e4700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f3a107e3db8 CR3: 00000000a5544000 CR4: 00000000001406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       __x25_find_socket net/x25/af_x25.c:327 [inline]
       x25_find_socket+0x7d/0x140 net/x25/af_x25.c:342
       x25_new_lci net/x25/af_x25.c:355 [inline]
       x25_connect+0x380/0xde0 net/x25/af_x25.c:784
       __sys_connect+0x266/0x330 net/socket.c:1662
       __do_sys_connect net/socket.c:1673 [inline]
       __se_sys_connect net/socket.c:1670 [inline]
       __x64_sys_connect+0x73/0xb0 net/socket.c:1670
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x457e29
      Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f3a107e3c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457e29
      RDX: 0000000000000012 RSI: 0000000020000200 RDI: 0000000000000005
      RBP: 000000000073c040 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f3a107e46d4
      R13: 00000000004be362 R14: 00000000004ceb98 R15: 00000000ffffffff
      Sending NMI from CPU 0 to CPUs 1:
      NMI backtrace for cpu 1
      CPU: 1 PID: 10493 Comm: syz-executor.3 Not tainted 5.0.0-rc7+ #88
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:__read_once_size include/linux/compiler.h:193 [inline]
      RIP: 0010:queued_write_lock_slowpath+0x143/0x290 kernel/locking/qrwlock.c:86
      Code: 4c 8d 2c 01 41 83 c7 03 41 0f b6 45 00 41 38 c7 7c 08 84 c0 0f 85 0c 01 00 00 8b 03 3d 00 01 00 00 74 1a f3 90 41 0f b6 55 00 <41> 38 d7 7c eb 84 d2 74 e7 48 89 df e8 cc aa 4e 00 eb dd be 04 00
      RSP: 0018:ffff888085c47bd8 EFLAGS: 00000206
      RAX: 0000000000000300 RBX: ffffffff89412b00 RCX: 1ffffffff1282560
      RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff89412b00
      RBP: ffff888085c47c70 R08: 1ffffffff1282560 R09: fffffbfff1282561
      R10: fffffbfff1282560 R11: ffffffff89412b03 R12: 00000000000000ff
      R13: fffffbfff1282560 R14: 1ffff11010b88f7d R15: 0000000000000003
      FS:  00007fdd04086700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fdd04064db8 CR3: 0000000090be0000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       queued_write_lock include/asm-generic/qrwlock.h:104 [inline]
       do_raw_write_lock+0x1d6/0x290 kernel/locking/spinlock_debug.c:203
       __raw_write_lock_bh include/linux/rwlock_api_smp.h:204 [inline]
       _raw_write_lock_bh+0x3b/0x50 kernel/locking/spinlock.c:312
       x25_insert_socket+0x21/0xe0 net/x25/af_x25.c:267
       x25_bind+0x273/0x340 net/x25/af_x25.c:703
       __sys_bind+0x23f/0x290 net/socket.c:1481
       __do_sys_bind net/socket.c:1492 [inline]
       __se_sys_bind net/socket.c:1490 [inline]
       __x64_sys_bind+0x73/0xb0 net/socket.c:1490
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x457e29
      
      Fixes: 90c27297 ("X.25 remove bkl in bind")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: andrew hendry <andrew.hendry@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Revert "bridge: do not add port to router list when receives query with source 0.0.0.0" · 278e2148
      Hangbin Liu authored
      This reverts commit 5a2de63f ("bridge: do not add port to router list
      when receives query with source 0.0.0.0") and commit 0fe5119e ("net:
      bridge: remove ipv6 zero address check in mcast queries").
      
      The reason is that RFC 4541 is not a standard but only a recommendation.
      Currently we elect 0.0.0.0 as Querier if there is no IP address configured
      on the bridge. If we do not add the port which receives a query with source
      0.0.0.0 to the router list, the IGMP reports cannot be forwarded
      to the Querier, and IGMP data cannot be forwarded to its destination either.
      
      As Nikolay suggested, revert this change first and add a boolopt API
      to disable non-zero election in the future if needed.
      Reported-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Reported-by: Sebastian Gottschall <s.gottschall@newmedia-net.de>
      Fixes: 5a2de63f ("bridge: do not add port to router list when receives query with source 0.0.0.0")
      Fixes: 0fe5119e ("net: bridge: remove ipv6 zero address check in mcast queries")
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 23 Feb, 2019 4 commits
  8. 22 Feb, 2019 10 commits
    • net: Set rtm_table to RT_TABLE_COMPAT for ipv6 for tables > 255 · 97f0082a
      Kalash Nainwal authored
      Set rtm_table to RT_TABLE_COMPAT for ipv6 for tables > 255 to
      keep legacy software happy. This is similar to what was done for
      ipv4 in commit 709772e6 ("net: Fix routing tables with
      id > 255 for legacy software").
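
The mapping described above can be sketched as a small helper. This is an illustration, not the patch itself: `toy_rtm_table()` is a hypothetical name, and the constant mirrors the value of RT_TABLE_COMPAT (252) from the Linux UAPI header `<linux/rtnetlink.h>`. The sketch assumes, as in the ipv4 case referenced above, that the full 32-bit table id is still reported separately, so only the 8-bit rtm_table field needs a placeholder.

```c
#include <stdint.h>

/* Mirrors RT_TABLE_COMPAT from <linux/rtnetlink.h>. */
#define TOY_RT_TABLE_COMPAT 252

/* rtm_table is an 8-bit field, so ids above 255 cannot be stored in it.
 * Report the compat placeholder instead of a truncated id. */
uint8_t toy_rtm_table(uint32_t table_id)
{
	return table_id < 256 ? (uint8_t)table_id : TOY_RT_TABLE_COMPAT;
}
```

For example, table 254 is reported unchanged, while table 1000 is reported as 252 rather than as the truncated value 232.
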
      Signed-off-by: Kalash Nainwal <kalash@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: socket: add check for negative optlen in compat setsockopt · 52baf987
      Jann Horn authored
      __sys_setsockopt() already checks for `optlen < 0`. Add an equivalent check
      to the compat path for robustness. This has to be `> INT_MAX` instead of
      `< 0` because the signedness of `optlen` is different here.
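
The signedness point can be shown with two tiny checks. This is a sketch with hypothetical function names, not the kernel code: when the length arrives as `unsigned int`, a negative value passed by the caller shows up as a value above INT_MAX, so that is what has to be rejected.

```c
#include <errno.h>
#include <limits.h>

/* Native-path style check: optlen is a signed int. */
int toy_check_optlen_signed(int optlen)
{
	return optlen < 0 ? -EINVAL : 0;
}

/* Compat-path style check: optlen is unsigned, so "negative" inputs
 * appear as values larger than INT_MAX. */
int toy_check_optlen_unsigned(unsigned int optlen)
{
	return optlen > (unsigned int)INT_MAX ? -EINVAL : 0;
}
```

Both checks reject the same bit patterns: `-1` as a signed value and `(unsigned int)-1` as an unsigned one.
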
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: route: purge exception on removal · f5b51fe8
      Paolo Abeni authored
      When a netdevice is unregistered, we flush the relevant exception
      via rt6_sync_down_dev() -> fib6_ifdown() -> fib6_del() -> fib6_del_route().
      
      Finally, we end up calling rt6_remove_exception(), where we release
      the relevant dst, while we keep the references to the related fib6_info and
      dev. Such references should be released later, when the dst is
      destroyed.
      
      There are a number of caches that can keep the exception around for an
      unlimited amount of time - namely dst_cache, possibly even socket cache.
      As a result device unregistration may hang, as demonstrated by this script:
      
      ip netns add cl
      ip netns add rt
      ip netns add srv
      ip netns exec rt sysctl -w net.ipv6.conf.all.forwarding=1
      
      ip link add name cl_veth type veth peer name cl_rt_veth
      ip link set dev cl_veth netns cl
      ip -n cl link set dev cl_veth up
      ip -n cl addr add dev cl_veth 2001::2/64
      ip -n cl route add default via 2001::1
      
      ip -n cl link add tunv6 type ip6tnl mode ip6ip6 local 2001::2 remote 2002::1 hoplimit 64 dev cl_veth
      ip -n cl link set tunv6 up
      ip -n cl addr add 2013::2/64 dev tunv6
      
      ip link set dev cl_rt_veth netns rt
      ip -n rt link set dev cl_rt_veth up
      ip -n rt addr add dev cl_rt_veth 2001::1/64
      
      ip link add name rt_srv_veth type veth peer name srv_veth
      ip link set dev srv_veth netns srv
      ip -n srv link set dev srv_veth up
      ip -n srv addr add dev srv_veth 2002::1/64
      ip -n srv route add default via 2002::2
      
      ip -n srv link add tunv6 type ip6tnl mode ip6ip6 local 2002::1 remote 2001::2 hoplimit 64 dev srv_veth
      ip -n srv link set tunv6 up
      ip -n srv addr add 2013::1/64 dev tunv6
      
      ip link set dev rt_srv_veth netns rt
      ip -n rt link set dev rt_srv_veth up
      ip -n rt addr add dev rt_srv_veth 2002::2/64
      
      ip netns exec srv netserver & sleep 0.1
      ip netns exec cl ping6 -c 4 2013::1
      ip netns exec cl netperf -H 2013::1 -t TCP_STREAM -l 3 & sleep 1
      ip -n rt link set dev rt_srv_veth mtu 1400
      wait %2
      
      ip -n cl link del cl_veth
      
      This commit addresses the issue by purging all the references held by the
      exception at removal time, as we currently do for e.g. ipv6 pcpu dst entries.
      
      v1 -> v2:
       - re-order the code to avoid accessing dst and net after dst_dev_put()
      
      Fixes: 93531c67 ("net/ipv6: separate handling of FIB entries from dst based routes")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ip6_gre: fix possible NULL pointer dereference in ip6erspan_set_version · efcc9bca
      Lorenzo Bianconi authored
      Fix a possible NULL pointer dereference in ip6erspan_set_version by
      checking the nlattr data pointer.
      
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] PREEMPT SMP KASAN
      CPU: 1 PID: 7549 Comm: syz-executor432 Not tainted 5.0.0-rc6-next-20190218 #37
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:ip6erspan_set_version+0x5c/0x350 net/ipv6/ip6_gre.c:1726
      Code: 07 38 d0 7f 08 84 c0 0f 85 9f 02 00 00 49 8d bc 24 b0 00 00 00 c6 43 54 01 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 9a 02 00 00 4d 8b ac 24 b0 00 00 00 4d 85 ed 0f
      RSP: 0018:ffff888089ed7168 EFLAGS: 00010202
      RAX: dffffc0000000000 RBX: ffff8880869d6e58 RCX: 0000000000000000
      RDX: 0000000000000016 RSI: ffffffff862736b4 RDI: 00000000000000b0
      RBP: ffff888089ed7180 R08: 1ffff11010d3adcb R09: ffff8880869d6e58
      R10: ffffed1010d3add5 R11: ffff8880869d6eaf R12: 0000000000000000
      R13: ffffffff8931f8c0 R14: ffffffff862825d0 R15: ffff8880869d6e58
      FS:  0000000000b3d880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020000184 CR3: 0000000092cc5000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
        ip6erspan_newlink+0x66/0x7b0 net/ipv6/ip6_gre.c:2210
        __rtnl_newlink+0x107b/0x16c0 net/core/rtnetlink.c:3176
        rtnl_newlink+0x69/0xa0 net/core/rtnetlink.c:3234
        rtnetlink_rcv_msg+0x465/0xb00 net/core/rtnetlink.c:5192
        netlink_rcv_skb+0x17a/0x460 net/netlink/af_netlink.c:2485
        rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5210
        netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
        netlink_unicast+0x536/0x720 net/netlink/af_netlink.c:1336
        netlink_sendmsg+0x8ae/0xd70 net/netlink/af_netlink.c:1925
        sock_sendmsg_nosec net/socket.c:621 [inline]
        sock_sendmsg+0xdd/0x130 net/socket.c:631
        ___sys_sendmsg+0x806/0x930 net/socket.c:2136
        __sys_sendmsg+0x105/0x1d0 net/socket.c:2174
        __do_sys_sendmsg net/socket.c:2183 [inline]
        __se_sys_sendmsg net/socket.c:2181 [inline]
        __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2181
        do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x440159
      Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fffa69156e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440159
      RDX: 0000000000000000 RSI: 0000000020001340 RDI: 0000000000000003
      RBP: 00000000006ca018 R08: 0000000000000001 R09: 00000000004002c8
      R10: 0000000000000011 R11: 0000000000000246 R12: 00000000004019e0
      R13: 0000000000401a70 R14: 0000000000000000 R15: 0000000000000000
      Modules linked in:
      ---[ end trace 09f8a7d13b4faaa1 ]---
      RIP: 0010:ip6erspan_set_version+0x5c/0x350 net/ipv6/ip6_gre.c:1726
      Code: 07 38 d0 7f 08 84 c0 0f 85 9f 02 00 00 49 8d bc 24 b0 00 00 00 c6 43 54 01 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 9a 02 00 00 4d 8b ac 24 b0 00 00 00 4d 85 ed 0f
      RSP: 0018:ffff888089ed7168 EFLAGS: 00010202
      RAX: dffffc0000000000 RBX: ffff8880869d6e58 RCX: 0000000000000000
      RDX: 0000000000000016 RSI: ffffffff862736b4 RDI: 00000000000000b0
      RBP: ffff888089ed7180 R08: 1ffff11010d3adcb R09: ffff8880869d6e58
      R10: ffffed1010d3add5 R11: ffff8880869d6eaf R12: 0000000000000000
      R13: ffffffff8931f8c0 R14: ffffffff862825d0 R15: ffff8880869d6e58
      FS:  0000000000b3d880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020000184 CR3: 0000000092cc5000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      
      Fixes: 4974d5f6 ("net: ip6_gre: initialize erspan_ver just for erspan tunnels")
      Reported-and-tested-by: syzbot+30191cf1057abd3064af@syzkaller.appspotmail.com
      Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Reviewed-by: Greg Rose <gvrose8192@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: don't compare hb_timer expire date before starting it · d1f20c03
      Maciej Kwiecien authored
      hb_timer might not start at all for a particular transport because its
      start is conditional. As a result, a node does not send heartbeats.
      
      Function sctp_transport_reset_hb_timer has two roles:
          - initial start of hb_timer for a given transport,
          - update of the expire date of hb_timer for a given transport.
      The function is optimized to update the timer's expire only if it is before
      the newly calculated one, but this comparison is invalid for a timer which
      has not yet started. Such a timer has expire == 0, and if the new expire
      value is bigger than (MAX_JIFFIES / 2 + 2) then the "time_before" macro will
      fail and the timer will not start, resulting in no heartbeat packets being
      sent by the node.
      
      This was found when an association was initialized within the first 5 minutes
      after system boot, because the initial jiffies value is close to MAX_JIFFIES.
      
      Test kernel version: 4.9.154 (ARCH=arm)
      hb_timer.expire = 0;                //initialized, not started timer
      new_expire = MAX_JIFFIES / 2 + 2;   //or more
      time_before(hb_timer.expire, new_expire) == false
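
The three lines above can be reproduced in userspace. The sketch below is an illustration under stated assumptions: `toy_time_before()` uses the same wrapped-difference arithmetic as the kernel's time_before(), MAX_JIFFIES is taken to be ULONG_MAX, and `toy_timer` with its `pending` flag is a hypothetical stand-in for the real timer. The guarded variant shows one possible way to avoid the problem (always arm a timer that is not running yet); it is not claimed to be the exact change made by this commit.

```c
#include <limits.h>
#include <stdbool.h>

/* True if a is earlier than b, judged by the sign of the wrapped
 * difference (the technique used by the kernel's time_before()). */
static bool toy_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Hypothetical stand-in for a kernel timer. */
struct toy_timer {
	unsigned long expires;  /* 0 for an initialized, not started timer */
	bool pending;           /* true once the timer has been armed */
};

/* Comparison only: wrong for a timer that was never started. */
bool toy_should_rearm_old(const struct toy_timer *t, unsigned long new_expires)
{
	return toy_time_before(t->expires, new_expires);
}

/* Guarded: a timer that is not pending is always armed. */
bool toy_should_rearm_guarded(const struct toy_timer *t,
			      unsigned long new_expires)
{
	return !t->pending || toy_time_before(t->expires, new_expires);
}
```

With `expires == 0` and `new_expires == ULONG_MAX / 2 + 2`, the wrapped difference is positive, so the comparison-only variant refuses to arm the timer, while the guarded variant arms it.
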
      
      Fixes: ba6f5e33 ("sctp: avoid refreshing heartbeat timer too often")
      Reported-by: Marcin Stojek <marcin.stojek@nokia.com>
      Tested-by: Marcin Stojek <marcin.stojek@nokia.com>
      Signed-off-by: Maciej Kwiecien <maciej.kwiecien@nokia.com>
      Reviewed-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mac80211: allocate tailroom for forwarded mesh packets · 51d0af22
      Felix Fietkau authored
      Forwarded packets enter the tx path through ieee80211_add_pending_skb,
      which skips the ieee80211_skb_resize call.
      Fixes WARN_ON in ccmp_encrypt_skb and resulting packet loss.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Felix Fietkau <nbd@nbd.name>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: Change default tx_sk_pacing_shift to 7 · 5c14a4d0
      Toke Høiland-Jørgensen authored
      When we did the original tests for the optimal value of sk_pacing_shift, we
      came up with 6 ms of buffering as the default. Sadly, 6 is not a power of
      two, so when picking the shift value I erred on the side of less buffering
      and picked 4 ms instead of 8. This was probably wrong; those 2 ms of extra
      buffering make a larger difference than I thought.
      
      So, change the default pacing shift to 7, which corresponds to 8 ms of
      buffering. The point of diminishing returns really kicks in after 8 ms, and
      so having this as a default should cut down on the need for extensive
      per-device testing and overrides needed in the drivers.
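
The relation between the shift and the buffering time can be checked with a one-line helper. This is a sketch, assuming the pacing shift limits queued data to roughly the pacing rate shifted right by that amount, i.e. about 1/2^shift seconds' worth of data; `toy_pacing_buffer_us()` is a hypothetical name and the shift is assumed to be below 32.

```c
#include <stdint.h>

/* Approximate buffering in microseconds for a given pacing shift:
 * one second divided by 2^shift, rounded down. Requires shift < 32. */
uint32_t toy_pacing_buffer_us(unsigned int shift)
{
	return UINT32_C(1000000) >> shift;
}
```

Under that assumption, a shift of 7 gives 7812 us (about 8 ms) and a shift of 8 gives 3906 us (about 4 ms), which matches the figures quoted above.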
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • phonet: fix building with clang · 6321aa19
      Arnd Bergmann authored
      clang warns about overflowing the data[] member in the struct pnpipehdr:
      
      net/phonet/pep.c:295:8: warning: array index 4 is past the end of the array (which contains 1 element) [-Warray-bounds]
                              if (hdr->data[4] == PEP_IND_READY)
                                  ^         ~
      include/net/phonet/pep.h:66:3: note: array 'data' declared here
                      u8              data[1];
      
      Using a flexible array member at the end of the struct avoids the
      warning, but since we cannot have a flexible array member inside
      of the union, each index now has to be moved back by one, which
      makes it a little uglier.
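
The index shift can be illustrated with a reduced layout. The structs below are hypothetical and much smaller than the real struct pnpipehdr; the field names `id`, `type` and `data0` are assumptions made for the example. In the old layout the one-element array lives inside the union; in the new layout the first payload byte stays in the union under its own name and the rest follows as a flexible array member, so old index N becomes new index N - 1.

```c
#include <stddef.h>
#include <stdint.h>

/* Old style (illustrative): one-element array inside the union,
 * indexed past its declared end to reach the payload. */
struct toy_hdr_old {
	uint8_t id;
	union {
		uint8_t type;
		uint8_t data[1];
	};
};

/* New style (illustrative): first byte named inside the union,
 * remaining payload in a flexible array member after it. */
struct toy_hdr_new {
	uint8_t id;
	union {
		uint8_t type;
		uint8_t data0;
	};
	uint8_t data[];
};

/* Read the byte that the old layout called data[old_idx]. */
uint8_t toy_read_old_index(const struct toy_hdr_new *h, size_t old_idx)
{
	return old_idx == 0 ? h->data0 : h->data[old_idx - 1];
}
```

In this reduced layout, an access written as `hdr->data[4]` against the old struct corresponds to `hdr->data[3]` against the new one.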
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Rémi Denis-Courmont <remi@remlab.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ip6_gre: do not report erspan_ver for ip6gre or ip6gretap · 103d0244
      Lorenzo Bianconi authored
      Report the erspan version field to userspace in ip6gre_fill_info just for
      erspan_v6 tunnels. Moreover, report IFLA_GRE_ERSPAN_INDEX only for
      erspan version 1.
      The issue can be triggered with the following reproducer:
      
      $ ip link add name gre6 type ip6gre local 2001::1 remote 2002::2
      $ ip link set gre6 up
      $ ip -d link sh gre6
      14: gre6@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1448 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
          link/gre6 2001::1 peer 2002::2 promiscuity 0 minmtu 0 maxmtu 0
          ip6gre remote 2002::2 local 2001::1 hoplimit 64 encaplimit 4 tclass 0x00 flowlabel 0x00000 erspan_index 0 erspan_ver 0 addrgenmode eui64
      
      Fixes: 94d7d8f2 ("ip6_gre: add erspan v2 support")
      Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ip_gre: do not report erspan_ver for gre or gretap · 2bdf700e
      Lorenzo Bianconi authored
      Report erspan version field to userspace in ipgre_fill_info just for
      erspan tunnels. The issue can be triggered with the following reproducer:
      
      $ ip link add name gre1 type gre local 192.168.0.1 remote 192.168.1.1
      $ ip link set dev gre1 up
      $ ip -d link sh gre1
      13: gre1@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1476 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
          link/gre 192.168.0.1 peer 192.168.1.1 promiscuity 0 minmtu 0 maxmtu 0
          gre remote 192.168.1.1 local 192.168.0.1 ttl inherit erspan_ver 0 addrgenmode eui64 numtxqueues 1 numrxqueues 1
      
      Fixes: f551c91d ("net: erspan: introduce erspan v2 for ip_gre")
      Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 21 Feb, 2019 5 commits