  1. Aug 24, 2021
  2. Aug 19, 2021
    • mptcp: full fully established support after ADD_ADDR · 67b12f79
      Matthieu Baerts authored
      
      If directly after an MP_CAPABLE 3WHS, the client receives an ADD_ADDR
      with HMAC from the server, it is enough to switch to a "fully
      established" mode because it has received more MPTCP options.
      
      It was then OK to enable the "fully_established" flag on the MPTCP
      socket. Still, best to check if the ADD_ADDR looks valid by looking if
      it contains an HMAC (no 'echo' bit). If an ADD_ADDR echo is received
      while we are not in "fully established" mode, it is strange and then
      we should not switch to this mode now.
      
      But that is not enough. On one hand, the path-manager has to be
      notified that the state has changed. On the other hand, the
      "fully_established" flag on the subflow socket should be turned on as
      well, so that the MP_CAPABLE 3rd ACK content is not re-sent with the
      next ACK.
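
The decision described above can be sketched as a small userspace model in C. The names `model_msk`, `model_subflow` and `model_on_add_addr` are hypothetical and do not mirror the kernel's data structures. The sketch only illustrates that an ADD_ADDR carrying an HMAC (no echo bit) switches both sockets to "fully established" and notifies the path manager, while an ADD_ADDR echo does not.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the MPTCP and subflow sockets. */
struct model_msk {
    bool fully_established;
    int pm_notifications;   /* how many times the path manager was notified */
};

struct model_subflow {
    bool fully_established;
    struct model_msk *msk;
};

/*
 * Handle an ADD_ADDR received right after the MP_CAPABLE handshake.
 * 'echo' is true when the option is an ADD_ADDR echo (no HMAC).
 * Returns true if the connection is fully established afterwards.
 */
static bool model_on_add_addr(struct model_subflow *sf, bool echo)
{
    if (sf->fully_established)
        return true;            /* already switched, nothing to do */

    if (echo)
        return false;           /* unexpected echo: do not switch modes */

    /* ADD_ADDR with HMAC: switch both sockets and notify the PM once. */
    sf->fully_established = true;
    sf->msk->fully_established = true;
    sf->msk->pm_notifications++;
    return true;
}
```

In this model the path manager is notified exactly once, on the transition, and later ADD_ADDR options leave the state untouched.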
      
      Fixes: 84dfe367 ("mptcp: send out dedicated ADD_ADDR packet")
      Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mptcp: fix memory leak on address flush · a0eea5f1
      Paolo Abeni authored
      
      The endpoint cleanup path is prone to a memory leak, as reported
      by syzkaller:
      
       BUG: memory leak
       unreferenced object 0xffff88810680ea00 (size 64):
         comm "syz-executor.6", pid 6191, jiffies 4295756280 (age 24.138s)
         hex dump (first 32 bytes):
           58 75 7d 3c 80 88 ff ff 22 01 00 00 00 00 ad de  Xu}<....".......
           01 00 02 00 00 00 00 00 ac 1e 00 07 00 00 00 00  ................
         backtrace:
           [<0000000072a9f72a>] kmalloc include/linux/slab.h:591 [inline]
           [<0000000072a9f72a>] mptcp_nl_cmd_add_addr+0x287/0x9f0 net/mptcp/pm_netlink.c:1170
           [<00000000f6e931bf>] genl_family_rcv_msg_doit.isra.0+0x225/0x340 net/netlink/genetlink.c:731
           [<00000000f1504a2c>] genl_family_rcv_msg net/netlink/genetlink.c:775 [inline]
           [<00000000f1504a2c>] genl_rcv_msg+0x341/0x5b0 net/netlink/genetlink.c:792
           [<0000000097e76f6a>] netlink_rcv_skb+0x148/0x430 net/netlink/af_netlink.c:2504
           [<00000000ceefa2b8>] genl_rcv+0x24/0x40 net/netlink/genetlink.c:803
           [<000000008ff91aec>] netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline]
           [<000000008ff91aec>] netlink_unicast+0x537/0x750 net/netlink/af_netlink.c:1340
           [<0000000041682c35>] netlink_sendmsg+0x846/0xd80 net/netlink/af_netlink.c:1929
           [<00000000df3aa8e7>] sock_sendmsg_nosec net/socket.c:704 [inline]
           [<00000000df3aa8e7>] sock_sendmsg+0x14e/0x190 net/socket.c:724
           [<000000002154c54c>] ____sys_sendmsg+0x709/0x870 net/socket.c:2403
           [<000000001aab01d7>] ___sys_sendmsg+0xff/0x170 net/socket.c:2457
           [<00000000fa3b1446>] __sys_sendmsg+0xe5/0x1b0 net/socket.c:2486
           [<00000000db2ee9c7>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
           [<00000000db2ee9c7>] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
           [<000000005873517d>] entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Cleanup should not require an allocation.
      
      Rework the code a bit so that the additional RCU work is no longer
      needed.
      
      Fixes: 1729cf18 ("mptcp: create the listening socket for new port")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. Aug 18, 2021
  4. Aug 14, 2021
  5. Aug 03, 2021
  6. Jul 13, 2021
  7. Jul 10, 2021
    • mptcp: properly account bulk freed memory · ce599c51
      Paolo Abeni authored
      After commit 87952603 ("mptcp: protect the rx path with
      the msk socket spinlock") the rmem currently used by a given
      msk is really sk_rmem_alloc - rmem_released.
      
      The safety check in mptcp_data_ready() does not take this into
      account. As a result, legitimate incoming data is kept in the
      subflow receive queue for no reason, delaying or blocking
      MPTCP-level ack generation.
      
      This change addresses the issue by introducing a new helper to fetch
      the rmem memory and using it where needed. Additionally, add a MIB
      counter for the exceptional event described above, which indicates
      that the peer is misbehaving.
      
      Finally, introduce the required annotation when rmem_released is
      updated.
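
The accounting described above can be sketched as a minimal userspace model. The struct and the helper names `model_rmem` and `model_can_queue` are hypothetical, and the exact comparison used by the kernel's safety check is an assumption. The point is only that the check should compare the memory really in use, not the raw `sk_rmem_alloc` value, against the limit.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the msk receive-memory counters. */
struct model_msk {
    int sk_rmem_alloc;   /* memory charged to the msk receive queue */
    int rmem_released;   /* memory freed in bulk but not yet uncharged */
    int sk_rcvbuf;       /* receive buffer limit */
};

/* Memory really in use: charged memory minus memory already released. */
static int model_rmem(const struct model_msk *msk)
{
    return msk->sk_rmem_alloc - msk->rmem_released;
}

/* Safety check: accept new data only while real usage is within the limit. */
static bool model_can_queue(const struct model_msk *msk)
{
    return model_rmem(msk) <= msk->sk_rcvbuf;
}
```

With `sk_rmem_alloc = 1000`, `rmem_released = 600` and a limit of 500, a check on `sk_rmem_alloc` alone would refuse the data, while the corrected usage of 400 is within the limit.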
      
      Fixes: 87952603 ("mptcp: protect the rx path with the msk socket spinlock")
      Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/211
      
      
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mptcp: avoid processing packet if a subflow reset · 6787b7e3
      Jianguo Wu authored
      
      If check_fully_established() causes a subflow reset, the packet should
      not be processed any further in tcp_data_queue().
      Add a return value to mptcp_incoming_options(): false if the subflow
      has been reset, true otherwise. Then drop the packet in
      tcp_data_queue()/tcp_rcv_state_process() if mptcp_incoming_options()
      returns false.
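
The control flow described above can be sketched as follows. This is a userspace model with hypothetical names (`model_sock`, `model_incoming_options`, `model_data_queue`), not the kernel functions themselves. It only shows the pattern of returning a boolean from option processing and dropping the packet when the subflow has been reset.

```c
#include <stdbool.h>

enum model_state { MODEL_ESTABLISHED, MODEL_CLOSE };

/* Hypothetical, simplified socket: a state plus a count of queued packets. */
struct model_sock {
    enum model_state state;
    int queued;
};

/*
 * Stand-in for option processing. 'must_reset' models the case where the
 * fully-established check decides to reset the subflow.
 * Returns false if the subflow has been reset, true otherwise.
 */
static bool model_incoming_options(struct model_sock *sk, bool must_reset)
{
    if (must_reset) {
        sk->state = MODEL_CLOSE;
        return false;
    }
    return true;
}

/* Stand-in for the data path: drop the packet if the subflow was reset. */
static void model_data_queue(struct model_sock *sk, bool must_reset)
{
    if (!model_incoming_options(sk, must_reset))
        return;         /* packet dropped, no further processing */

    sk->queued++;       /* normal processing continues */
}
```

Without the early return, the data path would keep working on a socket that has already moved to the closed state.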
      
      Fixes: d5824847 ("mptcp: fix fallback for MP_JOIN subflows")
      Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mptcp: fix syncookie process if mptcp can not_accept new subflow · 8547ea5f
      Jianguo Wu authored
      
      Lots of "TCP: tcp_fin: Impossible, sk->sk_state=7" warnings show up on
      the client side when stress testing with wrk and webfsd.
      
      At least two cases may trigger this warning:
      1. MPTCP is in syncookie mode and the server receives an MP_JOIN SYN.
         In subflow_check_req(), mptcp_can_accept_new_subflow() returns
         false, so subflow_init_req_cookie_join_save() is not called, i.e.
         the data present in the MP_JOIN SYN and the random nonce are not
         stored in the join_entries[] hash table, but a SYN-ACK is still
         sent. When the 3rd ACK is received,
         mptcp_token_join_cookie_init_state() returns false and the 3rd ACK
         is dropped. If the client then closes the MPTCP connection, it
         sends a DATA_FIN and an MPTCP FIN. The DATA_FIN carries neither
         MP_CAPABLE nor MP_JOIN, so mptcp_subflow_init_cookie_req() returns
         0, the cookie check passes, and the MP_JOIN request falls back to
         plain TCP. The server sends a TCP FIN when it closes, and the
         client resets the subflow while processing that FIN. The code
         path is:
           tcp_data_queue()->mptcp_incoming_options()
             ->check_fully_established()->mptcp_subflow_reset().
         mptcp_subflow_reset() sets the socket state to TCP_CLOSE, so
         tcp_fin() hits TCP_CLOSE and prints the warning.
      
      2. MPTCP is in syncookie mode and the server receives the 3rd ACK. In
         mptcp_subflow_init_cookie_req(), mptcp_can_accept_new_subflow()
         returns false and subflow_req->mp_join is not set to 1, so
         subflow_syn_recv_sock() does not reset the MP_JOIN subflow but
         falls back to plain TCP. The same thing then happens when the
         server sends a TCP FIN on close.
      
      For case 1, make subflow_check_req() return -EPERM, so that
      tcp_conn_request() drops the MP_JOIN SYN.
      
      For case 2, let subflow_syn_recv_sock() call
      mptcp_can_accept_new_subflow() and, on failure, do a fatal fallback
      and send a reset.
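
The two fixes above can be sketched as a userspace model. All names are hypothetical, and the acceptance rule (a single boolean) is a placeholder for whatever the real mptcp_can_accept_new_subflow() checks. The sketch only shows that both the SYN path and the 3rd-ACK path refuse the join outright instead of silently falling back to plain TCP.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical msk: one flag standing in for the real acceptance rule. */
struct model_msk {
    bool accept_subflow;
};

enum model_verdict { MODEL_ACCEPT, MODEL_RESET };

static bool model_can_accept_new_subflow(const struct model_msk *msk)
{
    return msk->accept_subflow;
}

/* Case 1 (MP_JOIN SYN): a negative error makes the caller drop the SYN. */
static int model_check_req(const struct model_msk *msk)
{
    if (!model_can_accept_new_subflow(msk))
        return -EPERM;
    return 0;
}

/* Case 2 (3rd ACK): reset the subflow rather than fall back to TCP. */
static enum model_verdict model_syn_recv_sock(const struct model_msk *msk)
{
    if (!model_can_accept_new_subflow(msk))
        return MODEL_RESET;
    return MODEL_ACCEPT;
}
```

In both cases the peer learns immediately that the join was refused, so no half-MPTCP, half-TCP subflow is left behind to be reset later while a FIN is being processed.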
      
      Fixes: 9466a1cc ("mptcp: enable JOIN requests even if cookies are in use")
      Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mptcp: remove redundant req destruct in subflow_check_req() · 030d37bd
      Jianguo Wu authored
      
      In subflow_check_req(), if the subflow sport does not match, the code
      puts the msk, destroys the token and destructs the req before
      returning -EPERM. All of this is already done by
      subflow_req_destructor() via:
      
        tcp_conn_request()
          |--__reqsk_free()
            |--subflow_req_destructor()
      
      So remove the redundant code; otherwise tcp_v4_reqsk_destructor() is
      called twice and inet_rsk(req)->ireq_opt may be double freed.
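
The ownership rule behind this fix can be sketched as a userspace model. The names below are hypothetical. The sketch shows the error path of the check function returning without any cleanup, so that the caller's free path is the only place where the destructor runs.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical request object that counts how often it is destructed. */
struct model_req {
    bool sport_mismatch;
    int destruct_calls;
};

/* Stand-in for the request destructor; must run exactly once per request. */
static void model_req_destructor(struct model_req *req)
{
    req->destruct_calls++;
}

/* The check only reports the error; it does not clean up the request. */
static int model_check_req(struct model_req *req)
{
    if (req->sport_mismatch)
        return -EPERM;
    return 0;
}

/* The caller owns the cleanup: on error it frees the request once. */
static int model_conn_request(struct model_req *req)
{
    int err = model_check_req(req);

    if (err) {
        model_req_destructor(req);
        return err;
    }
    return 0;
}
```

If `model_check_req()` also called the destructor before returning, the count would reach two on the error path, which corresponds to the double free described above.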
      
      Fixes: 5bc56388 ("mptcp: add port number check for MP_JOIN")
      Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mptcp: fix warning in __skb_flow_dissect() when do syn cookie for subflow join · 0c71929b
      Jianguo Wu authored
      I ran a stress test with wrk[1] and webfsd[2], with the assistance of
      mptcp-tools[3]:
      
        Server side:
            ./use_mptcp.sh webfsd -4 -R /tmp/ -p 8099
        Client side:
            ./use_mptcp.sh wrk -c 200 -d 30 -t 4 http://192.168.174.129:8099/
      
      and got the following warning message:
      
      [   55.552626] TCP: request_sock_subflow: Possible SYN flooding on port 8099. Sending cookies.  Check SNMP counters.
      [   55.553024] ------------[ cut here ]------------
      [   55.553027] WARNING: CPU: 0 PID: 10 at net/core/flow_dissector.c:984 __skb_flow_dissect+0x280/0x1650
      ...
      [   55.553117] CPU: 0 PID: 10 Comm: ksoftirqd/0 Not tainted 5.12.0+ #18
      [   55.553121] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 02/27/2020
      [   55.553124] RIP: 0010:__skb_flow_dissect+0x280/0x1650
      ...
      [   55.553133] RSP: 0018:ffffb79580087770 EFLAGS: 00010246
      [   55.553137] RAX: 0000000000000000 RBX: ffffffff8ddb58e0 RCX: ffffb79580087888
      [   55.553139] RDX: ffffffff8ddb58e0 RSI: ffff8f7e4652b600 RDI: 0000000000000000
      [   55.553141] RBP: ffffb79580087858 R08: 0000000000000000 R09: 0000000000000008
      [   55.553143] R10: 000000008c622965 R11: 00000000d3313a5b R12: ffff8f7e4652b600
      [   55.553146] R13: ffff8f7e465c9062 R14: 0000000000000000 R15: ffffb79580087888
      [   55.553149] FS:  0000000000000000(0000) GS:ffff8f7f75e00000(0000) knlGS:0000000000000000
      [   55.553152] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   55.553154] CR2: 00007f73d1d19000 CR3: 0000000135e10004 CR4: 00000000003706f0
      [   55.553160] Call Trace:
      [   55.553166]  ? __sha256_final+0x67/0xd0
      [   55.553173]  ? sha256+0x7e/0xa0
      [   55.553177]  __skb_get_hash+0x57/0x210
      [   55.553182]  subflow_init_req_cookie_join_save+0xac/0xc0
      [   55.553189]  subflow_check_req+0x474/0x550
      [   55.553195]  ? ip_route_output_key_hash+0x67/0x90
      [   55.553200]  ? xfrm_lookup_route+0x1d/0xa0
      [   55.553207]  subflow_v4_route_req+0x8e/0xd0
      [   55.553212]  tcp_conn_request+0x31e/0xab0
      [   55.553218]  ? selinux_socket_sock_rcv_skb+0x116/0x210
      [   55.553224]  ? tcp_rcv_state_process+0x179/0x6d0
      [   55.553229]  tcp_rcv_state_process+0x179/0x6d0
      [   55.553235]  tcp_v4_do_rcv+0xaf/0x220
      [   55.553239]  tcp_v4_rcv+0xce4/0xd80
      [   55.553243]  ? ip_route_input_rcu+0x246/0x260
      [   55.553248]  ip_protocol_deliver_rcu+0x35/0x1b0
      [   55.553253]  ip_local_deliver_finish+0x44/0x50
      [   55.553258]  ip_local_deliver+0x6c/0x110
      [   55.553262]  ? ip_rcv_finish_core.isra.19+0x5a/0x400
      [   55.553267]  ip_rcv+0xd1/0xe0
      ...
      
      After debugging, I found that in __skb_flow_dissect(), skb->dev and
      skb->sk are both NULL, so net is NULL and WARN_ON_ONCE(!net) triggers.
      In fact, net is always NULL in this code path, as skb->dev is set to
      NULL in tcp_v4_rcv() and skb->sk is never set.
      
      Code snippet in __skb_flow_dissect() that triggers the warning:
        975         if (skb) {
        976                 if (!net) {
        977                         if (skb->dev)
        978                                 net = dev_net(skb->dev);
        979                         else if (skb->sk)
        980                                 net = sock_net(skb->sk);
        981                 }
        982         }
        983
        984         WARN_ON_ONCE(!net);
      
      So, use a hash derived from the sequence number and the transport
      header instead.
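
The idea of deriving the table slot from the sequence number and the transport header, rather than from the flow dissector, can be sketched as below. This is a userspace model: the mixing function, the table size `MODEL_JOIN_ENTRIES` and the `secret` parameter are assumptions made for illustration and are not the kernel's hash. The only property that matters here is that the slot depends solely on values available in the TCP header, so no network namespace lookup through skb->dev or skb->sk is needed.

```c
#include <stdint.h>

#define MODEL_JOIN_ENTRIES 32u   /* hypothetical table size */

/* Stand-in integer mixer (not the kernel's hash function). */
static uint32_t model_mix(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x7feb352dU;
    x ^= x >> 15;
    x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

/*
 * Compute a table slot from the TCP sequence number and the port pair.
 * 'secret' models a per-boot random value that keeps slots unpredictable.
 */
static uint32_t model_join_slot(uint32_t seq, uint16_t sport,
                                uint16_t dport, uint32_t secret)
{
    uint32_t ports = ((uint32_t)sport << 16) | dport;

    return model_mix(model_mix(seq ^ secret) ^ ports) % MODEL_JOIN_ENTRIES;
}
```

For the lookup to find the entry saved at SYN time, the caller has to pass the same sequence value and port pair when the 3rd ACK arrives.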
      
      [1] https://github.com/wg/wrk
      [2] https://github.com/ourway/webfsd
      [3] https://github.com/pabeni/mptcp-tools
      
      
      
      Fixes: 9466a1cc ("mptcp: enable JOIN requests even if cookies are in use")
      Suggested-by: Paolo Abeni <pabeni@redhat.com>
      Suggested-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. Jul 01, 2021
  9. Jun 29, 2021
  10. Jun 28, 2021
  11. Jun 22, 2021
  12. Jun 21, 2021