  1. 24 Jul, 2018 1 commit
  2. 08 Feb, 2018 1 commit
    • rds: tcp: use rds_destroy_pending() to synchronize netns/module teardown and rds connection/workq management · ebeeb1ad
      Sowmini Varadhan authored
      
      An rds_connection can get added during netns deletion between lines 528
      and 529 of
      
        506 static void rds_tcp_kill_sock(struct net *net)
        :
        /* code to pull out all the rds_connections that should be destroyed */
        :
        528         spin_unlock_irq(&rds_tcp_conn_lock);
        529         list_for_each_entry_safe(tc, _tc, &tmp_list, t_tcp_node)
        530                 rds_conn_destroy(tc->t_cpath->cp_conn);
      
      Such an rds_connection would miss the rds_conn_destroy()
      loop (which cancels all pending work) and, if it was scheduled
      after netns deletion, could trigger a use-after-free.
      
      A similar race window exists for the module unload path
      in rds_tcp_exit -> rds_tcp_destroy_conns.
      
      Concurrency with netns deletion (rds_tcp_kill_sock()) must be handled
      by checking check_net() before enqueuing new work or adding new
      connections.
      
      Concurrency with module-unload is handled by maintaining a module
      specific flag that is set at the start of the module exit function,
      and must be checked before enqueuing new work or adding new connections.
      
      This commit refactors existing RDS_DESTROY_PENDING checks added by
      commit 3db6e0d1 ("rds: use RCU to synchronize work-enqueue with
      connection teardown") and consolidates all the concurrency checks
      listed above into the function rds_destroy_pending().
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
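      The consolidated check described in this commit message can be pictured with a small user-space model. This is only a sketch under stated assumptions: the struct layouts, `model_check_net()`, the `module_unloading` flag, and `model_queue_work()` are hypothetical stand-ins, not the kernel's definitions. Only the idea of folding the three conditions (netns going away, module unloading, per-connection destroy flag) into one predicate comes from the text above.

      ```c
      #include <stdatomic.h>
      #include <stdbool.h>

      /* Hypothetical stand-in for a network namespace; 'alive' is cleared
       * when the namespace is being deleted. */
      struct model_net {
          atomic_bool alive;
      };

      /* Hypothetical stand-in for an rds_connection. */
      struct model_conn {
          struct model_net *net;
          atomic_bool destroy_pending;   /* models the RDS_DESTROY_PENDING flag */
      };

      /* Models the module-specific flag that is set at the start of the
       * module exit function. Zero-initialized, i.e. false. */
      static atomic_bool module_unloading;

      /* Models check_net(): true while the namespace is still usable. */
      static bool model_check_net(struct model_net *net)
      {
          return atomic_load(&net->alive);
      }

      /* Single predicate combining all three teardown conditions. */
      static bool model_destroy_pending(struct model_conn *conn)
      {
          return !model_check_net(conn->net) ||
                 atomic_load(&module_unloading) ||
                 atomic_load(&conn->destroy_pending);
      }

      /* Callers test the predicate before enqueuing new work. */
      static bool model_queue_work(struct model_conn *conn)
      {
          if (model_destroy_pending(conn))
              return false;          /* teardown in progress: do not enqueue */
          /* a real implementation would enqueue the work item here */
          return true;
      }
      ```

      The model leaves out the RCU synchronization named in the title of the referenced commit 3db6e0d1, so it shows only which conditions are tested, not how the test is made safe against a concurrent teardown.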
  3. 05 Jan, 2018 1 commit
  4. 17 Jul, 2017 1 commit
    • rds: cancel send/recv work before queuing connection shutdown · aed20a53
      Sowmini Varadhan authored
      We could end up executing rds_conn_shutdown before the rds_recv_worker
      thread. In that case rds_conn_shutdown -> rds_tcp_conn_shutdown can do a
      sock_release and set sock->sk to null, which may interleave in bad
      ways with rds_recv_worker, e.g., it could result in:
      
      "BUG: unable to handle kernel NULL pointer dereference at 0000000000000078"
          [ffff881769f6fd70] release_sock at ffffffff815f337b
          [ffff881769f6fd90] rds_tcp_recv at ffffffffa043c888 [rds_tcp]
          [ffff881769f6fdb0] rds_recv_worker at ffffffffa04a4810 [rds]
          [ffff881769f6fde0] process_one_work at ffffffff810a14c1
          [ffff881769f6fe40] worker_thread at ffffffff810a1940
          [ffff881769f6fec0] kthread at ffffffff810a6b1e
      
      Also, do not enqueue any new shutdown workq items when the connection is
      shutting down (this may happen for rds-tcp in softirq mode, if a FIN
      or CLOSE is received while the module is in the middle of an unload).
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 16 Jun, 2017 1 commit
  6. 03 Apr, 2017 1 commit
  7. 17 Oct, 2016 1 commit
  8. 15 Jul, 2016 1 commit
  9. 01 Jul, 2016 3 commits
  10. 15 Jun, 2016 4 commits
  11. 07 Jun, 2016 1 commit
    • RDS: TCP: fix race windows in send-path quiescence by rds_tcp_accept_one() · 9c79440e
      Sowmini Varadhan authored
      The send path needs to be quiesced before resetting callbacks from
      rds_tcp_accept_one(), and commit eb192840 ("RDS:TCP: Synchronize
      rds_tcp_accept_one with rds_send_xmit when resetting t_sock") achieves
      this using the c_state and RDS_IN_XMIT bit following the pattern
      used by rds_conn_shutdown(). However, this leaves the possibility
      of a race window, as shown in the sequence below:
          take t_conn_lock in rds_tcp_conn_connect
          send outgoing syn to peer
          drop t_conn_lock in rds_tcp_conn_connect
          incoming from peer triggers rds_tcp_accept_one, conn is
      	marked CONNECTING
          wait for RDS_IN_XMIT to quiesce any rds_send_xmit threads
          call rds_tcp_reset_callbacks
          [.. race-window where incoming syn-ack can cause the conn
      	to be marked UP from rds_tcp_state_change ..]
          lock_sock called from rds_tcp_reset_callbacks, and we set
      	t_sock to null
      As soon as the conn is marked UP in the race-window above, rds_send_xmit()
      threads will proceed to rds_tcp_xmit and may encounter a null-pointer
      deref on the t_sock.
      
      Given that rds_tcp_state_change() is invoked in softirq context, whereas
      rds_tcp_reset_callbacks() is in workq context, and testing for RDS_IN_XMIT
      after lock_sock could result in a deadlock with tcp_sendmsg, this
      commit fixes the race by using a new c_state, RDS_TCP_RESETTING, which
      will prevent a transition to RDS_CONN_UP from rds_tcp_state_change().
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
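      The effect of the extra state can be illustrated with a user-space model of guarded state transitions. This is a sketch, not the kernel code: the enum values, struct, and function names are hypothetical, and using a compare-and-swap for the transition is an assumption made for the illustration. What it shows is that once the accept path has moved the connection out of CONNECTING into a resetting state, a late state-change callback can no longer mark it UP.

      ```c
      #include <stdatomic.h>
      #include <stdbool.h>

      /* Hypothetical connection states for the model. */
      enum model_state {
          MODEL_CONN_DOWN,
          MODEL_CONN_CONNECTING,
          MODEL_CONN_RESETTING,   /* models the new RDS_TCP_RESETTING state */
          MODEL_CONN_UP,
      };

      struct model_conn_state {
          atomic_int c_state;
      };

      /* Atomically move from 'from' to 'to'. Fails, leaving the state
       * untouched, if the current state is not 'from'. */
      static bool model_transition(struct model_conn_state *c, int from, int to)
      {
          int expected = from;
          return atomic_compare_exchange_strong(&c->c_state, &expected, to);
      }

      /* Models the accept path claiming the connection before it resets
       * the socket callbacks. */
      static bool model_begin_reset(struct model_conn_state *c)
      {
          return model_transition(c, MODEL_CONN_CONNECTING, MODEL_CONN_RESETTING);
      }

      /* Models the state-change callback: the connection may only be
       * marked UP from CONNECTING, never from RESETTING. */
      static bool model_state_change_established(struct model_conn_state *c)
      {
          return model_transition(c, MODEL_CONN_CONNECTING, MODEL_CONN_UP);
      }
      ```

      In this model a syn-ack that arrives after `model_begin_reset()` succeeds makes `model_state_change_established()` return false, so senders never see an UP connection while the socket pointer is being replaced.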
  12. 05 Oct, 2015 1 commit
  13. 03 Oct, 2014 1 commit
  14. 31 Oct, 2011 1 commit
  15. 09 Sep, 2010 5 commits
    • RDS: remove __init and __exit annotation · ef87b7ea
      Zach Brown authored
      The trivial amount of memory saved isn't worth the cost of dealing with section
      mismatches.
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
    • RDS: return to a single-threaded krdsd · 80c51be5
      Zach Brown authored
      We were seeing very nasty bugs due to fundamental assumptions the current code
      makes about concurrent work struct processing.  The code simply isn't able to
      handle concurrent connection shutdown work function execution today, for
      example, which became very much possible once a multi-threaded krdsd was
      introduced.  The problem compounds as additional work structs are added to the
      mix.
      
      krdsd is no longer performance critical now that send and receive posting and
      FMR flushing are done elsewhere, so the safest fix is to move back to the
      single threaded krdsd that the current code was built around.
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
    • rds: fix rds_send_xmit() serialization · 0f4b1c7e
      Zach Brown authored
      rds_send_xmit() was changed to hold an interrupt masking spinlock instead of a
      mutex so that it could be called from the IB receive tasklet path.  This broke
      the TCP transport because its xmit method can block and masks and unmasks
      interrupts.
      
      This patch serializes callers to rds_send_xmit() with a simple bit instead of
      the current spinlock or previous mutex.  This enables rds_send_xmit() to be
      called from any context and to call functions which block.  Getting rid of the
      c_send_lock exposes the bare c_lock acquisitions which are changed to block
      interrupts.
      
      A waitqueue is added so that rds_conn_shutdown() can wait for callers to leave
      rds_send_xmit() before tearing down partial send state.  This lets us get rid
      of c_senders.
      
      rds_send_xmit() is changed to check the conn state after acquiring the
      RDS_IN_XMIT bit to resolve races with the shutdown path.  Previously both
      worked with the conn state and then the lock in the same order, allowing them
      to race and execute the paths concurrently.
      
      rds_send_reset() isn't racing with rds_send_xmit() now that rds_conn_shutdown()
      properly ensures that rds_send_xmit() can't start once the conn state has been
      changed.  We can remove its previous use of the spinlock.
      
      Finally, c_send_generation is redundant.  Callers can race to test the c_flags
      bit by simply retrying instead of racing to test the c_send_generation atomic.
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
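      The bit-based serialization this commit message describes can be sketched in user space with C11 atomics. This is a minimal model under stated assumptions: the struct, the `MODEL_IN_XMIT` mask, the state values, and the return codes are hypothetical, and the waitqueue wakeup is only indicated by a comment. It shows the two points the message makes: one caller at a time owns the in-xmit bit, and the connection state is checked only after the bit has been acquired.

      ```c
      #include <stdatomic.h>
      #include <stdbool.h>

      #define MODEL_IN_XMIT 0x1u          /* models the RDS_IN_XMIT bit in c_flags */

      enum { MODEL_DOWN, MODEL_UP };      /* hypothetical connection states */

      struct model_xmit_conn {
          atomic_uint c_flags;
          atomic_int  c_state;
      };

      /* Try to become the single sender. fetch_or returns the previous
       * flags, so the caller owns the bit only if it was clear before. */
      static bool model_acquire_in_xmit(struct model_xmit_conn *c)
      {
          return !(atomic_fetch_or(&c->c_flags, MODEL_IN_XMIT) & MODEL_IN_XMIT);
      }

      static void model_release_in_xmit(struct model_xmit_conn *c)
      {
          atomic_fetch_and(&c->c_flags, ~MODEL_IN_XMIT);
          /* the commit message describes a waitqueue so that shutdown can
           * wait for senders to leave; a wakeup would go here */
      }

      /* Returns 1 if this caller transmitted, 0 if another caller already
       * holds the bit, -1 if the connection is not up. */
      static int model_send_xmit(struct model_xmit_conn *c)
      {
          if (!model_acquire_in_xmit(c))
              return 0;

          /* Check the state only after owning the bit, so a shutdown that
           * changed the state first is always observed here. */
          if (atomic_load(&c->c_state) != MODEL_UP) {
              model_release_in_xmit(c);
              return -1;
          }

          /* the transport's xmit method, which may block, would run here */
          model_release_in_xmit(c);
          return 1;
      }
      ```

      Because no lock is held across the transmit step, the same entry point can be used from contexts that must not sleep on a mutex as well as by transports whose xmit method blocks, which is the property the commit message is after.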
    • RDS: cleanup: remove "== NULL"s and "!= NULL"s in ptr comparisons · 8690bfa1
      Andy Grover authored
      Favor "if (foo)" style over "if (foo != NULL)".
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
    • RDS: move rds_shutdown_worker impl. to rds_conn_shutdown · 2dc39357
      Andy Grover authored
      This fits better in connection.c, rather than threads.c.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
  16. 17 Mar, 2010 1 commit
  17. 30 Nov, 2009 1 commit
  18. 24 Aug, 2009 1 commit
  19. 27 Feb, 2009 1 commit
    • RDS: Connection handling · 00e0f34c
      Andy Grover authored
      While arguably the fact that the underlying transport needs a
      connection to convey RDS's datagrams reliably is not important
      to rds proper, the transports implemented so far (IB and TCP)
      have both been connection-oriented, and so the connection
      state machine-related code is in the common rds code.
      
      This patch also includes several work items, to handle connecting,
      sending, receiving, and shutdown.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>