Skip to content
  • Lars Ellenberg's avatar
    drbd: avoid potential deadlock during handshake · 5f7c0124
    Lars Ellenberg authored
    
    
    During handshake communication, we also reconsider our device size,
    using drbd_determine_dev_size(). Just in case we need to change the
    offsets or layout of our on-disk metadata, we lock out application
    and other meta data IO, and wait for the activity log to be "idle"
    (no more referenced extents).
    
    If this handshake happens just after a connection loss, with a fencing
    policy of "resource-and-stonith", we have frozen IO.
    
    If, additionally, the activity log was "starving" (too many incoming
    random writes at that point in time), it won't become idle, ever,
    because of the frozen IO, and this would be a lockup of the receiver
    thread, and consquentially of DRBD.
    
    Previous logic (re-)initialized with a special "empty" transaction
    block, which required the activity log to fully drain first.
    
    Instead, write out some standard activity log transactions.
    Using lc_try_lock_for_transaction() instead of lc_try_lock() does not
    care about pending activity log references, avoiding the potential
    deadlock.
    
    Signed-off-by: default avatarPhilipp Reisner <philipp.reisner@linbit.com>
    Signed-off-by: default avatarLars Ellenberg <lars.ellenberg@linbit.com>
    Signed-off-by: default avatarJens Axboe <axboe@fb.com>
    5f7c0124