    mm, swap: fix race between swapoff and some swap operations · 3275ada4
    Huang Ying authored
    When swapin is performed, after getting the swap entry information from
    the page table, the system will swap in the swap entry without holding
    any lock that prevents the swap device from being swapped off.  This may
    cause a race like the one below,
    
    CPU 1				CPU 2
    -----				-----
    				do_swap_page
    				  swapin_readahead
    				    __read_swap_cache_async
    swapoff				      swapcache_prepare
      p->swap_map = NULL		        __swap_duplicate
    					  p->swap_map[?] /* !!! NULL pointer access */
    
    Because swapoff is usually done only at system shutdown, the race may
    not hit many people in practice.  But it is still a race that needs to
    be fixed.
    
    To fix the race, get_swap_device() is added to check whether the specified
    swap entry is valid in its swap device.  If so, it keeps the swap entry
    valid by preventing the swap device from being swapped off until
    put_swap_device() is called.
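
    The intended calling pattern can be sketched with a small single-threaded
    userspace model.  This is not the kernel code: struct swap_dev_model,
    SWP_VALID_MODEL, the "pinned" counter (which stands in for the mechanism
    that holds off swapoff) and all *_model() helpers are assumptions made
    for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag: set while the model device is swapped on. */
#define SWP_VALID_MODEL 0x1u

/* Hypothetical, heavily reduced stand-in for struct swap_info_struct. */
struct swap_dev_model {
	unsigned int flags;      /* SWP_VALID_MODEL while usable */
	unsigned long max;       /* number of slots in swap_map */
	unsigned char *swap_map; /* per-slot use counts, NULL after swapoff */
	int pinned;              /* >0 while a get/put section is open */
};

/*
 * Return dev if the entry (dev, offset) is valid, else NULL.
 * On success the device stays pinned until put_swap_device_model().
 */
static struct swap_dev_model *
get_swap_device_model(struct swap_dev_model *dev, unsigned long offset)
{
	if (dev == NULL)
		return NULL;
	dev->pinned++;
	if (!(dev->flags & SWP_VALID_MODEL) || offset >= dev->max) {
		dev->pinned--;
		return NULL;
	}
	return dev;
}

static void put_swap_device_model(struct swap_dev_model *dev)
{
	assert(dev->pinned > 0);
	dev->pinned--;
}

/*
 * Model of swapoff: tears the device down only when nobody holds it.
 * The real swapoff would wait instead of returning false.
 */
static bool swapoff_model(struct swap_dev_model *dev)
{
	if (dev->pinned > 0)
		return false;
	dev->flags &= ~SWP_VALID_MODEL;
	dev->swap_map = NULL;
	dev->max = 0;
	return true;
}

/* A caller in the style of __swap_duplicate(): touch swap_map only
 * between a successful get and the matching put. */
static bool swap_duplicate_model(struct swap_dev_model *dev,
				 unsigned long offset)
{
	struct swap_dev_model *si = get_swap_device_model(dev, offset);

	if (si == NULL)
		return false; /* device gone or offset out of range */
	si->swap_map[offset]++;
	put_swap_device_model(si);
	return true;
}
```

    In this model, a lookup after swapoff fails cleanly in
    get_swap_device_model() instead of dereferencing a NULL swap_map.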
    
    Because swapoff() is a very rare code path, and to make the normal path
    run as fast as possible, disabling preemption + stop_machine() is used
    instead of a reference count to implement get/put_swap_device().  From
    get_swap_device() to put_swap_device(), preemption is disabled, so
    stop_machine() in swapoff() will wait until put_swap_device() is called.
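
    Why this ordering holds can be shown with a toy single-threaded model.
    It assumes a fixed set of simulated CPUs and a non-blocking check; the
    real stop_machine() waits for every CPU to become preemptible rather
    than returning false.  All names ending in _model are hypothetical and
    are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 4

/* Per-CPU nesting depth of simulated preempt-disabled sections. */
static int preempt_count_model[NR_CPUS_MODEL];

/* Reader side: what get_swap_device() would do on entry. */
static void preempt_disable_model(int cpu)
{
	preempt_count_model[cpu]++;
}

/* Reader side: what put_swap_device() would do on exit. */
static void preempt_enable_model(int cpu)
{
	assert(preempt_count_model[cpu] > 0);
	preempt_count_model[cpu]--;
}

/*
 * Analogue of stop_machine(): fn(arg) runs only when no simulated CPU
 * is inside a preempt-disabled section.  Returns false when some CPU
 * is still inside one (where the kernel would keep waiting).
 */
static bool stop_machine_model(void (*fn)(void *), void *arg)
{
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		if (preempt_count_model[cpu] > 0)
			return false;
	if (fn)
		fn(arg);
	return true;
}

/* Example callback: mark the device invalid once all readers are out. */
static void clear_valid_model(void *arg)
{
	*(bool *)arg = false;
}
```

    The read side pays only for a counter update, with no shared reference
    count to bounce between CPUs; the whole cost of synchronization is moved
    to the rare swapoff side.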
    
    In addition to the swap_map, cluster_info, etc. data structures in
    struct swap_info_struct, the swap cache radix tree is freed after
    swapoff, so this patch also fixes the race between swap cache lookup
    and swapoff.
    
    Races between swapoff and some other swap cache users that are protected
    by disabling preemption are fixed too, by calling stop_machine() between
    clearing PageSwapCache() and freeing the swap cache data structures.
    
    An alternative implementation could replace disabling preemption with
    rcu_read_lock_sched() and stop_machine() with synchronize_sched().
    
    Link: http://lkml.kernel.org/r/20180213014220.2464-1-ying.huang@intel.com
    
    
    Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Tim Chen <tim.c.chen@linux.intel.com>
    Cc: Shaohua Li <shli@fb.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Jérôme Glisse <jglisse@redhat.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Jan Kara <jack@suse.cz>
    Cc: Dave Jiang <dave.jiang@intel.com>
    Cc: Aaron Lu <aaron.lu@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>