    slab: get_online_mems for kmem_cache_{create,destroy,shrink} · 03afc0e2
    Vladimir Davydov authored
    
    
    When we create a sl[au]b cache, we allocate kmem_cache_node structures
    for each online NUMA node.  To handle nodes being taken online/offline,
    we register a memory hotplug notifier and, for each kmem cache,
    allocate/free the kmem_cache_node corresponding to the node that
    changes its state.
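    The two allocation paths above can be sketched in user space.  This is
    an illustrative model, not kernel code: the names mirror the kernel's
    (slab_caches, init_kmem_cache_nodes, slab_mem_going_online_callback),
    but the structures and the node-online bitmap are simplified stand-ins.

    ```c
    /* User-space sketch of the per-node allocation pattern: each cache
     * keeps a kmem_cache_node pointer per NUMA node, filled in at cache
     * creation for nodes already online and by the hotplug callback for
     * nodes onlined later.  Illustrative only; not the kernel code.
     */
    #include <stdbool.h>
    #include <stdlib.h>

    #define MAX_NUMNODES 4

    struct kmem_cache_node { int nid; };

    struct kmem_cache {
        struct kmem_cache_node *node[MAX_NUMNODES];
        struct kmem_cache *next;            /* slab_caches list linkage */
    };

    static struct kmem_cache *slab_caches;  /* list of all kmem caches */
    static bool node_online[MAX_NUMNODES];  /* stand-in for node states */

    /* Cache creation path: allocate per-node parts for currently
     * online nodes. */
    static void init_kmem_cache_nodes(struct kmem_cache *s)
    {
        for (int nid = 0; nid < MAX_NUMNODES; nid++)
            if (node_online[nid] && !s->node[nid]) {
                s->node[nid] = malloc(sizeof(*s->node[nid]));
                s->node[nid]->nid = nid;
            }
    }

    /* Hotplug path: when a node goes online, give every existing
     * cache a kmem_cache_node for it. */
    static void slab_mem_going_online_callback(int nid)
    {
        node_online[nid] = true;
        for (struct kmem_cache *s = slab_caches; s; s = s->next)
            if (!s->node[nid]) {
                s->node[nid] = malloc(sizeof(*s->node[nid]));
                s->node[nid]->nid = nid;
            }
    }
    ```

    A cache created after a node is onlined picks the node up in
    init_kmem_cache_nodes(); a cache created before is fixed up by the
    callback walking slab_caches — the race described next is about a
    cache that is visible to neither path.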
    
    To synchronize the two paths we hold the slab_mutex during both the
    cache creation/destruction path and while tuning the per-node parts of
    kmem caches in the memory hotplug handler, but that's not quite right,
    because it does not guarantee that a newly created cache will have all
    kmem_cache_nodes initialized if it races with memory hotplug.  For
    instance, in case of slub:
    
        CPU0                            CPU1
        ----                            ----
        kmem_cache_create:              online_pages:
         __kmem_cache_create:            slab_memory_callback:
                                          slab_mem_going_online_callback:
                                           lock slab_mutex
                                           for each slab_caches list entry
                                               allocate kmem_cache node
                                           unlock slab_mutex
          lock slab_mutex
          init_kmem_cache_nodes:
           for_each_node_state(node, N_NORMAL_MEMORY)
               allocate kmem_cache node
          add kmem_cache to slab_caches list
          unlock slab_mutex
                                        online_pages (continued):
                                         node_states_set_node
    
    As a result we get a kmem cache that is missing the kmem_cache_node
    for the newly onlined node.
    
    To avoid issues like that we should hold get/put_online_mems() during
    the whole kmem cache creation/destruction/shrink paths, just like we
    deal with cpu hotplug.  This patch does the trick.
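    The shape of the fix can be sketched in user space with a rwlock
    standing in for the hotplug lock.  This is a model of the locking
    pattern only: the kernel's get/put_online_mems() is a dedicated
    refcount-based primitive, not a plain pthread rwlock, and the function
    bodies here are placeholders.

    ```c
    /* Sketch of the fix: get/put_online_mems() acts like a read lock
     * against memory hotplug, so the *whole* creation path -- not just
     * the slab_mutex-protected part -- runs with the set of online
     * nodes stable.  Modeled with a pthread rwlock; illustrative only.
     */
    #include <pthread.h>

    static pthread_rwlock_t mem_hotplug_lock = PTHREAD_RWLOCK_INITIALIZER;
    static pthread_mutex_t slab_mutex = PTHREAD_MUTEX_INITIALIZER;

    static void get_online_mems(void) { pthread_rwlock_rdlock(&mem_hotplug_lock); }
    static void put_online_mems(void) { pthread_rwlock_unlock(&mem_hotplug_lock); }

    void kmem_cache_create_sketch(void)
    {
        get_online_mems();          /* node set cannot change from here on */
        pthread_mutex_lock(&slab_mutex);
        /* init_kmem_cache_nodes(); add the cache to slab_caches; */
        pthread_mutex_unlock(&slab_mutex);
        put_online_mems();
    }

    /* The hotplug side takes the lock for writing, so it runs either
     * entirely before creation (the new cache then sees the node as
     * online) or entirely after (the callback walks slab_caches and
     * finds the new cache).  The window from the diagram is closed. */
    void online_pages_sketch(void)
    {
        pthread_rwlock_wrlock(&mem_hotplug_lock);
        /* slab_memory_callback(); ... node_states_set_node(); */
        pthread_rwlock_unlock(&mem_hotplug_lock);
    }
    ```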
    
    Note that after this patch is applied, there is no longer any need to
    take the slab_mutex in kmem_cache_shrink, so it is removed from there.
    
    Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: Pekka Enberg <penberg@kernel.org>
    Cc: Tang Chen <tangchen@cn.fujitsu.com>
    Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
    Cc: Toshi Kani <toshi.kani@hp.com>
    Cc: Xishi Qiu <qiuxishi@huawei.com>
    Cc: Jiang Liu <liuj97@gmail.com>
    Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Wen Congyang <wency@cn.fujitsu.com>
    Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
    Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>