    mm: batch activate_page() to reduce lock contention · eb709b0d
    Shaohua Li authored
    The zone->lru_lock is heavily contended in workloads where activate_page()
    is called frequently.  We can batch activate_page() calls to reduce the
    lock contention.  The batched pages are added to the zone list when the
    pool is full or when page reclaim tries to drain them.
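
The batching idea described above can be sketched in userspace C.  This is
only an illustration under stated assumptions, not the kernel code: the
types struct lru and struct batch, the helpers batch_activate_page() and
batch_drain(), and the BATCH_SIZE value are all hypothetical, and a plain
counter stands in for taking zone->lru_lock.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_SIZE 14	/* hypothetical pool size, chosen for illustration */

struct page {
	int active;	/* 1 once the page has been moved to the active list */
};

/* Hypothetical stand-in for a zone and its LRU lock. */
struct lru {
	unsigned long lock_acquisitions;	/* how often the lock was taken */
	unsigned long nr_active;		/* pages activated so far */
};

/* Hypothetical per-CPU pool of pages waiting to be activated. */
struct batch {
	struct page *pages[BATCH_SIZE];
	size_t count;
};

/* Activate every pooled page under a single (simulated) lock acquisition. */
static void batch_drain(struct lru *lru, struct batch *b)
{
	if (b->count == 0)
		return;

	lru->lock_acquisitions++;	/* one lock round trip for the whole pool */
	for (size_t i = 0; i < b->count; i++) {
		if (!b->pages[i]->active) {
			b->pages[i]->active = 1;
			lru->nr_active++;
		}
	}
	b->count = 0;
}

/* Queue a page for activation; drain the pool only when it becomes full. */
static void batch_activate_page(struct lru *lru, struct batch *b,
				struct page *p)
{
	assert(b->count < BATCH_SIZE);
	b->pages[b->count++] = p;
	if (b->count == BATCH_SIZE)
		batch_drain(lru, b);
}
```

In this sketch, activating 30 pages takes the simulated lock twice (two
full pools of 14), plus once more when a caller such as page reclaim drains
the remaining 2 pages, instead of 30 times without batching.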
    
    For example, on a 4-socket, 64-CPU system, create a sparse file and 64
    processes, each of which maps the file shared.  Each process reads the
    whole file and then exits.  Process exit calls unmap_vmas(), which causes
    a lot of activate_page() calls.  In such a workload, we saw about a 58%
    reduction in total time with the patch below.  Other workloads with a lot
    of activate_page() calls also benefit considerably.
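
A workload of the shape described above might look roughly like the
following C sketch.  It is not the original test program: the function name
run_shared_read_workload(), the temporary file location, and the choice to
touch one byte per page are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Create a sparse file, fork nproc children that each map it shared,
 * read it page by page, and exit.  Returns 0 if every child succeeded,
 * -1 otherwise.
 */
static int run_shared_read_workload(int nproc, size_t file_size)
{
	char path[] = "/tmp/sparse-XXXXXX";	/* assumed scratch location */
	int fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);	/* the file disappears once the last user closes it */

	/* Extending an empty file with ftruncate() leaves it sparse. */
	if (ftruncate(fd, (off_t)file_size) != 0) {
		close(fd);
		return -1;
	}

	long page_size = sysconf(_SC_PAGESIZE);
	int started = 0;
	int failed = 0;

	for (int i = 0; i < nproc; i++) {
		pid_t pid = fork();
		if (pid < 0) {
			failed = 1;
			break;
		}
		if (pid == 0) {
			void *addr = mmap(NULL, file_size, PROT_READ,
					  MAP_SHARED, fd, 0);
			if (addr == MAP_FAILED)
				_exit(1);

			/* Touch one byte per page so every page is read. */
			const volatile unsigned char *map = addr;
			unsigned long sum = 0;
			for (size_t off = 0; off < file_size;
			     off += (size_t)page_size)
				sum += map[off];

			/* Holes read back as zeros; exiting drops the mapping. */
			_exit(sum == 0 ? 0 : 2);
		}
		started++;
	}

	for (int i = 0; i < started; i++) {
		int status;
		if (wait(&status) < 0 || !WIFEXITED(status) ||
		    WEXITSTATUS(status) != 0)
			failed = 1;
	}

	close(fd);
	return failed ? -1 : 0;
}
```

Calling run_shared_read_workload(64, size) with a large size would
approximate the 64-process case; the file size used in the original test is
not stated here, so it is left as a parameter.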
    
    Andrew Morton suggested that activate_page() and putback_lru_pages()
    should follow the same path to activate pages, but this is hard to
    implement (see commit 7a608572 ("Revert "mm: batch activate_page() to
    reduce lock contention"")).  On the other hand, do we really need
    putback_lru_pages() to follow the same path?  I ran several FIO/FFSB
    benchmarks (about 20 scripts for each benchmark) on 3 machines here,
    ranging from 2 sockets to 4 sockets.  My tests show no significant
    difference with or without the patch below (there are slight differences,
    but they are mostly noise that we also saw before this patch).  The patch
    below basically returns to my first post.
    
    I tested some microbenchmarks:
      case-anon-cow-rand-mt         0.58%
      case-anon-cow-rand           -3.30%
      case-anon-cow-seq-mt         -0.51%
      case-anon-cow-seq            -5.68%
      case-anon-r-rand-mt           0.23%
      case-anon-r-rand              0.81%
      case-anon-r-seq-mt           -0.71%
      case-anon-r-seq              -1.99%
      case-anon-rx-rand-mt          2.11%
      case-anon-rx-seq-mt           3.46%
      case-anon-w-rand-mt          -0.03%
      case-anon-w-rand             -0.50%
      case-anon-w-seq-mt           -1.08%
      case-anon-w-seq              -0.12%
      case-anon-wx-rand-mt         -5.02%
      case-anon-wx-seq-mt          -1.43%
      case-fork                     1.65%
      case-fork-sleep              -0.07%
      case-fork-withmem             1.39%
      case-hugetlb                 -0.59%
      case-lru-file-mmap-read-mt   -0.54%
      case-lru-file-mmap-read       0.61%
      case-lru-file-mmap-read-rand -2.24%
      case-lru-file-readonce       -0.64%
      case-lru-file-readtwice     -11.69%
      case-lru-memcg               -1.35%
      case-mmap-pread-rand-mt       1.88%
      case-mmap-pread-rand        -15.26%
      case-mmap-pread-seq-mt        0.89%
      case-mmap-pread-seq         -69.72%
      case-mmap-xread-rand-mt       0.71%
      case-mmap-xread-seq-mt        0.38%
    
    The most significant are:
      case-lru-file-readtwice     -11.69%
      case-mmap-pread-rand        -15.26%
      case-mmap-pread-seq         -69.72%
    
    which use activate_page() a lot.  The others are basically run-to-run
    variation, because each run differs slightly.
    
    In the UP case, 'size mm/swap.o'
    before the two patches:
       text    data     bss     dec     hex filename
       6466     896       4    7366    1cc6 mm/swap.o
    after the two patches:
       text    data     bss     dec     hex filename
       6343     896       4    7243    1c4b mm/swap.o
    
    Signed-off-by: Shaohua Li <shaohua.li@intel.com>
    Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Cc: Hiroyuki Kamezawa <kamezawa.hiroyuki@gmail.com>
    Cc: Andi Kleen <andi@firstfloor.org>
    Cc: Minchan Kim <minchan.kim@gmail.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Mel Gorman <mel@csn.ul.ie>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Hugh Dickins <hughd@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>