    mm,writeback: don't use memory reserves for wb_start_writeback · 78ebc2f7
    Tetsuo Handa authored
    When a writeback operation cannot make forward progress because the
    memory allocations needed for doing I/O cannot be satisfied (e.g. in
    an OOM-livelock situation), we can observe a flood of order-0 page
    allocation failure messages caused by complete depletion of memory
    reserves.
    
    This is caused by unconditionally allocating "struct wb_writeback_work"
    objects using GFP_ATOMIC from PF_MEMALLOC context.
    
    __alloc_pages_nodemask() {
      __alloc_pages_slowpath() {
        __alloc_pages_direct_reclaim() {
          __perform_reclaim() {
            current->flags |= PF_MEMALLOC;
            try_to_free_pages() {
              do_try_to_free_pages() {
                wakeup_flusher_threads() {
                  wb_start_writeback() {
                    kzalloc(sizeof(*work), GFP_ATOMIC) {
                      /* ALLOC_NO_WATERMARKS via PF_MEMALLOC */
                    }
                  }
                }
              }
            }
            current->flags &= ~PF_MEMALLOC;
          }
        }
      }
    }
    
    Since I/O is stalling, allocating writeback requests indefinitely will
    eventually deplete memory reserves.  Fortunately, wb_start_writeback()
    can fall back to wb_wakeup() when allocating "struct wb_writeback_work"
    fails, so we don't need to allow wb_start_writeback() to use memory
    reserves.
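
    The effect can be illustrated with a small user-space toy model.  This
    is only a sketch and not the kernel change itself: every toy_* name is
    hypothetical, each work item is assumed to cost one whole page, and
    TOY_GFP_NOMEMALLOC merely stands in for a mask such as
    __GFP_NOMEMALLOC that forbids use of the reserves.  The zone starts
    exactly at its "min" watermark and I/O never completes, so allocated
    work items are never freed.

```c
#include <stdbool.h>

/* Toy user-space model. None of these names are kernel APIs. */
enum { TOY_PF_MEMALLOC = 1 << 0 };    /* caller is inside direct reclaim */
enum { TOY_GFP_NOMEMALLOC = 1 << 0 }; /* request must not touch reserves */

struct toy_zone {
    long free_pages; /* pages currently free */
    long min_pages;  /* "min" watermark; pages below it form the reserve */
};

struct toy_wb {
    int queued_work; /* work items that were allocated and queued */
    int wakeups;     /* plain flusher wakeups used as the fallback */
};

/* Take one page; only a reclaim context without NOMEMALLOC may go below min. */
static bool toy_alloc_page(struct toy_zone *z, unsigned int task_flags,
                           unsigned int gfp)
{
    bool may_use_reserves = (task_flags & TOY_PF_MEMALLOC) &&
                            !(gfp & TOY_GFP_NOMEMALLOC);
    long floor = may_use_reserves ? 0 : z->min_pages;

    if (z->free_pages <= floor)
        return false;
    z->free_pages--;
    return true;
}

/* Try to queue a work item; on allocation failure just wake the flusher. */
static void toy_start_writeback(struct toy_zone *z, struct toy_wb *wb,
                                unsigned int task_flags, unsigned int gfp)
{
    if (!toy_alloc_page(z, task_flags, gfp)) {
        wb->wakeups++;
        return;
    }
    wb->queued_work++; /* I/O is stalled, so this page never comes back */
}

/* Issue `rounds` requests from reclaim context; return the pages left free. */
static long toy_reclaim_storm(unsigned int gfp, int rounds, struct toy_wb *wb)
{
    struct toy_zone z = { .free_pages = 100, .min_pages = 100 };

    for (int i = 0; i < rounds; i++)
        toy_start_writeback(&z, wb, TOY_PF_MEMALLOC, gfp);
    return z.free_pages;
}
```

    In this model, 1000 rounds with gfp == 0 queue 100 work items and
    leave 0 free pages, while 1000 rounds with TOY_GFP_NOMEMALLOC queue
    nothing, fall back to a wakeup every time, and leave the 100 reserve
    pages untouched.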
    
      Mem-Info:
      active_anon:289393 inactive_anon:2093 isolated_anon:29
       active_file:10838 inactive_file:113013 isolated_file:859
       unevictable:0 dirty:108531 writeback:5308 unstable:0
       slab_reclaimable:5526 slab_unreclaimable:7077
       mapped:9970 shmem:2159 pagetables:2387 bounce:0
       free:3042 free_pcp:0 free_cma:0
      Node 0 DMA free:6968kB min:44kB low:52kB high:64kB active_anon:6056kB inactive_anon:176kB active_file:712kB inactive_file:744kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15988kB managed:15904kB mlocked:0kB dirty:756kB writeback:0kB mapped:736kB shmem:184kB slab_reclaimable:48kB slab_unreclaimable:208kB kernel_stack:160kB pagetables:144kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:9708 all_unreclaimable? yes
      lowmem_reserve[]: 0 1732 1732 1732
      Node 0 DMA32 free:5200kB min:5200kB low:6500kB high:7800kB active_anon:1151516kB inactive_anon:8196kB active_file:42640kB inactive_file:451076kB unevictable:0kB isolated(anon):116kB isolated(file):3564kB present:2080640kB managed:1775332kB mlocked:0kB dirty:433368kB writeback:21232kB mapped:39144kB shmem:8452kB slab_reclaimable:22056kB slab_unreclaimable:28100kB kernel_stack:20976kB pagetables:9404kB unstable:0kB bounce:0kB free_pcp:120kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2701604 all_unreclaimable? no
      lowmem_reserve[]: 0 0 0 0
      Node 0 DMA: 25*4kB (UME) 16*8kB (UME) 3*16kB (UE) 5*32kB (UME) 2*64kB (UM) 2*128kB (ME) 2*256kB (ME) 1*512kB (E) 1*1024kB (E) 2*2048kB (ME) 0*4096kB = 6964kB
      Node 0 DMA32: 925*4kB (UME) 140*8kB (UME) 5*16kB (ME) 5*32kB (M) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 5060kB
      Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
      Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
      126847 total pagecache pages
      0 pages in swap cache
      Swap cache stats: add 0, delete 0, find 0/0
      Free swap  = 0kB
      Total swap = 0kB
      524157 pages RAM
      0 pages HighMem/MovableOnly
      76348 pages reserved
      0 pages hwpoisoned
      Out of memory: Kill process 4450 (file_io.00) score 998 or sacrifice child
      Killed process 4450 (file_io.00) total-vm:4308kB, anon-rss:100kB, file-rss:1184kB, shmem-rss:0kB
      kthreadd: page allocation failure: order:0, mode:0x2200020
      file_io.00: page allocation failure: order:0, mode:0x2200020
      CPU: 0 PID: 4457 Comm: file_io.00 Not tainted 4.5.0-rc7+ #45
      Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
      Call Trace:
        warn_alloc_failed+0xf7/0x150
        __alloc_pages_nodemask+0x23f/0xa60
        alloc_pages_current+0x87/0x110
        new_slab+0x3a1/0x440
        ___slab_alloc+0x3cf/0x590
        __slab_alloc.isra.64+0x18/0x1d
        kmem_cache_alloc+0x11c/0x150
        wb_start_writeback+0x39/0x90
        wakeup_flusher_threads+0x7f/0xf0
        do_try_to_free_pages+0x1f9/0x410
        try_to_free_pages+0x94/0xc0
        __alloc_pages_nodemask+0x566/0xa60
        alloc_pages_current+0x87/0x110
        __page_cache_alloc+0xaf/0xc0
        pagecache_get_page+0x88/0x260
        grab_cache_page_write_begin+0x21/0x40
        xfs_vm_write_begin+0x2f/0xf0
        generic_perform_write+0xca/0x1c0
        xfs_file_buffered_aio_write+0xcc/0x1f0
        xfs_file_write_iter+0x84/0x140
        __vfs_write+0xc7/0x100
        vfs_write+0x9d/0x190
        SyS_write+0x50/0xc0
        entry_SYSCALL_64_fastpath+0x12/0x6a
      Mem-Info:
      active_anon:293335 inactive_anon:2093 isolated_anon:0
       active_file:10829 inactive_file:110045 isolated_file:32
       unevictable:0 dirty:109275 writeback:822 unstable:0
       slab_reclaimable:5489 slab_unreclaimable:10070
       mapped:9999 shmem:2159 pagetables:2420 bounce:0
       free:3 free_pcp:0 free_cma:0
      Node 0 DMA free:12kB min:44kB low:52kB high:64kB active_anon:6060kB inactive_anon:176kB active_file:708kB inactive_file:756kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15988kB managed:15904kB mlocked:0kB dirty:756kB writeback:0kB mapped:736kB shmem:184kB slab_reclaimable:48kB slab_unreclaimable:7160kB kernel_stack:160kB pagetables:144kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:9844 all_unreclaimable? yes
      lowmem_reserve[]: 0 1732 1732 1732
      Node 0 DMA32 free:0kB min:5200kB low:6500kB high:7800kB active_anon:1167280kB inactive_anon:8196kB active_file:42608kB inactive_file:439424kB unevictable:0kB isolated(anon):0kB isolated(file):128kB present:2080640kB managed:1775332kB mlocked:0kB dirty:436344kB writeback:3288kB mapped:39260kB shmem:8452kB slab_reclaimable:21908kB slab_unreclaimable:33120kB kernel_stack:20976kB pagetables:9536kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:11073180 all_unreclaimable? yes
      lowmem_reserve[]: 0 0 0 0
      Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
      Node 0 DMA32: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
      Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
      Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
      123086 total pagecache pages
      0 pages in swap cache
      Swap cache stats: add 0, delete 0, find 0/0
      Free swap  = 0kB
      Total swap = 0kB
      524157 pages RAM
      0 pages HighMem/MovableOnly
      76348 pages reserved
      0 pages hwpoisoned
      SLUB: Unable to allocate memory on node -1 (gfp=0x2088020)
        cache: kmalloc-64, object size: 64, buffer size: 64, default order: 0, min order: 0
        node 0: slabs: 3218, objs: 205952, free: 0
      file_io.00: page allocation failure: order:0, mode:0x2200020
      CPU: 0 PID: 4457 Comm: file_io.00 Not tainted 4.5.0-rc7+ #45
    
    Assuming that somebody will find a better solution, let's apply this
    patch for now to stop the bleeding, since this problem frequently
    prevents me from testing OOM-livelock conditions.
    
    Link: http://lkml.kernel.org/r/20160318131136.GE7152@quack.suse.cz

    Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Jan Kara <jack@suse.cz>
    Cc: Tejun Heo <tj@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>