    mm, page_alloc: reserve pageblocks for high-order atomic allocations on demand · 0aaa29a5
    Mel Gorman authored
    
    
    High-order watermark checking exists for two reasons -- kswapd high-order
    awareness and protection for high-order atomic requests.  Historically the
    kernel depended on MIGRATE_RESERVE to preserve min_free_kbytes as
    high-order free pages for as long as possible.  This patch introduces
    MIGRATE_HIGHATOMIC that reserves pageblocks for high-order atomic
    allocations on demand and avoids using those blocks for order-0
    allocations.  This is more flexible and reliable than MIGRATE_RESERVE was.
    
    A MIGRATE_HIGHATOMIC pageblock is created when an atomic high-order
    allocation request steals a pageblock, but the total number of such
    pageblocks is limited to 1% of the zone.  Callers that speculatively
    abuse atomic allocations for long-lived high-order allocations to access
    the reserve will quickly fail.  Note that SLUB is currently not such an
    abuser as it reclaims at least once.  It is possible that the stolen
    pageblock has few suitable high-order pages and that another will need
    to be stolen in the near future, but there would need to be strong
    justification to search all pageblocks for an ideal candidate.
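
    As a rough illustration of the 1% cap, the reservation decision can be
    modelled in userspace C.  This is only a sketch: struct zone_model,
    try_reserve_pageblock() and the 512-page pageblock size are hypothetical
    names and values chosen for the example, and the model ignores locking
    and the actual pageblock stealing.

```c
#include <stdbool.h>

/* Assumed for the example: 512 pages per pageblock (2 MiB with 4 KiB pages). */
#define MODEL_PAGEBLOCK_NR_PAGES 512UL

/* Hypothetical, simplified stand-in for the per-zone accounting. */
struct zone_model {
    unsigned long managed_pages;          /* pages managed by the zone */
    unsigned long nr_reserved_highatomic; /* pages in reserved pageblocks */
};

/*
 * Decide whether one more pageblock may be reserved for high-order atomic
 * allocations.  The reservation is refused once the reserve has reached
 * roughly 1% of the zone.
 */
static bool try_reserve_pageblock(struct zone_model *z)
{
    unsigned long cap = z->managed_pages / 100;

    if (z->nr_reserved_highatomic >= cap)
        return false;

    z->nr_reserved_highatomic += MODEL_PAGEBLOCK_NR_PAGES;
    return true;
}
```

    For a zone with 1,000,000 managed pages the cap is 10,000 pages, so the
    model grants 20 reservations (the last one starts at 9,728 reserved
    pages, just under the cap) and refuses the 21st.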
    
    The pageblocks are unreserved if an allocation fails after a direct
    reclaim attempt.
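
    The unreserve-and-retry behaviour can be sketched in the same spirit.
    The names below (struct pool_model, alloc_after_direct_reclaim() and so
    on) are hypothetical, reclaim itself is not modelled, and the sketch
    assumes one pageblock is handed back per failed attempt.

```c
#include <stdbool.h>

/* Assumed for the example: 512 pages per pageblock. */
#define POOL_PAGEBLOCK_NR_PAGES 512UL

/* Hypothetical split of a zone's free pages into two pools. */
struct pool_model {
    unsigned long free_general;  /* free pages usable by any request */
    unsigned long free_reserved; /* free pages in reserved pageblocks */
};

/* Try to satisfy a request from the general pool only. */
static bool pool_try_alloc(struct pool_model *p, unsigned long nr_pages)
{
    if (p->free_general < nr_pages)
        return false;
    p->free_general -= nr_pages;
    return true;
}

/* Hand at most one reserved pageblock back to the general pool. */
static void pool_unreserve_one(struct pool_model *p)
{
    unsigned long n = p->free_reserved < POOL_PAGEBLOCK_NR_PAGES ?
                      p->free_reserved : POOL_PAGEBLOCK_NR_PAGES;

    p->free_reserved -= n;
    p->free_general += n;
}

/*
 * Called after direct reclaim has already run: if the allocation still
 * fails, release a reserved pageblock and retry once.
 */
static bool alloc_after_direct_reclaim(struct pool_model *p,
                                       unsigned long nr_pages)
{
    if (pool_try_alloc(p, nr_pages))
        return true;

    pool_unreserve_one(p);
    return pool_try_alloc(p, nr_pages);
}
```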
    
    The watermark checks account for the reserved pageblocks when the
    allocation request is not a high-order atomic allocation.
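
    A minimal model of that accounting follows.  watermark_ok_model() and
    its parameters are hypothetical; the real check considers more state
    (per-order free counts, lowmem reserves, other allocation flags) that is
    omitted here.

```c
#include <stdbool.h>

/*
 * Simplified watermark check.  Requests that are not high-order atomic
 * allocations must treat the reserved pages as unavailable, so the reserve
 * is subtracted from the free page count before comparing with the
 * watermark.
 */
static bool watermark_ok_model(unsigned long free_pages,
                               unsigned long nr_reserved_highatomic,
                               unsigned long min_watermark,
                               bool high_order_atomic)
{
    if (!high_order_atomic) {
        /* Guard against unsigned underflow when the reserve dominates. */
        if (free_pages <= nr_reserved_highatomic)
            return false;
        free_pages -= nr_reserved_highatomic;
    }

    return free_pages > min_watermark;
}
```

    With 1000 free pages, 512 of them reserved, and a watermark of 600, a
    regular request fails the check (488 usable pages) while a high-order
    atomic request passes it.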
    
    The reserved pageblocks cannot be used for order-0 allocations.  This may
    allow temporary wastage until a failed reclaim reassigns the pageblock.
    This is deliberate as the intent of the reservation is to satisfy a
    limited number of atomic high-order short-lived requests if the system
    requires them.
    
    The stutter benchmark was used to evaluate this.  While it was running,
    a systemtap script randomly allocated between 1 high-order page and
    12.5% of memory's worth of order-3 pages using GFP_ATOMIC.  This is much
    larger than the potential reserve and it does not attempt to be
    realistic.  It is intended to stress random high-order allocations from
    an unknown source and to show that there is a reduction in failures
    without introducing an anomaly where atomic allocations are more
    reliable than regular allocations.  The amount of memory reserved varied
    throughout the workload as reserves were created and reclaimed under
    memory pressure.  The allocation failure rates once the workload warmed
    up were as follows:
    
    4.2-rc5-vanilla		70%
    4.2-rc5-atomic-reserve	56%
    
    The failure rate was also measured while building multiple kernels.  The
    failure rate was 14% without this patch and 6% with it applied.
    
    Overall, this is a small reduction but the reserves are small relative to
    the number of allocation requests.  In early versions of the patch, the
    failure rate reduced by a much larger amount but that required much larger
    reserves and perversely made atomic allocations seem more reliable than
    regular allocations.
    
    [yalin.wang2010@gmail.com: fix redundant check and a memory leak]
    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Vitaly Wool <vitalywool@gmail.com>
    Cc: Rik van Riel <riel@redhat.com>
    Signed-off-by: yalin wang <yalin.wang2010@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>