mm/internal.h · d84d940005fb003e51cfe1e7d4a442de42c4d01c · Librem5 / linux

mm, compaction: capture a page under direct compaction · 2b8934b0

Mel Gorman authored Jan 10, 2019

Compaction is inherently race-prone as a suitable page freed during
compaction can be allocated by any parallel task. This patch uses a
capture_control structure to isolate a page immediately when it is freed
by a direct compactor in the slow path of the page allocator. The intent
is to avoid redundant scanning.
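
The capture idea can be modelled in userspace as a minimal sketch. Everything
below is a simplified stand-in rather than the kernel code: the struct fields,
the `current_capture` pointer, the array-based free list and the helper names
(`try_capture`, `free_page_model`, `compact_and_capture`) are hypothetical, and
the exact-order match is an assumption made for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FREE_LIST_MAX 16

/* Illustrative stand-in for a free block; not the kernel's struct page. */
struct page {
    unsigned int order;            /* block size as a power of two */
};

/* Illustrative model of the capture state a direct compactor publishes. */
struct capture_control {
    unsigned int wanted_order;     /* order the compactor wants to allocate */
    struct page *page;             /* filled in when a freed page is captured */
};

/* Models a per-task pointer; NULL means nobody is capturing. */
static struct capture_control *current_capture;

/* Models the shared free list that parallel allocators could take from. */
static struct page *free_list[FREE_LIST_MAX];
static size_t free_count;

/* Hand a freshly freed page straight to the compactor if it is suitable. */
static bool try_capture(struct page *page)
{
    struct capture_control *capc = current_capture;

    if (!capc || capc->page)       /* no capture in progress, or already done */
        return false;
    if (page->order != capc->wanted_order)
        return false;              /* assumption: only an exact order match */
    capc->page = page;
    return true;
}

/* Free path: a captured page never reaches the shared free list. */
static void free_page_model(struct page *page)
{
    if (try_capture(page))
        return;
    if (free_count < FREE_LIST_MAX)
        free_list[free_count++] = page;
}

/* Direct compaction: publish the capture state, free pages, collect result. */
static struct page *compact_and_capture(unsigned int order,
                                        struct page *freed[], size_t n)
{
    struct capture_control capc = { .wanted_order = order, .page = NULL };

    assert(current_capture == NULL);   /* this model has no nested capture */
    current_capture = &capc;
    for (size_t i = 0; i < n; i++)
        free_page_model(freed[i]);
    current_capture = NULL;
    return capc.page;                  /* NULL if nothing suitable was freed */
}
```

In this model the first page of the wanted order that is freed while the
capture state is published goes directly to the compactor, and every other
page lands on the shared list as usual, so a parallel allocator cannot take
the captured page and force another scan.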

                                     4.20.0                 4.20.0
                            selective-v2r15          capture-v2r15
Amean     fault-both-1         0.00 (   0.00%)        0.00 *   0.00%*
Amean     fault-both-3      2624.85 (   0.00%)     2594.49 (   1.16%)
Amean     fault-both-5      3842.66 (   0.00%)     4088.32 (  -6.39%)
Amean     fault-both-7      5459.47 (   0.00%)     5936.54 (  -8.74%)
Amean     fault-both-12     9276.60 (   0.00%)    10160.85 (  -9.53%)
Amean     fault-both-18    14030.73 (   0.00%)    13908.92 (   0.87%)
Amean     fault-both-24    13298.10 (   0.00%)    16819.86 * -26.48%*
Amean     fault-both-30    17648.62 (   0.00%)    17901.74 (  -1.43%)
Amean     fault-both-32    19161.67 (   0.00%)    18621.32 (   2.82%)

Latency is only moderately affected but the devil is in the details. A
closer examination indicates that base page fault latency is much reduced
but latency of huge pages is increased as it takes greater care to
succeed. Part of the "problem" is that allocation success rates are close
to 100% even when under pressure and compaction gets harder.

                                4.20.0                 4.20.0
                       selective-v2r15          capture-v2r15
Percentage huge-1         0.00 (   0.00%)        0.00 (   0.00%)
Percentage huge-3        99.95 (   0.00%)       99.98 (   0.03%)
Percentage huge-5        98.83 (   0.00%)       98.01 (  -0.84%)
Percentage huge-7        96.78 (   0.00%)       98.30 (   1.58%)
Percentage huge-12       98.85 (   0.00%)       97.76 (  -1.10%)
Percentage huge-18       97.52 (   0.00%)       99.05 (   1.57%)
Percentage huge-24       97.07 (   0.00%)       99.34 (   2.35%)
Percentage huge-30       96.59 (   0.00%)       99.08 (   2.58%)
Percentage huge-32       95.94 (   0.00%)       99.03 (   3.22%)

And scan rates are reduced as expected by 10% for the migration scanner
and 37% for the free scanner indicating that there is less redundant work.

Compaction migrate scanned    20338945.00    18133661.00
Compaction free scanned       12590377.00     7986174.00
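
The reductions quoted for the two scanners can be checked from the counters
above. This is a small arithmetic sketch; the helper name `reduction_pct` is
hypothetical, and it assumes the first column is the selective-v2r15 baseline
and the second is capture-v2r15, matching the column order of the earlier
tables.

```c
#include <stdio.h>

/* Percentage by which `after` is lower than `before`. */
static double reduction_pct(double before, double after)
{
    return (before - after) / before * 100.0;
}

/* Print the reduction for both scanners using the counters reported above. */
static void print_scan_reductions(void)
{
    printf("migrate scanner: %.1f%% fewer pages scanned\n",
           reduction_pct(20338945.0, 18133661.0));
    printf("free scanner:    %.1f%% fewer pages scanned\n",
           reduction_pct(12590377.0, 7986174.0));
}
```

The computed values come out at roughly 10.8% and 36.6%, in line with the
approximate figures quoted for the migration and free scanners.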

The impact on 2-socket machines is much larger albeit not presented. Under a
different workload that fragments heavily, the allocation latency is
reduced by 26% while the success rate goes from 63% to 80%.

Link: http://lkml.kernel.org/r/20190104125011.16071-25-mgorman@techsingularity.net

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
