    mm, sparsemem: break out of loops early · c4e1be9e
    Dave Hansen authored
    There are a number of times that we loop over NR_MEM_SECTIONS, looking
    for section_present() on each section.  But, when we have very large
    physical address spaces (large MAX_PHYSMEM_BITS), NR_MEM_SECTIONS
    becomes very large, making the loops quite long.
    
    With MAX_PHYSMEM_BITS=46 and a section size of 128MB, the current loops
    are 512k iterations, which we barely notice on modern hardware.  But,
    raising MAX_PHYSMEM_BITS higher (like we will see on systems that
    support 5-level paging) makes this 64x longer and we start to notice,
    especially on slower systems like simulators.  A 10-second delay for
    512k iterations is annoying.  But, a 640-second delay is crippling.
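
    The iteration counts above follow directly from the section size.  A
    minimal sketch of the arithmetic, assuming 128MB sections (2^27 bytes)
    and taking 52 physical address bits as the larger value implied by the
    "64x" figure; nr_mem_sections() is a hypothetical helper for
    illustration, not a kernel function:

```c
#include <stdio.h>

/* 128MB sections are 2^27 bytes, as stated in the text above. */
#define SECTION_SIZE_BITS 27

/*
 * Hypothetical helper: how many sections a physical address space of
 * max_physmem_bits bits is divided into.
 */
static unsigned long nr_mem_sections(unsigned int max_physmem_bits)
{
	return 1UL << (max_physmem_bits - SECTION_SIZE_BITS);
}

/* Print the section count for one address width. */
static void report(unsigned int max_physmem_bits)
{
	printf("MAX_PHYSMEM_BITS=%u -> %lu sections\n",
	       max_physmem_bits, nr_mem_sections(max_physmem_bits));
}
```

    Calling report(46) and report(52) shows the two loop lengths; the
    ratio between them is 2^(52-46) = 64, which is where the jump from a
    10-second to a 640-second delay comes from.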
    
    This does not help if we have extremely sparse physical address spaces,
    but those are quite rare.  We expect that most of the "slow" systems
    where this matters will also be quite small and non-sparse.
    
    To fix this, we track the highest section we've ever encountered.  This
    lets us know when we will *never* see another section_present(), and
    lets us break out of the loops earlier.
    
    Doing the whole for_each_present_section_nr() macro is probably
    overkill, but it will ensure that any future loop iterations that we
    grow are more likely to be correct.
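
    The idea can be sketched in userspace roughly as follows.  This is a
    simplified toy model, not the code in this patch: the present[] array,
    NR_SECTIONS, highest_present_nr, mark_present(), next_present() and
    for_each_present() are all hypothetical stand-ins chosen for
    illustration.

```c
#include <stdbool.h>

/* Toy stand-in for NR_MEM_SECTIONS (46 address bits, 128MB sections). */
#define NR_SECTIONS (1 << 19)

/* Toy stand-in for the per-section "present" state. */
static bool present[NR_SECTIONS];

/* Highest section number ever marked present; -1 means none yet. */
static int highest_present_nr = -1;

/* Mark a section present and remember the highest one seen so far. */
static void mark_present(int nr)
{
	if (nr < 0 || nr >= NR_SECTIONS)
		return;
	if (nr > highest_present_nr)
		highest_present_nr = nr;
	present[nr] = true;
}

/*
 * Return the next present section after nr, or -1 if there is none.
 * The scan stops at highest_present_nr instead of NR_SECTIONS, which
 * is the early break-out described above.
 */
static int next_present(int nr)
{
	while (++nr <= highest_present_nr) {
		if (present[nr])
			return nr;
	}
	return -1;
}

/* Visit only present sections, starting at section number "start". */
#define for_each_present(start, nr)			\
	for ((nr) = next_present((start) - 1);		\
	     (nr) >= 0;					\
	     (nr) = next_present(nr))

/* Example walker: count the present sections. */
static int count_present(void)
{
	int nr, count = 0;

	for_each_present(0, nr)
		count++;
	return count;
}
```

    With only a few low-numbered sections marked present, count_present()
    touches entries up to the highest marked section and then stops,
    rather than walking all NR_SECTIONS entries.  As noted earlier, this
    gives no benefit when a present section sits near the very top of a
    sparse address space.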
    
    Kirill said "It shaved almost 40 seconds from boot time in qemu with
    5-level paging enabled for me".
    
    Link: http://lkml.kernel.org/r/20170504174434.C45A4735@viggo.jf.intel.com
    Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mmzone.h 38.3 KB