    x86/mm/init: Remove freed kernel image areas from alias mapping · c2d73c25
    Dave Hansen authored
    commit c40a56a7 upstream.
    
    The kernel image is mapped into two places in the virtual address space
    (addresses without KASLR, of course):
    
    	1. The kernel direct map (0xffff880000000000)
    	2. The "high kernel map" (0xffffffff81000000)
    
    We actually execute out of #2.  If we get the address of a kernel symbol,
    it points to #2, but almost all physical-to-virtual translations point to #1.
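
As a rough illustration of the two aliases, the sketch below converts a high-map address to a physical address and then to its direct-map alias. The helper names are hypothetical, the 512M window is read off the page table dumps in this message, and it assumes no KASLR and a kernel loaded at its default physical address (a relocation offset of zero). It is userspace arithmetic, not the kernel's own translation helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Non-KASLR bases; the high-map window matches the dump in this message. */
#define DIRECT_MAP_BASE 0xffff880000000000ULL /* alias #1 */
#define HIGH_MAP_BASE   0xffffffff80000000ULL /* alias #2 window start */
#define HIGH_MAP_SIZE   0x20000000ULL         /* 512M, ends at 0xffffffffa0000000 */

/* Hypothetical helper: high-map virtual address -> physical address.
 * Assumes the image is not relocated (offset of zero). */
static uint64_t high_va_to_phys(uint64_t va)
{
    assert(va >= HIGH_MAP_BASE && va - HIGH_MAP_BASE < HIGH_MAP_SIZE);
    return va - HIGH_MAP_BASE;
}

/* Hypothetical helper: physical address -> direct-map virtual address. */
static uint64_t phys_to_direct_va(uint64_t phys)
{
    return DIRECT_MAP_BASE + phys;
}
```

Under these assumptions, the start of the kernel image at 0xffffffff81000000 sits at physical 16M and is also reachable through the direct map at 0xffff880001000000.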
    
    Parts of the "high kernel map" alias are mapped in the userspace page
    tables with the Global bit for performance reasons.  The parts that we map
    to userspace do not (er, should not) have secrets.  When PTI is enabled,
    the Global bit is usually not set in the high mapping; it is only used to
    compensate for poor performance on systems which lack PCID.
    
    This is fine, except that some areas in the kernel image that are adjacent
    to the non-secret-containing areas are unused holes.  We free these holes
    back into the normal page allocator and reuse them as normal kernel memory.
    The memory will, of course, get *used* via the normal map, but the alias
    mapping is kept.
    
    This otherwise unused alias mapping of the holes will, by default, keep the
    Global bit, be mapped out to userspace, and be vulnerable to Meltdown.
    
    Remove the alias mapping of these pages entirely.  This is likely to
    fracture the 2M pages mapping the kernel image near these areas, but this
    should affect only a small part of the image.
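
To get a feel for the cost of the fracturing, the sketch below counts, for a given hole, how many 2M mappings would be split into 4K PTEs and how many PTEs would be cleared. The struct and function names are hypothetical, and the model assumes every 2M region touched by the hole starts out as a single large page, which is not true for every range in the dumps that follow.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4K 0x1000ULL
#define SZ_2M 0x200000ULL

/* Hypothetical summary of what unmapping [start, end) would cost. */
struct split_cost {
    uint64_t pmds_split;   /* 2M mappings only partly covered by the hole */
    uint64_t pmds_whole;   /* 2M mappings fully covered, no split needed */
    uint64_t ptes_cleared; /* 4K entries cleared inside the split mappings */
};

static struct split_cost unmap_cost(uint64_t start, uint64_t end)
{
    struct split_cost c = { 0, 0, 0 };

    assert(start < end && start % SZ_4K == 0 && end % SZ_4K == 0);

    /* Walk each 2M-aligned region that overlaps the hole. */
    for (uint64_t pmd = start & ~(SZ_2M - 1); pmd < end; pmd += SZ_2M) {
        uint64_t lo = start > pmd ? start : pmd;
        uint64_t hi = end < pmd + SZ_2M ? end : pmd + SZ_2M;

        if (hi - lo == SZ_2M) {
            c.pmds_whole++;
        } else {
            c.pmds_split++;
            c.ptes_cleared += (hi - lo) / SZ_4K;
        }
    }
    return c;
}
```

For the 1504K hole at 0xffffffff82488000-0xffffffff82600000 shown in the dumps, this model reports one split 2M mapping and 376 cleared PTEs.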
    
    The pageattr code changes *all* aliases mapping the physical pages that it
    operates on (by default).  We only want to modify a single alias, so we
    need to tweak its behavior.
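
The difference between changing all aliases and changing a single alias can be modeled in a few lines. This is a toy userspace model with hypothetical names, not the pageattr code: each page frame has one "present" bit per alias, and a flag selects whether the direct-map alias follows the change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PAGES 16 /* toy physical memory: 16 page frames */

/* One "present" bit per page frame for each of the two aliases. */
struct toy_mappings {
    bool direct_map[NR_PAGES];
    bool high_map[NR_PAGES];
};

static void toy_map_all(struct toy_mappings *m)
{
    for (size_t i = 0; i < NR_PAGES; i++) {
        m->direct_map[i] = true;
        m->high_map[i] = true;
    }
}

/*
 * Clear "present" for frames [first, first + count) in the high map.
 * With follow_aliases set, the direct-map alias is cleared as well,
 * which models the default "change every alias" behavior.
 */
static void toy_unmap_high(struct toy_mappings *m, size_t first,
                           size_t count, bool follow_aliases)
{
    assert(first <= NR_PAGES && count <= NR_PAGES - first);

    for (size_t i = first; i < first + count; i++) {
        m->high_map[i] = false;
        if (follow_aliases)
            m->direct_map[i] = false;
    }
}
```

Calling toy_unmap_high() with follow_aliases set to false leaves the direct-map alias intact, which is what we need here: the freed pages must stay reachable through the direct map so the page allocator can hand them out.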
    
    This unmapping behavior is currently dependent on PTI being in place.
    Going forward, we should at least consider doing this for all
    configurations.  Having an extra read-write alias for memory is not exactly
    ideal for debugging things like random memory corruption and this does
    undercut features like DEBUG_PAGEALLOC or future work like eXclusive Page
    Frame Ownership (XPFO).
    
    Before this patch:
    
    current_kernel:---[ High Kernel Mapping ]---
    current_kernel-0xffffffff80000000-0xffffffff81000000          16M                               pmd
    current_kernel-0xffffffff81000000-0xffffffff81e00000          14M     ro         PSE     GLB x  pmd
    current_kernel-0xffffffff81e00000-0xffffffff81e11000          68K     ro                 GLB x  pte
    current_kernel-0xffffffff81e11000-0xffffffff82000000        1980K     RW                     NX pte
    current_kernel-0xffffffff82000000-0xffffffff82600000           6M     ro         PSE     GLB NX pmd
    current_kernel-0xffffffff82600000-0xffffffff82c00000           6M     RW         PSE         NX pmd
    current_kernel-0xffffffff82c00000-0xffffffff82e00000           2M     RW                     NX pte
    current_kernel-0xffffffff82e00000-0xffffffff83200000           4M     RW         PSE         NX pmd
    current_kernel-0xffffffff83200000-0xffffffffa0000000         462M                               pmd
    
      current_user:---[ High Kernel Mapping ]---
      current_user-0xffffffff80000000-0xffffffff81000000          16M                               pmd
      current_user-0xffffffff81000000-0xffffffff81e00000          14M     ro         PSE     GLB x  pmd
      current_user-0xffffffff81e00000-0xffffffff81e11000          68K     ro                 GLB x  pte
      current_user-0xffffffff81e11000-0xffffffff82000000        1980K     RW                     NX pte
      current_user-0xffffffff82000000-0xffffffff82600000           6M     ro         PSE     GLB NX pmd
      current_user-0xffffffff82600000-0xffffffffa0000000         474M                               pmd
    
    After this patch:
    
    current_kernel:---[ High Kernel Mapping ]---
    current_kernel-0xffffffff80000000-0xffffffff81000000          16M                               pmd
    current_kernel-0xffffffff81000000-0xffffffff81e00000          14M     ro         PSE     GLB x  pmd
    current_kernel-0xffffffff81e00000-0xffffffff81e11000          68K     ro                 GLB x  pte
    current_kernel-0xffffffff81e11000-0xffffffff82000000        1980K                               pte
    current_kernel-0xffffffff82000000-0xffffffff82400000           4M     ro         PSE     GLB NX pmd
    current_kernel-0xffffffff82400000-0xffffffff82488000         544K     ro                     NX pte
    current_kernel-0xffffffff82488000-0xffffffff82600000        1504K                               pte
    current_kernel-0xffffffff82600000-0xffffffff82c00000           6M     RW         PSE         NX pmd
    current_kernel-0xffffffff82c00000-0xffffffff82c0d000          52K     RW                     NX pte
    current_kernel-0xffffffff82c0d000-0xffffffff82dc0000        1740K                               pte
    
      current_user:---[ High Kernel Mapping ]---
      current_user-0xffffffff80000000-0xffffffff81000000          16M                               pmd
      current_user-0xffffffff81000000-0xffffffff81e00000          14M     ro         PSE     GLB x  pmd
      current_user-0xffffffff81e00000-0xffffffff81e11000          68K     ro                 GLB x  pte
      current_user-0xffffffff81e11000-0xffffffff82000000        1980K                               pte
      current_user-0xffffffff82000000-0xffffffff82400000           4M     ro         PSE     GLB NX pmd
      current_user-0xffffffff82400000-0xffffffff82488000         544K     ro                     NX pte
      current_user-0xffffffff82488000-0xffffffff82600000        1504K                               pte
      current_user-0xffffffff82600000-0xffffffffa0000000         474M                               pmd
    
    [ tglx: Do not unmap on 32bit as there is only one mapping ]
    
    Fixes: 0f561fce ("x86/pti: Enable global pages for shared areas")
    Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: Kees Cook <keescook@google.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Juergen Gross <jgross@suse.com>
    Cc: Josh Poimboeuf <jpoimboe@redhat.com>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Andy Lutomirski <luto@kernel.org>
    Cc: Andi Kleen <ak@linux.intel.com>
    Cc: Joerg Roedel <jroedel@suse.de>
    Link: https://lkml.kernel.org/r/20180802225831.5F6A2BFC@viggo.jf.intel.com
    
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>