1. 24 Aug, 2018 1 commit
    • Dave Hansen's avatar
      x86/mm/init: Add helper for freeing kernel image pages · a01cdb47
      Dave Hansen authored
      commit 6ea2738e
      
       upstream.
      
      When chunks of the kernel image are freed, free_init_pages() is used
      directly.  Consolidate the three sites that do this.  Also update the
      string to give an incrementally better description of that memory versus
      what was there before.
      Signed-off-by: default avatarDave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: keescook@google.com
      Cc: aarcange@redhat.com
      Cc: jgross@suse.com
      Cc: jpoimboe@redhat.com
      Cc: gregkh@linuxfoundation.org
      Cc: peterz@infradead.org
      Cc: hughd@google.com
      Cc: torvalds@linux-foundation.org
      Cc: bp@alien8.de
      Cc: luto@kernel.org
      Cc: ak@linux.intel.com
      Cc: Kees Cook <keescook@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Link: https://lkml.kernel.org/r/20180802225829.FE0E32EA@viggo.jf.intel.com
      
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a01cdb47
  2. 15 Aug, 2018 2 commits
    • Jiri Kosina's avatar
      x86/bugs, kvm: Introduce boot-time control of L1TF mitigations · 82abbe0e
      Jiri Kosina authored
      commit d90a7a0e
      
       upstream.
      
      Introduce the 'l1tf=' kernel command line option to allow for boot-time
      switching of mitigation that is used on processors affected by L1TF.
      
      The possible values are:
      
        full
      	Provides all available mitigations for the L1TF vulnerability. Disables
      	SMT and enables all mitigations in the hypervisors. SMT control via
      	/sys/devices/system/cpu/smt/control is still possible after boot.
      	Hypervisors will issue a warning when the first VM is started in
      	a potentially insecure configuration, i.e. SMT enabled or L1D flush
      	disabled.
      
        full,force
      	Same as 'full', but disables SMT control. Implies the 'nosmt=force'
      	command line option. sysfs control of SMT and the hypervisor flush
      	control is disabled.
      
        flush
      	Leaves SMT enabled and enables the conditional hypervisor mitigation.
      	Hypervisors will issue a warning when the first VM is started in a
      	potentially insecure configuration, i.e. SMT enabled or L1D flush
      	disabled.
      
        flush,nosmt
      	Disables SMT and enables the conditional hypervisor mitigation. SMT
      	control via /sys/devices/system/cpu/smt/control is still possible
      	after boot. If SMT is reenabled or flushing disabled at runtime
      	hypervisors will issue a warning.
      
        flush,nowarn
      	Same as 'flush', but hypervisors will not warn when
      	a VM is started in a potentially insecure configuration.
      
        off
      	Disables hypervisor mitigations and doesn't emit any warnings.
      
      Default is 'flush'.
      
      Let KVM adhere to these semantics, which means:
      
        - 'lt1f=full,force'	: Performe L1D flushes. No runtime control
          			  possible.
      
        - 'l1tf=full'
        - 'l1tf-flush'
        - 'l1tf=flush,nosmt'	: Perform L1D flushes and warn on VM start if
      			  SMT has been runtime enabled or L1D flushing
      			  has been run-time enabled
      
        - 'l1tf=flush,nowarn'	: Perform L1D flushes and no warnings are emitted.
      
        - 'l1tf=off'		: L1D flushes are not performed and no warnings
      			  are emitted.
      
      KVM can always override the L1D flushing behavior using its 'vmentry_l1d_flush'
      module parameter except when lt1f=full,force is set.
      
      This makes KVM's private 'nosmt' option redundant, and as it is a bit
      non-systematic anyway (this is something to control globally, not on
      hypervisor level), remove that option.
      
      Add the missing Documentation entry for the l1tf vulnerability sysfs file
      while at it.
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Tested-by: default avatarJiri Kosina <jkosina@suse.cz>
      Reviewed-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Link: https://lkml.kernel.org/r/20180713142323.202758176@linutronix.de
      
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      82abbe0e
    • Andi Kleen's avatar
      x86/speculation/l1tf: Add sysfs reporting for l1tf · dcd1b109
      Andi Kleen authored
      commit 17dbca11
      
       upstream.
      
      L1TF core kernel workarounds are cheap and normally always enabled, However
      they still should be reported in sysfs if the system is vulnerable or
      mitigated. Add the necessary CPU feature/bug bits.
      
      - Extend the existing checks for Meltdowns to determine if the system is
        vulnerable. All CPUs which are not vulnerable to Meltdown are also not
        vulnerable to L1TF
      
      - Check for 32bit non PAE and emit a warning as there is no practical way
        for mitigation due to the limited physical address bits
      
      - If the system has more than MAX_PA/2 physical memory the invert page
        workarounds don't protect the system against the L1TF attack anymore,
        because an inverted physical address will also point to valid
        memory. Print a warning in this case and report that the system is
        vulnerable.
      
      Add a function which returns the PFN limit for the L1TF mitigation, which
      will be used in follow up patches for sanity and range checks.
      
      [ tglx: Renamed the CPU feature bit to L1TF_PTEINV ]
      Signed-off-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: default avatarDave Hansen <dave.hansen@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dcd1b109
  3. 14 Jun, 2018 1 commit
    • Linus Torvalds's avatar
      Kbuild: rename CC_STACKPROTECTOR[_STRONG] config variables · 050e9baa
      Linus Torvalds authored
      
      
      The changes to automatically test for working stack protector compiler
      support in the Kconfig files removed the special STACKPROTECTOR_AUTO
      option that picked the strongest stack protector that the compiler
      supported.
      
      That was all a nice cleanup - it makes no sense to have the AUTO case
      now that the Kconfig phase can just determine the compiler support
      directly.
      
      HOWEVER.
      
      It also meant that doing "make oldconfig" would now _disable_ the strong
      stackprotector if you had AUTO enabled, because in a legacy config file,
      the sane stack protector configuration would look like
      
        CONFIG_HAVE_CC_STACKPROTECTOR=y
        # CONFIG_CC_STACKPROTECTOR_NONE is not set
        # CONFIG_CC_STACKPROTECTOR_REGULAR is not set
        # CONFIG_CC_STACKPROTECTOR_STRONG is not set
        CONFIG_CC_STACKPROTECTOR_AUTO=y
      
      and when you ran this through "make oldconfig" with the Kbuild changes,
      it would ask you about the regular CONFIG_CC_STACKPROTECTOR (that had
      been renamed from CONFIG_CC_STACKPROTECTOR_REGULAR to just
      CONFIG_CC_STACKPROTECTOR), but it would think that the STRONG version
      used to be disabled (because it was really enabled by AUTO), and would
      disable it in the new config, resulting in:
      
        CONFIG_HAVE_CC_STACKPROTECTOR=y
        CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
        CONFIG_CC_STACKPROTECTOR=y
        # CONFIG_CC_STACKPROTECTOR_STRONG is not set
        CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
      
      That's dangerously subtle - people could suddenly find themselves with
      the weaker stack protector setup without even realizing.
      
      The solution here is to just rename not just the old RECULAR stack
      protector option, but also the strong one.  This does that by just
      removing the CC_ prefix entirely for the user choices, because it really
      is not about the compiler support (the compiler support now instead
      automatially impacts _visibility_ of the options to users).
      
      This results in "make oldconfig" actually asking the user for their
      choice, so that we don't have any silent subtle security model changes.
      The end result would generally look like this:
      
        CONFIG_HAVE_CC_STACKPROTECTOR=y
        CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
        CONFIG_STACKPROTECTOR=y
        CONFIG_STACKPROTECTOR_STRONG=y
        CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
      
      where the "CC_" versions really are about internal compiler
      infrastructure, not the user selections.
      Acked-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      050e9baa
  4. 13 May, 2018 1 commit
  5. 06 May, 2018 1 commit
  6. 17 Apr, 2018 1 commit
  7. 16 Mar, 2018 2 commits
  8. 17 Feb, 2018 1 commit
  9. 15 Feb, 2018 2 commits
  10. 13 Feb, 2018 1 commit
    • David Woodhouse's avatar
      Revert "x86/speculation: Simplify indirect_branch_prediction_barrier()" · f208820a
      David Woodhouse authored
      This reverts commit 64e16720
      
      .
      
      We cannot call C functions like that, without marking all the
      call-clobbered registers as, well, clobbered. We might have got away
      with it for now because the __ibp_barrier() function was *fairly*
      unlikely to actually use any other registers. But no. Just no.
      Signed-off-by: default avatarDavid Woodhouse <dwmw@amazon.co.uk>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: arjan.van.de.ven@intel.com
      Cc: dave.hansen@intel.com
      Cc: jmattson@google.com
      Cc: karahmed@amazon.de
      Cc: kvm@vger.kernel.org
      Cc: pbonzini@redhat.com
      Cc: rkrcmar@redhat.com
      Cc: sironi@amazon.de
      Link: http://lkml.kernel.org/r/1518305967-31356-3-git-send-email-dwmw@amazon.co.uk
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      f208820a
  11. 30 Jan, 2018 1 commit
  12. 27 Jan, 2018 1 commit
  13. 15 Jan, 2018 1 commit
    • Kees Cook's avatar
      x86: Implement thread_struct whitelist for hardened usercopy · f7d83c1c
      Kees Cook authored
      
      
      This whitelists the FPU register state portion of the thread_struct for
      copying to userspace, instead of the default entire struct. This is needed
      because FPU register state is dynamically sized, so it doesn't bypass the
      hardened usercopy checks.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Mathias Krause <minipli@googlemail.com>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Acked-by: default avatarRik van Riel <riel@redhat.com>
      f7d83c1c
  14. 23 Dec, 2017 1 commit
    • Andy Lutomirski's avatar
      x86/pti: Put the LDT in its own PGD if PTI is on · f55f0501
      Andy Lutomirski authored
      
      
      With PTI enabled, the LDT must be mapped in the usermode tables somewhere.
      The LDT is per process, i.e. per mm.
      
      An earlier approach mapped the LDT on context switch into a fixmap area,
      but that's a big overhead and exhausted the fixmap space when NR_CPUS got
      big.
      
      Take advantage of the fact that there is an address space hole which
      provides a completely unused pgd. Use this pgd to manage per-mm LDT
      mappings.
      
      This has a down side: the LDT isn't (currently) randomized, and an attack
      that can write the LDT is instant root due to call gates (thanks, AMD, for
      leaving call gates in AMD64 but designing them wrong so they're only useful
      for exploits).  This can be mitigated by making the LDT read-only or
      randomizing the mapping, either of which is strightforward on top of this
      patch.
      
      This will significantly slow down LDT users, but that shouldn't matter for
      important workloads -- the LDT is only used by DOSEMU(2), Wine, and very
      old libc implementations.
      
      [ tglx: Cleaned it up. ]
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      f55f0501
  15. 22 Dec, 2017 1 commit
    • Dave Hansen's avatar
      x86/entry: Rename SYSENTER_stack to CPU_ENTRY_AREA_entry_stack · 4fe2d8b1
      Dave Hansen authored
      
      
      If the kernel oopses while on the trampoline stack, it will print
      "<SYSENTER>" even if SYSENTER is not involved.  That is rather confusing.
      
      The "SYSENTER" stack is used for a lot more than SYSENTER now.  Give it a
      better string to display in stack dumps, and rename the kernel code to
      match.
      
      Also move the 32-bit code over to the new naming even though it still uses
      the entry stack only for SYSENTER.
      Signed-off-by: default avatarDave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      4fe2d8b1
  16. 17 Dec, 2017 8 commits
    • Thomas Gleixner's avatar
      x86/cpufeatures: Make CPU bugs sticky · 6cbd2171
      Thomas Gleixner authored
      
      
      There is currently no way to force CPU bug bits like CPU feature bits. That
      makes it impossible to set a bug bit once at boot and have it stick for all
      upcoming CPUs.
      
      Extend the force set/clear arrays to handle bug bits as well.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150606.992156574@linutronix.de
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      6cbd2171
    • Andy Lutomirski's avatar
      x86/entry/64: Make cpu_entry_area.tss read-only · c482feef
      Andy Lutomirski authored
      
      
      The TSS is a fairly juicy target for exploits, and, now that the TSS
      is in the cpu_entry_area, it's no longer protected by kASLR.  Make it
      read-only on x86_64.
      
      On x86_32, it can't be RO because it's written by the CPU during task
      switches, and we use a task gate for double faults.  I'd also be
      nervous about errata if we tried to make it RO even on configurations
      without double fault handling.
      
      [ tglx: AMD confirmed that there is no problem on 64-bit with TSS RO.  So
        	it's probably safe to assume that it's a non issue, though Intel
        	might have been creative in that area. Still waiting for
        	confirmation. ]
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarBorislav Petkov <bpetkov@suse.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150606.733700132@linutronix.de
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      c482feef
    • Andy Lutomirski's avatar
      x86/entry: Clean up the SYSENTER_stack code · 0f9a4810
      Andy Lutomirski authored
      
      
      The existing code was a mess, mainly because C arrays are nasty.  Turn
      SYSENTER_stack into a struct, add a helper to find it, and do all the
      obvious cleanups this enables.
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarBorislav Petkov <bpetkov@suse.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150606.653244723@linutronix.de
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      0f9a4810
    • Andy Lutomirski's avatar
      x86/entry/64: Remove the SYSENTER stack canary · 7fbbd5cb
      Andy Lutomirski authored
      
      
      Now that the SYSENTER stack has a guard page, there's no need for a canary
      to detect overflow after the fact.
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150606.572577316@linutronix.de
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      7fbbd5cb
    • Andy Lutomirski's avatar
      x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0 · 9aaefe7b
      Andy Lutomirski authored
      
      
      On 64-bit kernels, we used to assume that TSS.sp0 was the current
      top of stack.  With the addition of an entry trampoline, this will
      no longer be the case.  Store the current top of stack in TSS.sp1,
      which is otherwise unused but shares the same cacheline.
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150606.050864668@linutronix.de
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      9aaefe7b
    • Andy Lutomirski's avatar
      x86/entry: Move SYSENTER_stack to the beginning of struct tss_struct · 1a935bc3
      Andy Lutomirski authored
      
      
      SYSENTER_stack should have reliable overflow detection, which
      means that it needs to be at the bottom of a page, not the top.
      Move it to the beginning of struct tss_struct and page-align it.
      
      Also add an assertion to make sure that the fixed hardware TSS
      doesn't cross a page boundary.
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.881827433@linutronix.de
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      1a935bc3
    • Andy Lutomirski's avatar
      x86/entry: Fix assumptions that the HW TSS is at the beginning of cpu_tss · 7fb983b4
      Andy Lutomirski authored
      
      
      A future patch will move SYSENTER_stack to the beginning of cpu_tss
      to help detect overflow.  Before this can happen, fix several code
      paths that hardcode assumptions about the old layout.
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarBorislav Petkov <bp@suse.de>
      Reviewed-by: default avatarDave Hansen <dave.hansen@intel.com>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.722425540@linutronix.de
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      7fb983b4
    • Andy Lutomirski's avatar
      x86/entry/64: Allocate and enable the SYSENTER stack · 1a79797b
      Andy Lutomirski authored
      
      
      This will simplify future changes that want scratch variables early in
      the SYSENTER handler -- they'll be able to spill registers to the
      stack.  It also lets us get rid of a SWAPGS_UNSAFE_STACK user.
      
      This does not depend on CONFIG_IA32_EMULATION=y because we'll want the
      stack space even without IA32 emulation.
      
      As far as I can tell, the reason that this wasn't done from day 1 is
      that we use IST for #DB and #BP, which is IMO rather nasty and causes
      a lot more problems than it solves.  But, since #DB uses IST, we don't
      actually need a real stack for SYSENTER (because SYSENTER with TF set
      will invoke #DB on the IST stack rather than the SYSENTER stack).
      
      I want to remove IST usage from these vectors some day, and this patch
      is a prerequisite for that as well.
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.312726423@linutronix.de
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      1a79797b
  17. 17 Nov, 2017 1 commit
    • Andi Kleen's avatar
      x86/topology: Avoid wasting 128k for package id array · 30bb9811
      Andi Kleen authored
      Analyzing large early boot allocations unveiled the logical package id
      storage as a prominent memory waste. Since commit 1f12e32f
      
      
      ("x86/topology: Create logical package id") every 64-bit system allocates a
      128k array to convert logical package ids.
      
      This happens because the array is sized for MAX_LOCAL_APIC which is always
      32k on 64bit systems, and it needs 4 bytes for each entry.
      
      This is fairly wasteful, especially for the common case of having only one
      socket, which uses exactly 4 byte out of 128K.
      
      There is no user of the package id map which is performance critical, so
      the lookup is not required to be O(1). Store the logical processor id in
      cpu_data and use a loop based lookup.
      
      To keep the mapping stable accross cpu hotplug operations, add a flag to
      cpu_data which is set when the CPU is brought up the first time. When the
      flag is set, then cpu_data is not reinitialized by copying boot_cpu_data on
      subsequent bringups.
      
      [ tglx: Rename the flag to 'initialized', use proper pointers instead of
        	repeated cpu_data(x) evaluation and massage changelog. ]
      Signed-off-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarPrarit Bhargava <prarit@redhat.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: He Chen <he.chen@linux.intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Piotr Luc <piotr.luc@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Mathias Krause <minipli@googlemail.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Link: https://lkml.kernel.org/r/20171114124257.22013-3-prarit@redhat.com
      30bb9811
  18. 02 Nov, 2017 6 commits
    • Greg Kroah-Hartman's avatar
      License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman authored
      
      
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information it it.
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information,
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier to be applied to
      a file was done in a spreadsheet of side by side results from of the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few 1000 files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging was:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there was new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
      Reviewed-by: default avatarKate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: default avatarPhilippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2441318
    • Andy Lutomirski's avatar
      x86/traps: Use a new on_thread_stack() helper to clean up an assertion · 3383642c
      Andy Lutomirski authored
      
      
      Let's keep the stack-related logic together rather than open-coding
      a comparison in an assertion in the traps code.
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Reviewed-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/856b15bee1f55017b8f79d3758b0d51c48a08cf8.1509609304.git.luto@kernel.org
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      3383642c
    • Andy Lutomirski's avatar
      x86/entry/64: Remove thread_struct::sp0 · d375cf15
      Andy Lutomirski authored
      
      
      On x86_64, we can easily calculate sp0 when needed instead of
      storing it in thread_struct.
      
      On x86_32, a similar cleanup would be possible, but it would require
      cleaning up the vm86 code first, and that can wait for a later
      cleanup series.
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/719cd9c66c548c4350d98a90f050aee8b17f8919.1509609304.git.luto@kernel.org
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      d375cf15
    • Andy Lutomirski's avatar
      x86/entry: Add task_top_of_stack() to find the top of a task's stack · 3500130b
      Andy Lutomirski authored
      
      
      This will let us get rid of a few places that hardcode accesses to
      thread.sp0.
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/b49b3f95a8ff858c40c9b0f5b32be0355324327d.1509609304.git.luto@kernel.org
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      3500130b
    • Andy Lutomirski's avatar
      x86/entry/64: Pass SP0 directly to load_sp0() · da51da18
      Andy Lutomirski authored
      
      
      load_sp0() had an odd signature:
      
        void load_sp0(struct tss_struct *tss, struct thread_struct *thread);
      
      Simplify it to:
      
        void load_sp0(unsigned long sp0);
      
      Also simplify a few get_cpu()/put_cpu() sequences to
      preempt_disable()/preempt_enable().
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Reviewed-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/2655d8b42ed940aa384fe18ee1129bbbcf730a08.1509609304.git.luto@kernel.org
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      da51da18
    • Andy Lutomirski's avatar
      x86/entry/32: Pull the MSR_IA32_SYSENTER_CS update code out of native_load_sp0() · bd7dc5a6
      Andy Lutomirski authored
      
      
      This causes the MSR_IA32_SYSENTER_CS write to move out of the
      paravirt callback.  This shouldn't affect Xen PV: Xen already ignores
      MSR_IA32_SYSENTER_ESP writes.  In any event, Xen doesn't support
      vm86() in a useful way.
      
      Note to any potential backporters: This patch won't break lguest, as
      lguest didn't have any SYSENTER support at all.
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/75cf09fe03ae778532d0ca6c65aa58e66bc2f90c.1509609304.git.luto@kernel.org
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      bd7dc5a6
  19. 23 Sep, 2017 1 commit
    • Josh Poimboeuf's avatar
      x86/asm: Fix inline asm call constraints for Clang · f5caf621
      Josh Poimboeuf authored
      
      
      For inline asm statements which have a CALL instruction, we list the
      stack pointer as a constraint to convince GCC to ensure the frame
      pointer is set up first:
      
        static inline void foo()
        {
      	register void *__sp asm(_ASM_SP);
      	asm("call bar" : "+r" (__sp))
        }
      
      Unfortunately, that pattern causes Clang to corrupt the stack pointer.
      
      The fix is easy: convert the stack pointer register variable to a global
      variable.
      
      It should be noted that the end result is different based on the GCC
      version.  With GCC 6.4, this patch has exactly the same result as
      before:
      
      	defconfig	defconfig-nofp	distro		distro-nofp
       before	9820389		9491555		8816046		8516940
       after	9820389		9491555		8816046		8516940
      
      With GCC 7.2, however, GCC's behavior has changed.  It now changes its
      behavior based on the conversion of the register variable to a global.
      That somehow convinces it to *always* set up the frame pointer before
      inserting *any* inline asm.  (Therefore, listing the variable as an
      output constraint is a no-op and is no longer necessary.)  It's a bit
      overkill, but the performance impact should be negligible.  And in fact,
      there's a nice improvement with frame pointers disabled:
      
      	defconfig	defconfig-nofp	distro		distro-nofp
       before	9796316		9468236		9076191		8790305
       after	9796957		9464267		9076381		8785949
      
      So in summary, while listing the stack pointer as an output constraint
      is no longer necessary for newer versions of GCC, it's still needed for
      older versions.
      Suggested-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Reported-by: default avatarMatthias Kaehlcke <mka@chromium.org>
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/3db862e970c432ae823cf515c52b54fec8270e0e.1505942196.git.jpoimboe@redhat.com
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      f5caf621
  20. 24 Aug, 2017 1 commit
  21. 21 Jul, 2017 3 commits
    • Kirill A. Shutemov's avatar
      x86/mm: Allow userspace have mappings above 47-bit · ee00f4a3
      Kirill A. Shutemov authored
      
      
      All bits and pieces are now in place and we can allow userspace to have VMAs
      above 47 bits.
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20170716225954.74185-8-kirill.shutemov@linux.intel.com
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      ee00f4a3
    • Kirill A. Shutemov's avatar
      x86/mm: Prepare to expose larger address space to userspace · b569bab7
      Kirill A. Shutemov authored
      
      
      On x86, 5-level paging enables 56-bit userspace virtual address space.
      Not all user space is ready to handle wide addresses. It's known that
      at least some JIT compilers use higher bits in pointers to encode their
      information. It collides with valid pointers with 5-level paging and
      leads to crashes.
      
      To mitigate this, we are not going to allocate virtual address space
      above 47-bit by default.
      
      But userspace can ask for allocation from full address space by
      specifying hint address (with or without MAP_FIXED) above 47-bits.
      
      If hint address set above 47-bit, but MAP_FIXED is not specified, we try
      to look for unmapped area by specified address. If it's already
      occupied, we look for unmapped area in *full* address space, rather than
      from 47-bit window.
      
      A high hint address would only affect the allocation in question, but not
      any future mmap()s.
      
      Specifying high hint address on older kernel or on machine without 5-level
      paging support is safe. The hint will be ignored and kernel will fall back
      to allocation from 47-bit address space.
      
      This approach helps to easily make application's memory allocator aware
      about large address space without manually tracking allocated virtual
      address space.
      
      The patch puts all machinery in place, but not yet allows userspace to have
      mappings above 47-bit -- TASK_SIZE_MAX has to be raised to get the effect.
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20170716225954.74185-7-kirill.shutemov@linux.intel.com
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      b569bab7
    • Kirill A. Shutemov's avatar
      x86/mpx: Do not allow MPX if we have mappings above 47-bit · 44b04912
      Kirill A. Shutemov authored
      
      
      MPX (without MAWA extension) cannot handle addresses above 47 bits, so we
      need to make sure that MPX cannot be enabled if we already have a VMA above
      the boundary and forbid creating such VMAs once MPX is enabled.
      
      The patch implements mpx_unmapped_area_check() which is called from all
      variants of get_unmapped_area() to check if the requested address fits
      mpx.
      
      On enabling MPX, we check if we already have any vma above 47-bit
      boundary and forbit the enabling if we do.
      
      As long as DEFAULT_MAP_WINDOW is equal to TASK_SIZE_MAX, the change is
      nop. It will change when we allow userspace to have mappings above
      47-bits.
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20170716225954.74185-6-kirill.shutemov@linux.intel.com
      
      
      [ Readability edits. ]
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      44b04912
  22. 18 Jul, 2017 2 commits
    • Tom Lendacky's avatar
      x86/mm: Add SME support for read_cr3_pa() · eef9c4ab
      Tom Lendacky authored
      
      
      The CR3 register entry can contain the SME encryption mask that indicates
      the PGD is encrypted.  The encryption mask should not be used when
      creating a virtual address from the CR3 register, so remove the SME
      encryption mask in the read_cr3_pa() function.
      
      During early boot SME will need to use a native version of read_cr3_pa(),
      so create native_read_cr3_pa().
      Signed-off-by: default avatarTom Lendacky <thomas.lendacky@amd.com>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Larry Woodman <lwoodman@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Toshimitsu Kani <toshi.kani@hpe.com>
      Cc: kasan-dev@googlegroups.com
      Cc: kvm@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-efi@vger.kernel.org
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/767b085c384a46f67f451f8589903a462c7ff68a.1500319216.git.thomas.lendacky@amd.com
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      eef9c4ab
    • Tom Lendacky's avatar
      x86/mm: Provide general kernel support for memory encryption · 21729f81
      Tom Lendacky authored
      
      
      Changes to the existing page table macros will allow the SME support to
      be enabled in a simple fashion with minimal changes to files that use these
      macros.  Since the memory encryption mask will now be part of the regular
      pagetable macros, we introduce two new macros (_PAGE_TABLE_NOENC and
      _KERNPG_TABLE_NOENC) to allow for early pagetable creation/initialization
      without the encryption mask before SME becomes active.  Two new pgprot()
      macros are defined to allow setting or clearing the page encryption mask.
      
      The FIXMAP_PAGE_NOCACHE define is introduced for use with MMIO.  SME does
      not support encryption for MMIO areas so this define removes the encryption
      mask from the page attribute.
      
      Two new macros are introduced (__sme_pa() / __sme_pa_nodebug()) to allow
      creating a physical address with the encryption mask.  These are used when
      working with the cr3 register so that the PGD can be encrypted. The current
      __va() macro is updated so that the virtual address is generated based off
      of the physical address without the encryption mask thus allowing the same
      virtual address to be generated regardless of whether encryption is enabled
      for that physical location or not.
      
      Also, an early initialization function is added for SME.  If SME is active,
      this function:
      
       - Updates the early_pmd_flags so that early page faults create mappings
         with the encryption mask.
      
       - Updates the __supported_pte_mask to include the encryption mask.
      
       - Updates the protection_map entries to include the encryption mask so
         that user-space allocations will automatically have the encryption mask
         applied.
      Signed-off-by: default avatarTom Lendacky <thomas.lendacky@amd.com>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Larry Woodman <lwoodman@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Toshimitsu Kani <toshi.kani@hpe.com>
      Cc: kasan-dev@googlegroups.com
      Cc: kvm@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-efi@vger.kernel.org
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/b36e952c4c39767ae7f0a41cf5345adf27438480.1500319216.git.thomas.lendacky@amd.com
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      21729f81