  1. Dec 05, 2021
    • x86/sme: Explicitly map new EFI memmap table as encrypted · 1ff2fc02
      Tom Lendacky authored
      Reserving memory using efi_mem_reserve() calls into the x86
      efi_arch_mem_reserve() function. This function will insert a new EFI
      memory descriptor into the EFI memory map representing the area of
      memory to be reserved and marking it as EFI runtime memory. As part
      of adding this new entry, a new EFI memory map is allocated and mapped.
      The mapping is where a problem can occur. This new memory map is mapped
      using early_memremap() and generally mapped encrypted, unless the new
      memory for the mapping happens to come from an area of memory that is
      marked as EFI_BOOT_SERVICES_DATA memory. In this case, the new memory will
      be mapped unencrypted. However, during replacement of the old memory map,
      efi_mem_type() is disabled, so the new memory map will now be long-term
      mapped encrypted (in efi.memmap), resulting in the map containing invalid
      data and causing the kernel boot to crash.
      
      Since it is known that the area will be mapped encrypted going forward,
      explicitly map the new memory map as encrypted using early_memremap_prot().
      
      Cc: <stable@vger.kernel.org> # 4.14.x
      Fixes: 8f716c9b ("x86/mm: Add support to access boot related data in the clear")
      Link: https://lore.kernel.org/all/ebf1eb2940405438a09d51d121ec0d02c8755558.1634752931.git.thomas.lendacky@amd.com/
      
      
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      [ardb: incorporate Kconfig fix by Arnd]
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      1ff2fc02
    • KVM: SVM: Do not terminate SEV-ES guests on GHCB validation failure · ad5b3532
      Tom Lendacky authored
      
      Currently, an SEV-ES guest is terminated if the validation of the VMGEXIT
      exit code or exit parameters fails.
      
      The VMGEXIT instruction can be issued from userspace, even though
      userspace (likely) can't update the GHCB. To prevent userspace from being
      able to kill the guest, return an error through the GHCB when validation
      fails rather than terminating the guest. For cases where the GHCB can't be
      updated (e.g. the GHCB can't be mapped, etc.), just return back to the
      guest.
      
      The new error codes are documented in the latest update to the GHCB
      specification.
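
The behavior described above can be sketched as a small userspace model. This is only an illustration: `struct ghcb_model`, `handle_vmgexit()`, `exit_code_is_known()`, and the `MODEL_ERR_*` values are hypothetical names and placeholder numbers, not the kernel's definitions or the values from the GHCB specification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the shared GHCB page. */
struct ghcb_model {
    uint64_t sw_exit_code;
    uint64_t sw_exit_info_1;
    uint64_t sw_exit_info_2;
};

/* Placeholder values for illustration; the real codes live in the GHCB spec. */
#define MODEL_ERR_INDICATOR     2u
#define MODEL_ERR_INVALID_EVENT 1u

/* Placeholder validation: pretend only exit codes 0..3 are supported. */
static bool exit_code_is_known(uint64_t code)
{
    return code < 4;
}

/* Returns 1 to mean "resume the guest"; this model never terminates it. */
int handle_vmgexit(struct ghcb_model *ghcb)
{
    if (ghcb == NULL)
        return 1; /* GHCB could not be mapped: just go back to the guest */

    if (!exit_code_is_known(ghcb->sw_exit_code)) {
        /* Report the failure through the GHCB instead of killing the guest. */
        ghcb->sw_exit_info_1 = MODEL_ERR_INDICATOR;
        ghcb->sw_exit_info_2 = MODEL_ERR_INVALID_EVENT;
        return 1;
    }

    ghcb->sw_exit_info_1 = 0;
    ghcb->sw_exit_info_2 = 0;
    return 1;
}
```

In this model a bad request from guest userspace leaves an error for the guest kernel to inspect, and every path returns to the guest.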
      
      Fixes: 291bd20d ("KVM: SVM: Add initial support for a VMGEXIT VMEXIT")
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <b57280b5562893e2616257ac9c2d4525a9aeeb42.1638471124.git.thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      ad5b3532
    • KVM: SEV: Fall back to vmalloc for SEV-ES scratch area if necessary · a655276a
      Sean Christopherson authored
      
      Use kvzalloc() to allocate KVM's buffer for SEV-ES's GHCB scratch area so
      that KVM falls back to __vmalloc() if physically contiguous memory isn't
      available.  The buffer is purely a KVM software construct, i.e. there's
      no need for it to be physically contiguous.
      
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20211109222350.2266045-3-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a655276a
    • KVM: SEV: Return appropriate error codes if SEV-ES scratch setup fails · 75236f5f
      Sean Christopherson authored
      
      Return appropriate error codes if setting up the GHCB scratch area for an
      SEV-ES guest fails.  In particular, returning -EINVAL instead of -ENOMEM
      when allocating the kernel buffer could be confusing as userspace would
      likely suspect a guest issue.
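
A minimal userspace sketch of the distinction, assuming a hypothetical `setup_scratch()` helper and a made-up `MODEL_SCRATCH_MAX` limit (neither is the kernel's code): bad guest input maps to -EINVAL, while a failed host allocation maps to -ENOMEM.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_SCRATCH_MAX 4096u /* hypothetical upper bound on the scratch size */

/* On success stores a zeroed buffer in *out and returns 0. */
int setup_scratch(uint64_t guest_len, void **out)
{
    void *buf;

    /* The guest asked for something invalid: that is a guest problem. */
    if (guest_len == 0 || guest_len > MODEL_SCRATCH_MAX)
        return -EINVAL;

    /* The host could not allocate: that is a host resource problem. */
    buf = calloc(1, (size_t)guest_len);
    if (buf == NULL)
        return -ENOMEM;

    *out = buf;
    return 0;
}
```

A caller that sees -ENOMEM knows to look at host memory pressure rather than at the guest's request.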
      
      Fixes: 8f423a80 ("KVM: SVM: Support MMIO for an SEV-ES guest")
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20211109222350.2266045-2-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      75236f5f
  2. Dec 04, 2021
    • parisc: Mark cr16 CPU clocksource unstable on all SMP machines · afdb4a5b
      Helge Deller authored
      
      In commit c8c37359 ("parisc: Enhance detection of synchronous cr16
      clocksources") I assumed that CPUs on the same physical core are synchronous.
      While booting up the kernel on two different C8000 machines, one with a
      dual-core PA8800 and one with a dual-core PA8900 CPU, this turned out to be
      wrong. The symptom was that I saw a jump in the internal clocks printed to the
      syslog and strange overall behaviour.  On machines which have 4 cores (2
      dual-cores) the problem isn't visible, because the current logic already marked
      the cr16 clocksource unstable in this case.
      
      This patch now marks the cr16 interval timers unstable if we have more than one
      CPU in the system, and it fixes this issue.
      
      Fixes: c8c37359 ("parisc: Enhance detection of synchronous cr16 clocksources")
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: <stable@vger.kernel.org> # v5.15+
      afdb4a5b
    • parisc: Fix "make install" on newer debian releases · 0f9fee4c
      Helge Deller authored
      
      On newer debian releases the debian-provided "installkernel" script is
      installed in /usr/sbin. Fix the kernel install.sh script to look for the
      script in this directory as well.
      
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: <stable@vger.kernel.org> # v3.13+
      0f9fee4c
  3. Dec 03, 2021
  4. Dec 02, 2021
    • s390: update defconfigs · 3c088b1e
      Heiko Carstens authored
      
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
      3c088b1e
    • arm64: ftrace: add missing BTIs · 35b6b28e
      Mark Rutland authored
      
      When branch target identifiers are in use, code reachable via an
      indirect branch requires a BTI landing pad at the branch target site.
      
      When building FTRACE_WITH_REGS atop patchable-function-entry, we miss
      BTIs at the start of the `ftrace_caller` and `ftrace_regs_caller`
      trampolines, and when these are called from a module via a PLT (which
      will use a `BR X16`), we will encounter a BTI failure, e.g.
      
      | # insmod lkdtm.ko
      | lkdtm: No crash points registered, enable through debugfs
      | # echo function_graph > /sys/kernel/debug/tracing/current_tracer
      | # cat /sys/kernel/debug/provoke-crash/DIRECT
      | Unhandled 64-bit el1h sync exception on CPU0, ESR 0x34000001 -- BTI
      | CPU: 0 PID: 174 Comm: cat Not tainted 5.16.0-rc2-dirty #3
      | Hardware name: linux,dummy-virt (DT)
      | pstate: 60400405 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=jc)
      | pc : ftrace_caller+0x0/0x3c
      | lr : lkdtm_debugfs_open+0xc/0x20 [lkdtm]
      | sp : ffff800012e43b00
      | x29: ffff800012e43b00 x28: 0000000000000000 x27: ffff800012e43c88
      | x26: 0000000000000000 x25: 0000000000000000 x24: ffff0000c171f200
      | x23: ffff0000c27b1e00 x22: ffff0000c2265240 x21: ffff0000c23c8c30
      | x20: ffff8000090ba380 x19: 0000000000000000 x18: 0000000000000000
      | x17: 0000000000000000 x16: ffff80001002bb4c x15: 0000000000000000
      | x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000900ff0
      | x11: ffff0000c4166310 x10: ffff800012e43b00 x9 : ffff8000104f2384
      | x8 : 0000000000000001 x7 : 0000000000000000 x6 : 000000000000003f
      | x5 : 0000000000000040 x4 : ffff800012e43af0 x3 : 0000000000000001
      | x2 : ffff8000090b0000 x1 : ffff0000c171f200 x0 : ffff0000c23c8c30
      | Kernel panic - not syncing: Unhandled exception
      | CPU: 0 PID: 174 Comm: cat Not tainted 5.16.0-rc2-dirty #3
      | Hardware name: linux,dummy-virt (DT)
      | Call trace:
      |  dump_backtrace+0x0/0x1a4
      |  show_stack+0x24/0x30
      |  dump_stack_lvl+0x68/0x84
      |  dump_stack+0x1c/0x38
      |  panic+0x168/0x360
      |  arm64_exit_nmi.isra.0+0x0/0x80
      |  el1h_64_sync_handler+0x68/0xd4
      |  el1h_64_sync+0x78/0x7c
      |  ftrace_caller+0x0/0x3c
      |  do_dentry_open+0x134/0x3b0
      |  vfs_open+0x38/0x44
      |  path_openat+0x89c/0xe40
      |  do_filp_open+0x8c/0x13c
      |  do_sys_openat2+0xbc/0x174
      |  __arm64_sys_openat+0x6c/0xbc
      |  invoke_syscall+0x50/0x120
      |  el0_svc_common.constprop.0+0xdc/0x100
      |  do_el0_svc+0x84/0xa0
      |  el0_svc+0x28/0x80
      |  el0t_64_sync_handler+0xa8/0x130
      |  el0t_64_sync+0x1a0/0x1a4
      | SMP: stopping secondary CPUs
      | Kernel Offset: disabled
      | CPU features: 0x0,00000f42,da660c5f
      | Memory Limit: none
      | ---[ end Kernel panic - not syncing: Unhandled exception ]---
      
      Fix this by adding the required `BTI C`, as we only require these to be
      reachable via BL for direct calls or BR X16/X17 for PLTs. For now, these
      are open-coded in the function prologue, matching the style of the
      `__hwasan_tag_mismatch` trampoline.
      
      In future we may wish to consider adding a new SYM_CODE_START_*()
      variant which has an implicit BTI.
      
      When ftrace is built atop mcount, the trampolines are marked with
      SYM_FUNC_START(), and so get an implicit BTI. We may need to change
      these over to SYM_CODE_START() in future for RELIABLE_STACKTRACE, in
      case we need to apply special care around the return address being
      rewritten.
      
      Fixes: 97fed779 ("arm64: bti: Provide Kconfig for kernel mode BTI")
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Mark Brown <broonie@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Reviewed-by: Mark Brown <broonie@kernel.org>
      Link: https://lore.kernel.org/r/20211129135709.2274019-1-mark.rutland@arm.com
      
      
      Signed-off-by: Will Deacon <will@kernel.org>
      35b6b28e
    • arm64: kexec: use __pa_symbol(empty_zero_page) · 2f218324
      Mark Rutland authored
      
      In machine_kexec_post_load() we use __pa() on `empty_zero_page`, so that
      we can use the physical address during arm64_relocate_new_kernel() to
      switch TTBR1 to a new set of tables. While `empty_zero_page` is part of
      the old kernel, we won't clobber it until after this switch, so using it
      is benign.
      
      However, `empty_zero_page` is part of the kernel image rather than a
      linear map address, so it is not correct to use __pa(x), and we should
      instead use __pa_symbol(x) or __pa(lm_alias(x)). Otherwise, when the
      kernel is built with DEBUG_VIRTUAL, we'll encounter splats as below, as
      I've seen when fuzzing v5.16-rc3 with Syzkaller:
      
      | ------------[ cut here ]------------
      | virt_to_phys used for non-linear address: 000000008492561a (empty_zero_page+0x0/0x1000)
      | WARNING: CPU: 3 PID: 11492 at arch/arm64/mm/physaddr.c:15 __virt_to_phys+0x120/0x1c0 arch/arm64/mm/physaddr.c:12
      | CPU: 3 PID: 11492 Comm: syz-executor.0 Not tainted 5.16.0-rc3-00001-g48bd452a045c #1
      | Hardware name: linux,dummy-virt (DT)
      | pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      | pc : __virt_to_phys+0x120/0x1c0 arch/arm64/mm/physaddr.c:12
      | lr : __virt_to_phys+0x120/0x1c0 arch/arm64/mm/physaddr.c:12
      | sp : ffff80001af17bb0
      | x29: ffff80001af17bb0 x28: ffff1cc65207b400 x27: ffffb7828730b120
      | x26: 0000000000000e11 x25: 0000000000000000 x24: 0000000000000001
      | x23: ffffb7828963e000 x22: ffffb78289644000 x21: 0000600000000000
      | x20: 000000000000002d x19: 0000b78289644000 x18: 0000000000000000
      | x17: 74706d6528206131 x16: 3635323934383030 x15: 303030303030203a
      | x14: 1ffff000035e2eb8 x13: ffff6398d53f4f0f x12: 1fffe398d53f4f0e
      | x11: 1fffe398d53f4f0e x10: ffff6398d53f4f0e x9 : ffffb7827c6f76dc
      | x8 : ffff1cc6a9fa7877 x7 : 0000000000000001 x6 : ffff6398d53f4f0f
      | x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffff1cc66f2a99c0
      | x2 : 0000000000040000 x1 : d7ce7775b09b5d00 x0 : 0000000000000000
      | Call trace:
      |  __virt_to_phys+0x120/0x1c0 arch/arm64/mm/physaddr.c:12
      |  machine_kexec_post_load+0x284/0x670 arch/arm64/kernel/machine_kexec.c:150
      |  do_kexec_load+0x570/0x670 kernel/kexec.c:155
      |  __do_sys_kexec_load kernel/kexec.c:250 [inline]
      |  __se_sys_kexec_load kernel/kexec.c:231 [inline]
      |  __arm64_sys_kexec_load+0x1d8/0x268 kernel/kexec.c:231
      |  __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline]
      |  invoke_syscall+0x90/0x2e0 arch/arm64/kernel/syscall.c:52
      |  el0_svc_common.constprop.2+0x1e4/0x2f8 arch/arm64/kernel/syscall.c:142
      |  do_el0_svc+0xf8/0x150 arch/arm64/kernel/syscall.c:181
      |  el0_svc+0x60/0x248 arch/arm64/kernel/entry-common.c:603
      |  el0t_64_sync_handler+0x90/0xb8 arch/arm64/kernel/entry-common.c:621
      |  el0t_64_sync+0x180/0x184 arch/arm64/kernel/entry.S:572
      | irq event stamp: 2428
      | hardirqs last  enabled at (2427): [<ffffb7827c6f2308>] __up_console_sem+0xf0/0x118 kernel/printk/printk.c:255
      | hardirqs last disabled at (2428): [<ffffb7828223df98>] el1_dbg+0x28/0x80 arch/arm64/kernel/entry-common.c:375
      | softirqs last  enabled at (2424): [<ffffb7827c411c00>] softirq_handle_end kernel/softirq.c:401 [inline]
      | softirqs last  enabled at (2424): [<ffffb7827c411c00>] __do_softirq+0xa28/0x11e4 kernel/softirq.c:587
      | softirqs last disabled at (2417): [<ffffb7827c59015c>] do_softirq_own_stack include/asm-generic/softirq_stack.h:10 [inline]
      | softirqs last disabled at (2417): [<ffffb7827c59015c>] invoke_softirq kernel/softirq.c:439 [inline]
      | softirqs last disabled at (2417): [<ffffb7827c59015c>] __irq_exit_rcu kernel/softirq.c:636 [inline]
      | softirqs last disabled at (2417): [<ffffb7827c59015c>] irq_exit_rcu+0x53c/0x688 kernel/softirq.c:648
      | ---[ end trace 0ca578534e7ca938 ]---
      
      With or without DEBUG_VIRTUAL __pa() will fall back to __kimg_to_phys()
      for non-linear addresses, and will happen to do the right thing in this
      case, even with the warning. But we should not depend upon this, and to
      keep the warning useful we should fix this case.
      
      Fix this issue by using __pa_symbol(), which handles kernel image
      addresses (and checks its input is a kernel image address). This matches
      what we do elsewhere, e.g. in arch/arm64/include/asm/pgtable.h:
      
      | #define ZERO_PAGE(vaddr)       phys_to_page(__pa_symbol(empty_zero_page))
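
The difference between the two translations can be modeled in userspace. This is a sketch under stated assumptions: every `MODEL_*` address and size is made up, and `model_pa()` / `model_pa_symbol()` are hypothetical stand-ins that only mimic the behavior described above (a linear-map helper that warns and falls back for kernel image addresses, and a symbol helper that insists on a kernel image address).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Invented layout: a linear map region and a separate kernel image region. */
#define MODEL_PAGE_OFFSET  0xffff000000000000ULL /* start of linear map */
#define MODEL_LINEAR_SIZE  0x0000800000000000ULL
#define MODEL_PHYS_OFFSET  0x0000000040000000ULL /* phys address of linear start */
#define MODEL_KIMAGE_VADDR 0xffff800008000000ULL /* start of kernel image */
#define MODEL_KIMAGE_SIZE  0x0000000002000000ULL
#define MODEL_KIMAGE_PHYS  0x0000000040200000ULL /* phys address of image start */

int model_warnings; /* counts DEBUG_VIRTUAL-style complaints */

static bool is_linear(uint64_t va)
{
    return va >= MODEL_PAGE_OFFSET && va - MODEL_PAGE_OFFSET < MODEL_LINEAR_SIZE;
}

static bool is_kimage(uint64_t va)
{
    return va >= MODEL_KIMAGE_VADDR && va - MODEL_KIMAGE_VADDR < MODEL_KIMAGE_SIZE;
}

/* Symbol translation: only valid for kernel image addresses. */
uint64_t model_pa_symbol(uint64_t va)
{
    assert(is_kimage(va));
    return va - MODEL_KIMAGE_VADDR + MODEL_KIMAGE_PHYS;
}

/* Linear-map translation: warns, then falls back for image addresses. */
uint64_t model_pa(uint64_t va)
{
    if (is_linear(va))
        return va - MODEL_PAGE_OFFSET + MODEL_PHYS_OFFSET;

    model_warnings++;
    return model_pa_symbol(va);
}
```

Both helpers return the same physical address for an image symbol, but only `model_pa()` records a warning, which is why switching helpers silences the splat without changing the result.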
      
      Fixes: 3744b528 ("arm64: kexec: install a copy of the linear-map")
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
      Cc: Will Deacon <will@kernel.org>
      Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
      Link: https://lore.kernel.org/r/20211130121849.3319010-1-mark.rutland@arm.com
      
      
      Signed-off-by: Will Deacon <will@kernel.org>
      2f218324
    • KVM: x86/mmu: Retry page fault if root is invalidated by memslot update · a955cad8
      Sean Christopherson authored
      
      Bail from the page fault handler if the root shadow page was obsoleted by
      a memslot update.  Do the check _after_ acquiring mmu_lock, as the TDP MMU
      doesn't rely on the memslot/MMU generation, and instead relies on the
      root being explicitly marked invalid by kvm_mmu_zap_all_fast(), which takes
      mmu_lock for write.
      
      For the TDP MMU, inserting a SPTE into an obsolete root can leak a SP if
      kvm_tdp_mmu_zap_invalidated_roots() has already zapped the SP, i.e. has
      moved past the gfn associated with the SP.
      
      For other MMUs, the resulting behavior is far more convoluted, though
      unlikely to be truly problematic.  Installing SPs/SPTEs into the obsolete
      root isn't directly problematic, as the obsolete root will be unloaded
      and dropped before the vCPU re-enters the guest.  But because the legacy
      MMU tracks shadow pages by their role, any SP created by the fault can
      be reused in the new post-reload root.  Again, that _shouldn't_ be
      problematic as any leaf child SPTEs will be created for the current/valid
      memslot generation, and kvm_mmu_get_page() will not reuse child SPs from
      the old generation as they will be flagged as obsolete.  But, given that
      continuing with the fault is pointless (the root will be unloaded), apply
      the check to all MMUs.
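
The ordering described above (take the lock first, then check whether the root is still valid) can be sketched with a single-threaded model. All names here (`struct root_model`, `struct mmu_model`, `handle_fault()`) are hypothetical, and the "lock" is just a flag with assertions, not a real mutex.

```c
#include <assert.h>
#include <stdbool.h>

struct root_model {
    bool invalid; /* set by the (modeled) fast zap while holding the lock */
    int sptes;    /* number of entries installed under this root */
};

struct mmu_model {
    bool lock_held; /* stand-in for mmu_lock in this single-threaded model */
    struct root_model *root;
};

enum fault_result { FAULT_FIXED, FAULT_RETRY };

static void model_lock(struct mmu_model *mmu)
{
    assert(!mmu->lock_held);
    mmu->lock_held = true;
}

static void model_unlock(struct mmu_model *mmu)
{
    assert(mmu->lock_held);
    mmu->lock_held = false;
}

enum fault_result handle_fault(struct mmu_model *mmu)
{
    enum fault_result result = FAULT_RETRY;

    model_lock(mmu);
    /* Check only after locking, so an invalidation cannot slip in between. */
    if (!mmu->root->invalid) {
        mmu->root->sptes++; /* install the mapping in a still-valid root */
        result = FAULT_FIXED;
    }
    model_unlock(mmu);

    return result;
}
```

When the root has been invalidated the model installs nothing and asks the caller to retry, so nothing is left behind in the obsolete root.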
      
      Fixes: b7cccd39 ("KVM: x86/mmu: Fast invalidation for TDP MMU")
      Cc: stable@vger.kernel.org
      Cc: Ben Gardon <bgardon@google.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20211120045046.3940942-5-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a955cad8
    • KVM: VMX: Set failure code in prepare_vmcs02() · bfbb307c
      Dan Carpenter authored
      
      The error paths in the prepare_vmcs02() function are supposed to set
      *entry_failure_code but this path does not.  It leads to using an
      uninitialized variable in the caller.
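
The pattern at issue, an out-parameter that every error path must fill in, can be sketched as follows. `prepare_state()` and the `MODEL_FAIL_*` codes are hypothetical and exist only to illustrate the contract; they are not the kernel's function or constants.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical failure codes reported back to the caller. */
enum model_failure {
    MODEL_FAIL_NONE = 0,
    MODEL_FAIL_CONTROL = 1,
    MODEL_FAIL_MSR = 2,
};

/*
 * Contract: whenever this returns nonzero, *failure_code has been written.
 * A caller may therefore read it after an error without initializing it.
 */
int prepare_state(bool controls_ok, bool msr_ok, enum model_failure *failure_code)
{
    if (!controls_ok) {
        *failure_code = MODEL_FAIL_CONTROL;
        return -EINVAL;
    }

    if (!msr_ok) {
        /* Omitting this assignment is the kind of bug described above. */
        *failure_code = MODEL_FAIL_MSR;
        return -EINVAL;
    }

    return 0;
}
```

If the second assignment were dropped, a caller that passed an uninitialized variable would read garbage after the error return.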
      
      Fixes: 71f73470 ("KVM: nVMX: Load GUEST_IA32_PERF_GLOBAL_CTRL MSR on VM-Entry")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Message-Id: <20211130125337.GB24578@kili>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      bfbb307c
    • KVM: ensure APICv is considered inactive if there is no APIC · ef8b4b72
      Paolo Bonzini authored
      
      kvm_vcpu_apicv_active() returns false if a virtual machine has no in-kernel
      local APIC; however, kvm_apicv_activated() might still be true if there are
      no reasons to disable APICv; in fact it is quite likely that there is none
      because APICv is inhibited by specific configurations of the local APIC
      and those configurations cannot be programmed.  This triggers a WARN:
      
         WARN_ON_ONCE(kvm_apicv_activated(vcpu->kvm) != kvm_vcpu_apicv_active(vcpu));
      
      To avoid this, introduce another cause for APICv inhibition, namely the
      absence of an in-kernel local APIC.  This cause is enabled by default,
      and is dropped by either KVM_CREATE_IRQCHIP or the enabling of
      KVM_CAP_IRQCHIP_SPLIT.
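
A small model of the inhibit-reason approach, assuming hypothetical names throughout (`struct vm_model`, `MODEL_INHIBIT_ABSENT`, `vm_create_irqchip()`); it shows why starting with the "absent" reason set keeps the VM-wide and per-vCPU views in agreement.

```c
#include <stdbool.h>

#define MODEL_INHIBIT_ABSENT (1u << 0) /* no in-kernel local APIC yet */
#define MODEL_INHIBIT_OTHER  (1u << 1) /* any other, unrelated reason */

struct vm_model {
    unsigned int apicv_inhibit_reasons; /* APICv is usable only when zero */
    bool has_in_kernel_lapic;
};

/* New VMs start inhibited because no in-kernel APIC exists yet. */
void vm_init(struct vm_model *vm)
{
    vm->apicv_inhibit_reasons = MODEL_INHIBIT_ABSENT;
    vm->has_in_kernel_lapic = false;
}

/* Creating the in-kernel irqchip drops the "absent" reason. */
void vm_create_irqchip(struct vm_model *vm)
{
    vm->has_in_kernel_lapic = true;
    vm->apicv_inhibit_reasons &= ~MODEL_INHIBIT_ABSENT;
}

/* VM-wide view: no inhibit reasons remain. */
bool vm_apicv_activated(const struct vm_model *vm)
{
    return vm->apicv_inhibit_reasons == 0;
}

/* Per-vCPU view: needs a local APIC and an uninhibited VM. */
bool vcpu_apicv_active(const struct vm_model *vm)
{
    return vm->has_in_kernel_lapic && vm_apicv_activated(vm);
}
```

With the default reason in place, the two predicates agree both before and after the irqchip is created, which is the invariant the WARN checks.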
      
      Reported-by: Ignat Korchagin <ignat@cloudflare.com>
      Fixes: ee49a893 ("KVM: x86: Move SVM's APICv sanity check to common x86", 2021-10-22)
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Tested-by: Ignat Korchagin <ignat@cloudflare.com>
      Message-Id: <20211130123746.293379-1-pbonzini@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      ef8b4b72
    • KVM: x86/pmu: Fix reserved bits for AMD PerfEvtSeln register · cb1d220d
      Like Xu authored
      
      If we run the following perf command in an AMD Milan guest:
      
        perf stat \
        -e cpu/event=0x1d0/ \
        -e cpu/event=0x1c7/ \
        -e cpu/umask=0x1f,event=0x18e/ \
        -e cpu/umask=0x7,event=0x18e/ \
        -e cpu/umask=0x18,event=0x18e/ \
        ./workload
      
      dmesg will report a #GP warning from an unchecked MSR access
      error on MSR_F15H_PERF_CTLx.
      
      This is because, according to APM (Revision: 4.03) Figure 13-7,
      bits [35:32] of the AMD PerfEvtSeln register are part of the
      event select encoding, which extends the EVENT_SELECT field
      from 8 bits to 12 bits.
      
      Opportunistically update pmu->reserved_bits for reserved bit 19.
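
The encoding can be worked through in a short sketch. The helper names and both masks are assumptions for illustration: `MODEL_RESERVED_OLD` treats all of bits 63:32 (plus bit 21) as reserved, while `MODEL_RESERVED_NEW` frees bits 35:32 for the event select and additionally reserves bit 19. The commit text above does not state the exact mask values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative masks only; see the lead-in for what is assumed. */
#define MODEL_RESERVED_OLD 0xffffffff00200000ULL /* bits 63:32 and 21 */
#define MODEL_RESERVED_NEW 0xfffffff000280000ULL /* bits 63:36, 21 and 19 */

/*
 * Build a register value from a 12-bit event select and an 8-bit unit mask:
 * event[7:0] goes to bits 7:0, umask to bits 15:8, event[11:8] to bits 35:32.
 */
uint64_t perf_evtsel_encode(uint16_t event, uint8_t umask)
{
    uint64_t event_lo = event & 0xffu;
    uint64_t event_hi = (event >> 8) & 0xfu;

    return event_lo | ((uint64_t)umask << 8) | (event_hi << 32);
}

/* A write is acceptable only if it touches no reserved bit. */
bool perf_evtsel_write_allowed(uint64_t value, uint64_t reserved_bits)
{
    return (value & reserved_bits) == 0;
}
```

For event 0x1d0 the high nibble lands in bit 32, so a mask that reserves bits 35:32 rejects the write while the narrower mask accepts it.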
      
      Reported-by: Jim Mattson <jmattson@google.com>
      Fixes: ca724305 ("KVM: x86/vPMU: Implement AMD vPMU code for KVM")
      Signed-off-by: Like Xu <likexu@tencent.com>
      Message-Id: <20211118130320.95997-1-likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      cb1d220d
  5. Dec 01, 2021
  6. Nov 30, 2021