Skip to content
Snippets Groups Projects
  1. Oct 23, 2020
  2. Oct 21, 2020
  3. Sep 28, 2020
  4. Sep 11, 2020
  5. Aug 21, 2020
    • Will Deacon's avatar
      KVM: Pass MMU notifier range flags to kvm_unmap_hva_range() · fdfe7cbd
      Will Deacon authored
      
      The 'flags' field of 'struct mmu_notifier_range' is used to indicate
      whether invalidate_range_{start,end}() are permitted to block. In the
      case of kvm_mmu_notifier_invalidate_range_start(), this field is not
      forwarded on to the architecture-specific implementation of
      kvm_unmap_hva_range() and therefore the backend cannot sensibly decide
      whether or not to block.
      
      Add an extra 'flags' parameter to kvm_unmap_hva_range() so that
      architectures are aware as to whether or not they are permitted to block.
      
      Cc: <stable@vger.kernel.org>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Signed-off-by: default avatarWill Deacon <will@kernel.org>
      Message-Id: <20200811102725.7121-2-will@kernel.org>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      fdfe7cbd
  6. Aug 12, 2020
  7. Aug 05, 2020
  8. Jul 29, 2020
  9. Jul 24, 2020
    • Thomas Gleixner's avatar
      entry: Provide infrastructure for work before transitioning to guest mode · 935ace2f
      Thomas Gleixner authored
      
      Entering a guest is similar to exiting to user space. Pending work like
      handling signals, rescheduling, task work etc. needs to be handled before
      that.
      
      Provide generic infrastructure to avoid duplication of the same handling
      code all over the place.
      
      The transfer to guest mode handling is different from the exit to usermode
      handling, e.g. vs. rseq and live patching, so a separate function is used.
      
      The initial list of work items handled is:
      
          TIF_SIGPENDING, TIF_NEED_RESCHED, TIF_NOTIFY_RESUME
      
      Architecture specific TIF flags can be added via defines in the
      architecture specific include files.
      
      The calling convention is also different from the syscall/interrupt entry
      functions as KVM invokes this from the outer vcpu_run() loop with
      interrupts and preemption enabled. To prevent missing a pending work item
      it invokes a check for pending TIF work from interrupt disabled code right
      before transitioning to guest mode. The lockdep, RCU and tracing state
      handling is also done directly around the switch to and from guest mode.
      
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/20200722220519.833296398@linutronix.de
      935ace2f
  10. Jul 09, 2020
  11. Jul 08, 2020
  12. Jul 02, 2020
  13. Jun 11, 2020
  14. Jun 09, 2020
    • Michel Lespinasse's avatar
      mmap locking API: use coccinelle to convert mmap_sem rwsem call sites · d8ed45c5
      Michel Lespinasse authored
      
      This change converts the existing mmap_sem rwsem calls to use the new mmap
      locking API instead.
      
      The change is generated using coccinelle with the following rule:
      
      // spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir .
      
      @@
      expression mm;
      @@
      (
      -init_rwsem
      +mmap_init_lock
      |
      -down_write
      +mmap_write_lock
      |
      -down_write_killable
      +mmap_write_lock_killable
      |
      -down_write_trylock
      +mmap_write_trylock
      |
      -up_write
      +mmap_write_unlock
      |
      -downgrade_write
      +mmap_write_downgrade
      |
      -down_read
      +mmap_read_lock
      |
      -down_read_killable
      +mmap_read_lock_killable
      |
      -down_read_trylock
      +mmap_read_trylock
      |
      -up_read
      +mmap_read_unlock
      )
      -(&mm->mmap_sem)
      +(mm)
      
      Signed-off-by: default avatarMichel Lespinasse <walken@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Reviewed-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Reviewed-by: default avatarLaurent Dufour <ldufour@linux.ibm.com>
      Reviewed-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Liam Howlett <Liam.Howlett@oracle.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ying Han <yinghan@google.com>
      Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com
      
      
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d8ed45c5
    • Mike Rapoport's avatar
      mm: don't include asm/pgtable.h if linux/mm.h is already included · e31cf2f4
      Mike Rapoport authored
      
      Patch series "mm: consolidate definitions of page table accessors", v2.
      
      The low level page table accessors (pXY_index(), pXY_offset()) are
      duplicated across all architectures and sometimes more than once.  For
      instance, we have 31 definition of pgd_offset() for 25 supported
      architectures.
      
      Most of these definitions are actually identical and typically it boils
      down to, e.g.
      
      static inline unsigned long pmd_index(unsigned long address)
      {
              return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
      }
      
      static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
      {
              return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address);
      }
      
      These definitions can be shared among 90% of the arches provided
      XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined.
      
      For architectures that really need a custom version there is always
      possibility to override the generic version with the usual ifdefs magic.
      
      These patches introduce include/linux/pgtable.h that replaces
      include/asm-generic/pgtable.h and add the definitions of the page table
      accessors to the new header.
      
      This patch (of 12):
      
      The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the
      functions involving page table manipulations, e.g.  pte_alloc() and
      pmd_alloc().  So, there is no point to explicitly include <asm/pgtable.h>
      in the files that include <linux/mm.h>.
      
      The include statements in such cases are remove with a simple loop:
      
      	for f in $(git grep -l "include <linux/mm.h>") ; do
      		sed -i -e '/include <asm\/pgtable.h>/ d' $f
      	done
      
      Signed-off-by: default avatarMike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Cain <bcain@codeaurora.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Ley Foon Tan <ley.foon.tan@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Rapoport <rppt@kernel.org>
      Cc: Nick Hu <nickhu@andestech.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vincent Chen <deanbo422@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: http://lkml.kernel.org/r/20200514170327.31389-1-rppt@kernel.org
      Link: http://lkml.kernel.org/r/20200514170327.31389-2-rppt@kernel.org
      
      
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e31cf2f4
  15. Jun 08, 2020
    • Souptick Joarder's avatar
      mm/gup.c: convert to use get_user_{page|pages}_fast_only() · dadbb612
      Souptick Joarder authored
      
      API __get_user_pages_fast() renamed to get_user_pages_fast_only() to
      align with pin_user_pages_fast_only().
      
      As part of this we will get rid of write parameter.  Instead caller will
      pass FOLL_WRITE to get_user_pages_fast_only().  This will not change any
      existing functionality of the API.
      
      All the callers are changed to pass FOLL_WRITE.
      
      Also introduce get_user_page_fast_only(), and use it in a few places
      that hard-code nr_pages to 1.
      
      Updated the documentation of the API.
      
      Signed-off-by: default avatarSouptick Joarder <jrdr.linux@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Reviewed-by: default avatarJohn Hubbard <jhubbard@nvidia.com>
      Reviewed-by: Paul Mackerras <paulus@ozlabs.org>		[arch/powerpc/kvm]
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Michal Suchanek <msuchanek@suse.de>
      Link: http://lkml.kernel.org/r/1590396812-31277-1-git-send-email-jrdr.linux@gmail.com
      
      
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      dadbb612
    • Eiichi Tsukata's avatar
      KVM: x86: Fix APIC page invalidation race · e649b3f0
      Eiichi Tsukata authored
      Commit b1394e74 ("KVM: x86: fix APIC page invalidation") tried
      to fix inappropriate APIC page invalidation by re-introducing arch
      specific kvm_arch_mmu_notifier_invalidate_range() and calling it from
      kvm_mmu_notifier_invalidate_range_start. However, the patch left a
      possible race where the VMCS APIC address cache is updated *before*
      it is unmapped:
      
        (Invalidator) kvm_mmu_notifier_invalidate_range_start()
        (Invalidator) kvm_make_all_cpus_request(kvm, KVM_REQ_APIC_PAGE_RELOAD)
        (KVM VCPU) vcpu_enter_guest()
        (KVM VCPU) kvm_vcpu_reload_apic_access_page()
        (Invalidator) actually unmap page
      
      Because of the above race, there can be a mismatch between the
      host physical address stored in the APIC_ACCESS_PAGE VMCS field and
      the host physical address stored in the EPT entry for the APIC GPA
      (0xfee0000).  When this happens, the processor will not trap APIC
      accesses, and will instead show the raw contents of the APIC-access page.
      Because Windows OS periodically checks for unexpected modifications to
      the LAPIC register, this will show up as a BSOD crash with BugCheck
      CRITICAL_STRUCTURE_CORRUPTION (109) we are currently seeing in
      https://bugzilla.redhat.com/show_bug.cgi?id=1751017.
      
      The root cause of the issue is that kvm_arch_mmu_notifier_invalidate_range()
      cannot guarantee that no additional references are taken to the pages in
      the range before kvm_mmu_notifier_invalidate_range_end().  Fortunately,
      this case is supported by the MMU notifier API, as documented in
      include/linux/mmu_notifier.h:
      
      	 * If the subsystem
               * can't guarantee that no additional references are taken to
               * the pages in the range, it has to implement the
               * invalidate_range() notifier to remove any references taken
               * after invalidate_range_start().
      
      The fix therefore is to reload the APIC-access page field in the VMCS
      from kvm_mmu_notifier_invalidate_range() instead of ..._range_start().
      
      Cc: stable@vger.kernel.org
      Fixes: b1394e74 ("KVM: x86: fix APIC page invalidation")
      Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=197951
      
      
      Signed-off-by: default avatarEiichi Tsukata <eiichi.tsukata@nutanix.com>
      Message-Id: <20200606042627.61070-1-eiichi.tsukata@nutanix.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      e649b3f0
  16. Jun 04, 2020
  17. Jun 01, 2020
    • Paolo Bonzini's avatar
      KVM: check userspace_addr for all memslots · 09d952c9
      Paolo Bonzini authored
      
      The userspace_addr alignment and range checks are not performed for private
      memory slots that are prepared by KVM itself.  This is unnecessary and makes
      it questionable to use __*_user functions to access memory later on.  We also
      rely on the userspace address being aligned since we have an entire family
      of functions to map gfn to pfn.
      
      Fortunately skipping the check is completely unnecessary.  Only x86 uses
      private memslots and their userspace_addr is obtained from vm_mmap,
      therefore it must be below PAGE_OFFSET.  In fact, any attempt to pass
      an address above PAGE_OFFSET would have failed because such an address
      would return true for kvm_is_error_hva.
      
      Reported-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      09d952c9
    • Vitaly Kuznetsov's avatar
      KVM: x86: acknowledgment mechanism for async pf page ready notifications · 557a961a
      Vitaly Kuznetsov authored
      
      If two page ready notifications happen back to back the second one is not
      delivered and the only mechanism we currently have is
      kvm_check_async_pf_completion() check in vcpu_run() loop. The check will
      only be performed with the next vmexit when it happens and in some cases
      it may take a while. With interrupt based page ready notification delivery
      the situation is even worse: unlike exceptions, interrupts are not handled
      immediately so we must check if the slot is empty. This is slow and
      unnecessary. Introduce dedicated MSR_KVM_ASYNC_PF_ACK MSR to communicate
      the fact that the slot is free and host should check its notification
      queue. Mandate using it for interrupt based 'page ready' APF event
      delivery.
      
      As kvm_check_async_pf_completion() is going away from vcpu_run() we need
      a way to communicate the fact that vcpu->async_pf.done queue has
      transitioned from empty to non-empty state. Introduce
      kvm_arch_async_page_present_queued() and KVM_REQ_APF_READY to do the job.
      
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20200525144125.143875-7-vkuznets@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      557a961a
    • Vitaly Kuznetsov's avatar
      KVM: introduce kvm_read_guest_offset_cached() · 0958f0ce
      Vitaly Kuznetsov authored
      
      We already have kvm_write_guest_offset_cached(), introduce read analogue.
      
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20200525144125.143875-5-vkuznets@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      0958f0ce
    • Vitaly Kuznetsov's avatar
      KVM: rename kvm_arch_can_inject_async_page_present() to kvm_arch_can_dequeue_async_page_present() · 7c0ade6c
      Vitaly Kuznetsov authored
      
      An innocent reader of the following x86 KVM code:
      
      bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu)
      {
              if (!(vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED))
                      return true;
      ...
      
      may get very confused: if APF mechanism is not enabled, why do we report
      that we 'can inject async page present'? In reality, upon injection
      kvm_arch_async_page_present() will check the same condition again and,
      in case APF is disabled, will just drop the item. This is fine as the
      guest which deliberately disabled APF doesn't expect to get any APF
      notifications.
      
      Rename kvm_arch_can_inject_async_page_present() to
      kvm_arch_can_dequeue_async_page_present() to make it clear what we are
      checking: if the item can be dequeued (meaning either injected or just
      dropped).
      
      On s390 kvm_arch_can_inject_async_page_present() always returns 'true' so
      the rename doesn't matter much.
      
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20200525144125.143875-4-vkuznets@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      7c0ade6c
    • Paolo Bonzini's avatar
      Revert "KVM: No need to retry for hva_to_pfn_remapped()" · a8387d0b
      Paolo Bonzini authored
      
      This reverts commit 5b494aea.
      If unlocked==true then the vma pointer could be invalidated, so the 2nd
      follow_pfn() is potentially racy: we do need to get out and redo
      find_vma_intersection().
      
      Signed-off-by: default avatarPeter Xu <peterx@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      a8387d0b
    • Paolo Bonzini's avatar
      KVM: check userspace_addr for all memslots · 45f08f4c
      Paolo Bonzini authored
      
      The userspace_addr alignment and range checks are not performed for private
      memory slots that are prepared by KVM itself.  This is unnecessary and makes
      it questionable to use __*_user functions to access memory later on.  We also
      rely on the userspace address being aligned since we have an entire family
      of functions to map gfn to pfn.
      
      Fortunately skipping the check is completely unnecessary.  Only x86 uses
      private memslots and their userspace_addr is obtained from vm_mmap,
      therefore it must be below PAGE_OFFSET.  In fact, any attempt to pass
      an address above PAGE_OFFSET would have failed because such an address
      would return true for kvm_is_error_hva.
      
      Reported-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      45f08f4c
  18. May 28, 2020
  19. May 16, 2020
  20. May 15, 2020
    • David Matlack's avatar
      kvm: add halt-polling cpu usage stats · cb953129
      David Matlack authored
      
      Two new stats for exposing halt-polling cpu usage:
      halt_poll_success_ns
      halt_poll_fail_ns
      
      Thus sum of these 2 stats is the total cpu time spent polling. "success"
      means the VCPU polled until a virtual interrupt was delivered. "fail"
      means the VCPU had to schedule out (either because the maximum poll time
      was reached or it needed to yield the CPU).
      
      To avoid touching every arch's kvm_vcpu_stat struct, only update and
      export halt-polling cpu usage stats if we're on x86.
      
      Exporting cpu usage as a u64 and in nanoseconds means we will overflow at
      ~500 years, which seems reasonably large.
      
      Signed-off-by: default avatarDavid Matlack <dmatlack@google.com>
      Signed-off-by: default avatarJon Cargille <jcargill@google.com>
      Reviewed-by: default avatarJim Mattson <jmattson@google.com>
      
      Message-Id: <20200508182240.68440-1-jcargill@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      cb953129
    • Wanpeng Li's avatar
      KVM: VMX: Optimize posted-interrupt delivery for timer fastpath · 379a3c8e
      Wanpeng Li authored
      
      While optimizing posted-interrupt delivery especially for the timer
      fastpath scenario, I measured kvm_x86_ops.deliver_posted_interrupt()
      to introduce substantial latency because the processor has to perform
      all vmentry tasks, ack the posted interrupt notification vector,
      read the posted-interrupt descriptor etc.
      
      This is not only slow, it is also unnecessary when delivering an
      interrupt to the current CPU (as is the case for the LAPIC timer) because
      PIR->IRR and IRR->RVI synchronization is already performed on vmentry
      Therefore skip kvm_vcpu_trigger_posted_interrupt in this case, and
      instead do vmx_sync_pir_to_irr() on the EXIT_FASTPATH_REENTER_GUEST
      fastpath as well.
      
      Tested-by: default avatarHaiwei Li <lihaiwei@tencent.com>
      Cc: Haiwei Li <lihaiwei@tencent.com>
      Suggested-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarWanpeng Li <wanpengli@tencent.com>
      Message-Id: <1588055009-12677-6-git-send-email-wanpengli@tencent.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      379a3c8e
    • Peter Xu's avatar
      KVM: No need to retry for hva_to_pfn_remapped() · 5b494aea
      Peter Xu authored
      
      hva_to_pfn_remapped() calls fixup_user_fault(), which has already
      handled the retry gracefully.  Even if "unlocked" is set to true, it
      means that we've got a VM_FAULT_RETRY inside fixup_user_fault(),
      however the page fault has already retried and we should have the pfn
      set correctly.  No need to do that again.
      
      Signed-off-by: default avatarPeter Xu <peterx@redhat.com>
      Message-Id: <20200416155906.267462-1-peterx@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      5b494aea
  21. May 13, 2020
    • Jason Yan's avatar
      kvm/eventfd: remove unneeded conversion to bool · c4e115f0
      Jason Yan authored
      
      The '==' expression itself is bool, no need to convert it to bool again.
      This fixes the following coccicheck warning:
      
      virt/kvm/eventfd.c:724:38-43: WARNING: conversion to bool not needed
      here
      
      Signed-off-by: default avatarJason Yan <yanaijie@huawei.com>
      Message-Id: <20200420123805.4494-1-yanaijie@huawei.com>
      Reviewed-by: default avatarPeter Xu <peterx@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      c4e115f0
    • Davidlohr Bueso's avatar
      kvm: Replace vcpu->swait with rcuwait · da4ad88c
      Davidlohr Bueso authored
      
      The use of any sort of waitqueue (simple or regular) for
      wait/waking vcpus has always been an overkill and semantically
      wrong. Because this is per-vcpu (which is blocked) there is
      only ever a single waiting vcpu, thus no need for any sort of
      queue.
      
      As such, make use of the rcuwait primitive, with the following
      considerations:
      
        - rcuwait already provides the proper barriers that serialize
        concurrent waiter and waker.
      
        - Task wakeup is done in rcu read critical region, with a
        stable task pointer.
      
        - Because there is no concurrency among waiters, we need
        not worry about rcuwait_wait_event() calls corrupting
        the wait->task. As a consequence, this saves the locking
        done in swait when modifying the queue. This also applies
        to per-vcore wait for powerpc kvm-hv.
      
      The x86 tscdeadline_latency test mentioned in 8577370f
      ("KVM: Use simple waitqueue for vcpu->wq") shows that, on avg,
      latency is reduced by around 15-20% with this change.
      
      Cc: Paul Mackerras <paulus@ozlabs.org>
      Cc: kvmarm@lists.cs.columbia.edu
      Cc: linux-mips@vger.kernel.org
      Reviewed-by: default avatarMarc Zyngier <maz@kernel.org>
      Signed-off-by: default avatarDavidlohr Bueso <dbueso@suse.de>
      Message-Id: <20200424054837.5138-6-dave@stgolabs.net>
      [Avoid extra logic changes. - Paolo]
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      da4ad88c
  22. May 08, 2020
  23. May 01, 2020
    • Marc Zyngier's avatar
      KVM: arm64: Fix 32bit PC wrap-around · 0225fd5e
      Marc Zyngier authored
      
      In the unlikely event that a 32bit vcpu traps into the hypervisor
      on an instruction that is located right at the end of the 32bit
      range, the emulation of that instruction is going to increment
      PC past the 32bit range. This isn't great, as userspace can then
      observe this value and get a bit confused.
      
      Conversly, userspace can do things like (in the context of a 64bit
      guest that is capable of 32bit EL0) setting PSTATE to AArch64-EL0,
      set PC to a 64bit value, change PSTATE to AArch32-USR, and observe
      that PC hasn't been truncated. More confusion.
      
      Fix both by:
      - truncating PC increments for 32bit guests
      - sanitizing all 32bit regs every time a core reg is changed by
        userspace, and that PSTATE indicates a 32bit mode.
      
      Cc: stable@vger.kernel.org
      Acked-by: default avatarWill Deacon <will@kernel.org>
      Signed-off-by: default avatarMarc Zyngier <maz@kernel.org>
      0225fd5e
  24. Apr 30, 2020
  25. Apr 24, 2020
Loading