1. 19 Mar, 2018 11 commits
  2. 26 Feb, 2018 1 commit
  3. 11 Feb, 2018 1 commit
    • Linus Torvalds's avatar
      vfs: do bulk POLL* -> EPOLL* replacement · a9a08845
      Linus Torvalds authored
      This is the mindless scripted replacement of kernel use of POLL*
      variables as described by Al, done by this script:
      
          for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
              L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
              for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
          done
      
      with de-mangling cleanups yet to come.
      
      NOTE! On almost all architectures, the EPOLL* constants have the same
      values as the POLL* constants do.  But they keyword here is "almost".
      For various bad reasons they aren't the same, and epoll() doesn't
      actually work quite correctly in some cases due to this on Sparc et al.
      
      The next patch from Al will sort out the final differences, and we
      should be all done.
      Scripted-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a9a08845
  4. 06 Feb, 2018 7 commits
  5. 01 Feb, 2018 1 commit
  6. 31 Jan, 2018 5 commits
    • Masatake YAMATO's avatar
      kvm: embed vcpu id to dentry of vcpu anon inode · e46b4692
      Masatake YAMATO authored
      All d-entries for vcpu have the same, "anon_inode:kvm-vcpu". That means
      it is impossible to know the mapping between fds for vcpu and vcpu
      from userland.
      
          # LC_ALL=C ls -l /proc/617/fd | grep vcpu
          lrwx------. 1 qemu qemu 64 Jan  7 16:50 18 -> anon_inode:kvm-vcpu
          lrwx------. 1 qemu qemu 64 Jan  7 16:50 19 -> anon_inode:kvm-vcpu
      
      It is also impossible to know the mapping between vma for kvm_run
      structure and vcpu from userland.
      
          # LC_ALL=C grep vcpu /proc/617/maps
          7f9d842d0000-7f9d842d3000 rw-s 00000000 00:0d 20393                      anon_inode:kvm-vcpu
          7f9d842d3000-7f9d842d6000 rw-s 00000000 00:0d 20393                      anon_inode:kvm-vcpu
      
      This change adds vcpu id to d-entries for vcpu. With this change
      you can get the following output:
      
          # LC_ALL=C ls -l /proc/617/fd | grep vcpu
          lrwx------. 1 qemu qemu 64 Jan  7 16:50 18 -> anon_inode:kvm-vcpu:0
          lrwx------. 1 qemu qemu 64 Jan  7 16:50 19 -> anon_inode:kvm-vcpu:1
      
          # LC_ALL=C grep vcpu /proc/617/maps
          7f9d842d0000-7f9d842d3000 rw-s 00000000 00:0d 20393                      anon_inode:kvm-vcpu:0
          7f9d842d3000-7f9d842d6000 rw-s 00000000 00:0d 20393                      anon_inode:kvm-vcpu:1
      
      With the mappings known from the output, a tool like strace can report more details
      of qemu-kvm process activities. Here is the strace output of my local prototype:
      
          # ./strace -KK -f -p 617 2>&1 | grep 'KVM_RUN\| K'
          ...
          [pid   664] ioctl(18, KVM_RUN, 0)       = 0 (KVM_EXIT_MMIO)
           K ready_for_interrupt_injection=1, if_flag=0, flags=0, cr8=0000000000000000, apic_base=0x000000fee00d00
           K phys_addr=0, len=1634035803, [33, 0, 0, 0, 0, 0, 0, 0], is_write=112
          [pid   664] ioctl(18, KVM_RUN, 0)       = 0 (KVM_EXIT_MMIO)
           K ready_for_interrupt_injection=1, if_flag=1, flags=0, cr8=0000000000000000, apic_base=0x000000fee00d00
           K phys_addr=0, len=1634035803, [33, 0, 0, 0, 0, 0, 0, 0], is_write=112
          ...
      Signed-off-by: default avatarMasatake YAMATO <yamato@redhat.com>
      Acked-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      e46b4692
    • KarimAllah Ahmed's avatar
      kvm: Map PFN-type memory regions as writable (if possible) · a340b3e2
      KarimAllah Ahmed authored
      For EPT-violations that are triggered by a read, the pages are also mapped with
      write permissions (if their memory region is also writable). That would avoid
      getting yet another fault on the same page when a write occurs.
      
      This optimization only happens when you have a "struct page" backing the memory
      region. So also enable it for memory regions that do not have a "struct page".
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: kvm@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarKarimAllah Ahmed <karahmed@amazon.de>
      Reviewed-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      a340b3e2
    • Christoffer Dall's avatar
      KVM: arm/arm64: Fixup userspace irqchip static key optimization · cd15d205
      Christoffer Dall authored
      When I introduced a static key to avoid work in the critical path for
      userspace irqchips which is very rarely used, I accidentally messed up
      my logic and used && where I should have used ||, because the point was
      to short-circuit the evaluation in case userspace irqchips weren't even
      in use.
      
      This fixes an issue when running in-kernel irqchip VMs alongside
      userspace irqchip VMs.
      Acked-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Fixes: c44c232ee2d3 ("KVM: arm/arm64: Avoid work when userspace iqchips are not used")
      Signed-off-by: default avatarChristoffer Dall <christoffer.dall@linaro.org>
      cd15d205
    • Christoffer Dall's avatar
      KVM: arm/arm64: Fix userspace_irqchip_in_use counting · f1d7231c
      Christoffer Dall authored
      We were not decrementing the static key count in the right location.
      kvm_arch_vcpu_destroy() is only called to clean up after a failed
      VCPU create attempt, whereas kvm_arch_vcpu_free() is called on teardown
      of the VM as well.  Move the static key decrement call to
      kvm_arch_vcpu_free().
      Acked-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarChristoffer Dall <christoffer.dall@linaro.org>
      f1d7231c
    • Christoffer Dall's avatar
      KVM: arm/arm64: Fix incorrect timer_is_pending logic · 13e59ece
      Christoffer Dall authored
      After the recently introduced support for level-triggered mapped
      interrupt, I accidentally left the VCPU thread busily going back and
      forward between the guest and the hypervisor whenever the guest was
      blocking, because I would always incorrectly report that a timer
      interrupt was pending.
      
      This is because the timer->irq.level field is not valid for mapped
      interrupts, where we offload the level state to the hardware, and as a
      result this field is always true.
      
      Luckily the problem can be relatively easily solved by not checking the
      cached signal state of either timer in kvm_timer_should_fire() but
      instead compute the timer state on the fly, which we do already if the
      cached signal state wasn't high.  In fact, the only reason for checking
      the cached signal state was a tiny optimization which would only be
      potentially faster when the polling loop detects a pending timer
      interrupt, which is quite unlikely.
      
      Instead of duplicating the logic from kvm_arch_timer_handler(), we
      enlighten kvm_timer_should_fire() to report something valid when the
      timer state is loaded onto the hardware.  We can then call this from
      kvm_arch_timer_handler() as well and avoid the call to
      __timer_snapshot_state() in kvm_arch_timer_get_input_level().
      Reported-by: default avatarTomasz Nowicki <tn@semihalf.com>
      Tested-by: default avatarTomasz Nowicki <tn@semihalf.com>
      Reviewed-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarChristoffer Dall <christoffer.dall@linaro.org>
      13e59ece
  7. 23 Jan, 2018 1 commit
    • James Morse's avatar
      KVM: arm/arm64: Handle CPU_PM_ENTER_FAILED · 58d6b15e
      James Morse authored
      cpu_pm_enter() calls the pm notifier chain with CPU_PM_ENTER, then if
      there is a failure: CPU_PM_ENTER_FAILED.
      
      When KVM receives CPU_PM_ENTER it calls cpu_hyp_reset() which will
      return us to the hyp-stub. If we subsequently get a CPU_PM_ENTER_FAILED,
      KVM does nothing, leaving the CPU running with the hyp-stub, at odds
      with kvm_arm_hardware_enabled.
      
      Add CPU_PM_ENTER_FAILED as a fallthrough for CPU_PM_EXIT, this reloads
      KVM based on kvm_arm_hardware_enabled. This is safe even if CPU_PM_ENTER
      never gets as far as KVM, as cpu_hyp_reinit() calls cpu_hyp_reset()
      to make sure the hyp-stub is loaded before reloading KVM.
      
      Fixes: 67f69197 ("arm64: kvm: allows kvm cpu hotplug")
      Cc: <stable@vger.kernel.org> # v4.7+
      CC: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: default avatarChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: default avatarJames Morse <james.morse@arm.com>
      Signed-off-by: default avatarChristoffer Dall <christoffer.dall@linaro.org>
      58d6b15e
  8. 16 Jan, 2018 2 commits
    • James Morse's avatar
      KVM: arm64: Handle RAS SErrors from EL1 on guest exit · 3368bd80
      James Morse authored
      We expect to have firmware-first handling of RAS SErrors, with errors
      notified via an APEI method. For systems without firmware-first, add
      some minimal handling to KVM.
      
      There are two ways KVM can take an SError due to a guest, either may be a
      RAS error: we exit the guest due to an SError routed to EL2 by HCR_EL2.AMO,
      or we take an SError from EL2 when we unmask PSTATE.A from __guest_exit.
      
      For SError that interrupt a guest and are routed to EL2 the existing
      behaviour is to inject an impdef SError into the guest.
      
      Add code to handle RAS SError based on the ESR. For uncontained and
      uncategorized errors arm64_is_fatal_ras_serror() will panic(), these
      errors compromise the host too. All other error types are contained:
      For the fatal errors the vCPU can't make progress, so we inject a virtual
      SError. We ignore contained errors where we can make progress as if
      we're lucky, we may not hit them again.
      
      If only some of the CPUs support RAS the guest will see the cpufeature
      sanitised version of the id registers, but we may still take RAS SError
      on this CPU. Move the SError handling out of handle_exit() into a new
      handler that runs before we can be preempted. This allows us to use
      this_cpu_has_cap(), via arm64_is_ras_serror().
      Acked-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarJames Morse <james.morse@arm.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      3368bd80
    • James Morse's avatar
      KVM: arm/arm64: mask/unmask daif around VHE guests · 4f5abad9
      James Morse authored
      Non-VHE systems take an exception to EL2 in order to world-switch into the
      guest. When returning from the guest KVM implicitly restores the DAIF
      flags when it returns to the kernel at EL1.
      
      With VHE none of this exception-level jumping happens, so KVMs
      world-switch code is exposed to the host kernel's DAIF values, and KVM
      spills the guest-exit DAIF values back into the host kernel.
      On entry to a guest we have Debug and SError exceptions unmasked, KVM
      has switched VBAR but isn't prepared to handle these. On guest exit
      Debug exceptions are left disabled once we return to the host and will
      stay this way until we enter user space.
      
      Add a helper to mask/unmask DAIF around VHE guests. The unmask can only
      happen after the hosts VBAR value has been synchronised by the isb in
      __vhe_hyp_call (via kvm_call_hyp()). Masking could be as late as
      setting KVMs VBAR value, but is kept here for symmetry.
      Acked-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarJames Morse <james.morse@arm.com>
      Reviewed-by: default avatarChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      4f5abad9
  9. 15 Jan, 2018 2 commits
  10. 13 Jan, 2018 1 commit
  11. 12 Jan, 2018 1 commit
  12. 11 Jan, 2018 1 commit
  13. 08 Jan, 2018 6 commits