1. 15 May, 2018 4 commits
  2. 04 May, 2018 1 commit
  3. 27 Apr, 2018 1 commit
    • Marc Zyngier's avatar
      KVM: arm/arm64: vgic: Fix source vcpu issues for GICv2 SGI · 53692908
      Marc Zyngier authored
      Now that we make sure we don't inject multiple instances of the
      same GICv2 SGI at the same time, we've made another bug more
      If we exit with an active SGI, we completely lose track of which
      vcpu it came from. On the next entry, we restore it with 0 as a
      source, and if that wasn't the right one, too bad. While this
      doesn't seem to trouble GIC-400, the architectural model gets
      offended and doesn't deactivate the interrupt on EOI.
      Another connected issue is that we will happilly make pending
      an interrupt from another vcpu, overriding the above zero with
      something that is just as inconsistent. Don't do that.
      The final issue is that we signal a maintenance interrupt when
      no pending interrupts are present in the LR. Assuming we've fixed
      the two issues above, we end-up in a situation where we keep
      exiting as soon as we've reached the active state, and not be
      able to inject the following pending.
      The fix comes in 3 parts:
      - GICv2 SGIs have their source vcpu saved if they are active on
        exit, and restored on entry
      - Multi-SGIs cannot go via the Pending+Active state, as this would
        corrupt the source field
      - Multi-SGIs are converted to using MI on EOI instead of NPIE
      Fixes: 16ca6a60 ("KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid")
      Reported-by: default avatarMark Rutland <mark.rutland@arm.com>
      Tested-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reviewed-by: default avatarChristoffer Dall <christoffer.dall@arm.com>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
  4. 26 Apr, 2018 2 commits
  5. 20 Apr, 2018 1 commit
    • Marc Zyngier's avatar
      arm/arm64: KVM: Add PSCI version selection API · 85bd0ba1
      Marc Zyngier authored
      Although we've implemented PSCI 0.1, 0.2 and 1.0, we expose either 0.1
      or 1.0 to a guest, defaulting to the latest version of the PSCI
      implementation that is compatible with the requested version. This is
      no different from doing a firmware upgrade on KVM.
      But in order to give a chance to hypothetical badly implemented guests
      that would have a fit by discovering something other than PSCI 0.2,
      let's provide a new API that allows userspace to pick one particular
      version of the API.
      This is implemented as a new class of "firmware" registers, where
      we expose the PSCI version. This allows the PSCI version to be
      save/restored as part of a guest migration, and also set to
      any supported version if the guest requires it.
      Cc: stable@vger.kernel.org #4.16
      Reviewed-by: default avatarChristoffer Dall <cdall@kernel.org>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
  6. 17 Apr, 2018 2 commits
    • Andre Przywara's avatar
      KVM: arm/arm64: vgic: Kick new VCPU on interrupt migration · bf9a4137
      Andre Przywara authored
      When vgic_prune_ap_list() finds an interrupt that needs to be migrated
      to a new VCPU, we should notify this VCPU of the pending interrupt,
      since it requires immediate action.
      Kick this VCPU once we have added the new IRQ to the list, but only
      after dropping the locks.
      Reported-by: default avatarStefano Stabellini <sstabellini@kernel.org>
      Reviewed-by: default avatarChristoffer Dall <christoffer.dall@arm.com>
      Signed-off-by: default avatarAndre Przywara <andre.przywara@arm.com>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
    • Marc Zyngier's avatar
      KVM: arm/arm64: Close VMID generation race · f0cf47d9
      Marc Zyngier authored
      Before entering the guest, we check whether our VMID is still
      part of the current generation. In order to avoid taking a lock,
      we start with checking that the generation is still current, and
      only if not current do we take the lock, recheck, and update the
      generation and VMID.
      This leaves open a small race: A vcpu can bump up the global
      generation number as well as the VM's, but has not updated
      the VMID itself yet.
      At that point another vcpu from the same VM comes in, checks
      the generation (and finds it not needing anything), and jumps
      into the guest. At this point, we end-up with two vcpus belonging
      to the same VM running with two different VMIDs. Eventually, the
      VMID used by the second vcpu will get reassigned, and things will
      really go wrong...
      A simple solution would be to drop this initial check, and always take
      the lock. This is likely to cause performance issues. A middle ground
      is to convert the spinlock to a rwlock, and only take the read lock
      on the fast path. If the check fails at that point, drop it and
      acquire the write lock, rechecking the condition.
      This ensures that the above scenario doesn't occur.
      Cc: stable@vger.kernel.org
      Reported-by: default avatarMark Rutland <mark.rutland@arm.com>
      Tested-by: default avatarShannon Zhao <zhaoshenglong@huawei.com>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
  7. 26 Mar, 2018 2 commits
    • Marc Zyngier's avatar
      KVM: arm/arm64: vgic-its: Fix potential overrun in vgic_copy_lpi_list · 7d8b44c5
      Marc Zyngier authored
      vgic_copy_lpi_list() parses the LPI list and picks LPIs targeting
      a given vcpu. We allocate the array containing the intids before taking
      the lpi_list_lock, which means we can have an array size that is not
      equal to the number of LPIs.
      This is particularly obvious when looking at the path coming from
      vgic_enable_lpis, which is not a command, and thus can run in parallel
      with commands:
      vcpu 0:                                        vcpu 1:
            intids = kmalloc_array(irq_count)
                                                     MAPI(lpi targeting vcpu 0)
              intids[i++] = irq->intid;
      At that stage, we will happily overrun the intids array. Boo. An easy
      fix is is to break once the array is full. The MAPI command will update
      the config anyway, and we won't miss a thing. We also make sure that
      lpi_list_count is read exactly once, so that further updates of that
      value will not affect the array bound check.
      Cc: stable@vger.kernel.org
      Fixes: ccb1d791 ("KVM: arm64: vgic-its: Fix pending table sync")
      Reviewed-by: default avatarAndre Przywara <andre.przywara@arm.com>
      Reviewed-by: default avatarEric Auger <eric.auger@redhat.com>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
    • Marc Zyngier's avatar
      KVM: arm/arm64: vgic: Disallow Active+Pending for level interrupts · 67b5b673
      Marc Zyngier authored
      It was recently reported that VFIO mediated devices, and anything
      that VFIO exposes as level interrupts, do no strictly follow the
      expected logic of such interrupts as it only lowers the input
      line when the guest has EOId the interrupt at the GIC level, rather
      than when it Acked the interrupt at the device level.
      THe GIC's Active+Pending state is fundamentally incompatible with
      this behaviour, as it prevents KVM from observing the EOI, and in
      turn results in VFIO never dropping the line. This results in an
      interrupt storm in the guest, which it really never expected.
      As we cannot really change VFIO to follow the strict rules of level
      signalling, let's forbid the A+P state altogether, as it is in the
      end only an optimization. It ensures that we will transition via
      an invalid state, which we can use to notify VFIO of the EOI.
      Reviewed-by: default avatarEric Auger <eric.auger@redhat.com>
      Tested-by: default avatarEric Auger <eric.auger@redhat.com>
      Tested-by: default avatarShunyong Yang <shunyong.yang@hxt-semitech.com>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
  8. 19 Mar, 2018 22 commits
  9. 14 Mar, 2018 5 commits
    • Marc Zyngier's avatar
      kvm: arm/arm64: vgic-v3: Tighten synchronization for guests using v2 on v3 · 27e91ad1
      Marc Zyngier authored
      On guest exit, and when using GICv2 on GICv3, we use a dsb(st) to
      force synchronization between the memory-mapped guest view and
      the system-register view that the hypervisor uses.
      This is incorrect, as the spec calls out the need for "a DSB whose
      required access type is both loads and stores with any Shareability
      attribute", while we're only synchronizing stores.
      We also lack an isb after the dsb to ensure that the latter has
      actually been executed before we start reading stuff from the sysregs.
      The fix is pretty easy: turn dsb(st) into dsb(sy), and slap an isb()
      just after.
      Cc: stable@vger.kernel.org
      Fixes: f68d2b1b ("arm64: KVM: Implement vgic-v3 save/restore")
      Acked-by: default avatarChristoffer Dall <cdall@kernel.org>
      Reviewed-by: default avatarAndre Przywara <andre.przywara@arm.com>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
    • Marc Zyngier's avatar
      KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid · 16ca6a60
      Marc Zyngier authored
      The vgic code is trying to be clever when injecting GICv2 SGIs,
      and will happily populate LRs with the same interrupt number if
      they come from multiple vcpus (after all, they are distinct
      interrupt sources).
      Unfortunately, this is against the letter of the architecture,
      and the GICv2 architecture spec says "Each valid interrupt stored
      in the List registers must have a unique VirtualID for that
      virtual CPU interface.". GICv3 has similar (although slightly
      ambiguous) restrictions.
      This results in guests locking up when using GICv2-on-GICv3, for
      example. The obvious fix is to stop trying so hard, and inject
      a single vcpu per SGI per guest entry. After all, pending SGIs
      with multiple source vcpus are pretty rare, and are mostly seen
      in scenario where the physical CPUs are severely overcomitted.
      But as we now only inject a single instance of a multi-source SGI per
      vcpu entry, we may delay those interrupts for longer than strictly
      necessary, and run the risk of injecting lower priority interrupts
      in the meantime.
      In order to address this, we adopt a three stage strategy:
      - If we encounter a multi-source SGI in the AP list while computing
        its depth, we force the list to be sorted
      - When populating the LRs, we prevent the injection of any interrupt
        of lower priority than that of the first multi-source SGI we've
      - Finally, the injection of a multi-source SGI triggers the request
        of a maintenance interrupt when there will be no pending interrupt
        in the LRs (HCR_NPIE).
      At the point where the last pending interrupt in the LRs switches
      from Pending to Active, the maintenance interrupt will be delivered,
      allowing us to add the remaining SGIs using the same process.
      Cc: stable@vger.kernel.org
      Fixes: 0919e84c ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
      Acked-by: default avatarChristoffer Dall <cdall@kernel.org>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
    • Ard Biesheuvel's avatar
      KVM: arm/arm64: Reduce verbosity of KVM init log · 76600428
      Ard Biesheuvel authored
      On my GICv3 system, the following is printed to the kernel log at boot:
         kvm [1]: 8-bit VMID
         kvm [1]: IDMAP page: d20e35000
         kvm [1]: HYP VA range: 800000000000:ffffffffffff
         kvm [1]: vgic-v2@2c020000
         kvm [1]: GIC system register CPU interface enabled
         kvm [1]: vgic interrupt IRQ1
         kvm [1]: virtual timer IRQ4
         kvm [1]: Hyp mode initialized successfully
      The KVM IDMAP is a mapping of a statically allocated kernel structure,
      and so printing its physical address leaks the physical placement of
      the kernel when physical KASLR in effect. So change the kvm_info() to
      kvm_debug() to remove it from the log output.
      While at it, trim the output a bit more: IRQ numbers can be found in
      /proc/interrupts, and the HYP VA and vgic-v2 lines are not highly
      informational either.
      Cc: <stable@vger.kernel.org>
      Acked-by: default avatarWill Deacon <will.deacon@arm.com>
      Acked-by: default avatarChristoffer Dall <cdall@kernel.org>
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
    • Christoffer Dall's avatar
      KVM: arm/arm64: Reset mapped IRQs on VM reset · 413aa807
      Christoffer Dall authored
      We currently don't allow resetting mapped IRQs from userspace, because
      their state is controlled by the hardware.  But we do need to reset the
      state when the VM is reset, so we provide a function for the 'owner' of
      the mapped interrupt to reset the interrupt state.
      Currently only the timer uses mapped interrupts, so we call this
      function from the timer reset logic.
      Cc: stable@vger.kernel.org
      Fixes: 4c60e360 ("KVM: arm/arm64: Provide a get_input_level for the arch timer")
      Signed-off-by: default avatarChristoffer Dall <cdall@kernel.org>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
    • Christoffer Dall's avatar
      KVM: arm/arm64: Avoid vcpu_load for other vcpu ioctls than KVM_RUN · e21a4f3a
      Christoffer Dall authored
      Calling vcpu_load() registers preempt notifiers for this vcpu and calls
      kvm_arch_vcpu_load().  The latter will soon be doing a lot of heavy
      lifting on arm/arm64 and will try to do things such as enabling the
      virtual timer and setting us up to handle interrupts from the timer
      Loading state onto hardware registers and enabling hardware to signal
      interrupts can be problematic when we're not actually about to run the
      VCPU, because it makes it difficult to establish the right context when
      handling interrupts from the timer, and it makes the register access
      code difficult to reason about.
      Luckily, now when we call vcpu_load in each ioctl implementation, we can
      simply remove the call from the non-KVM_RUN vcpu ioctls, and our
      kvm_arch_vcpu_load() is only used for loading vcpu content to the
      physical CPU when we're actually going to run the vcpu.
      Cc: stable@vger.kernel.org
      Fixes: 9b062471 ("KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl")
      Reviewed-by: default avatarJulien Grall <julien.grall@arm.com>
      Reviewed-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: default avatarAndrew Jones <drjones@redhat.com>
      Signed-off-by: default avatarChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>