    genirq/affinity: Spread irq vectors among present CPUs as far as possible · d3056812
    Ming Lei authored
    Commit 84676c1f ("genirq/affinity: assign vectors to all possible CPUs")
    tried to spread the interrupts across all possible CPUs to make sure that
    in case of physical hotplug (e.g. virtualization) the CPUs which get
    plugged in after the device was initialized are targeted by a hardware
    queue and the corresponding interrupt.
    
    This has a downside in cases where the ACPI tables claim that there are
    more possible CPUs than present CPUs and the number of interrupts to spread
    out is smaller than the number of possible CPUs. These bogus ACPI tables
    are unfortunately not uncommon.
    
    In such a case the vector spreading algorithm assigns interrupts to CPUs
    which can never be utilized and as a consequence these interrupts are
    unused instead of being mapped to present CPUs. As a result the performance
    of the device is suboptimal.
    
    To fix this, spread the interrupt vectors in two stages:
    
     1) Spread as many interrupts as possible among the present CPUs
    
     2) Spread the remaining vectors among non-present CPUs
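
The two stages above can be sketched as follows. This is a simplified illustration, not the kernel implementation: the names `spread_group` and `spread_two_stage` are hypothetical, NUMA node awareness is ignored, and CPUs are simply split into contiguous, near-equal groups. Because of that, the placement of non-present CPUs can differ from the listings in this changelog, while the placement of present CPUs follows the same idea.

```python
def spread_group(cpus, masks, start_vec):
    """Spread cpus over the vectors in contiguous, near-equal groups.

    Assignment starts at start_vec and wraps around. Returns the number
    of vectors that received at least one CPU. (Hypothetical helper,
    no NUMA awareness.)
    """
    cpus = sorted(cpus)
    nvecs = len(masks)
    if not cpus or nvecs == 0:
        return 0
    used = min(nvecs, len(cpus))
    base, extra = divmod(len(cpus), used)
    idx = 0
    for i in range(used):
        # The first `extra` vectors take one additional CPU.
        count = base + (1 if i < extra else 0)
        vec = (start_vec + i) % nvecs
        masks[vec].update(cpus[idx:idx + count])
        idx += count
    return used


def spread_two_stage(possible, present, nvecs):
    """Return one sorted CPU list per vector."""
    possible = set(possible)
    present = set(present) & possible
    masks = [set() for _ in range(nvecs)]

    # Stage 1: give every vector a present CPU where possible.
    used = spread_group(present, masks, 0)

    # Stage 2: spread the non-present CPUs, starting at the first vector
    # that stage 1 left empty, or at vector 0 if none was left empty.
    start = 0 if used >= nvecs else used
    spread_group(possible - present, masks, start)

    return [sorted(m) for m in masks]


if __name__ == "__main__":
    # 8 possible CPUs, CPUs 0-3 present, 4 vectors
    print(spread_two_stage(range(8), range(4), 4))
    # 8 possible CPUs, all present, 4 vectors
    print(spread_two_stage(range(8), range(8), 4))
```

With four present CPUs and four vectors, every vector in this sketch ends up with exactly one present CPU, which is the property the fix is after.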
    
    On an 8 core system, where CPU 0-3 are present and CPU 4-7 are not present,
    for a device with 4 queues the resulting interrupt affinity is:
    
      1) Before 84676c1f ("genirq/affinity: assign vectors to all possible CPUs")
    	irq 39, cpu list 0
    	irq 40, cpu list 1
    	irq 41, cpu list 2
    	irq 42, cpu list 3
    
      2) With 84676c1f ("genirq/affinity: assign vectors to all possible CPUs")
    	irq 39, cpu list 0-2
    	irq 40, cpu list 3-4,6
    	irq 41, cpu list 5
    	irq 42, cpu list 7
    
      3) With the refined vector spread applied:
    	irq 39, cpu list 0,4
    	irq 40, cpu list 1,6
    	irq 41, cpu list 2,5
    	irq 42, cpu list 3,7
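
To make the difference between listings 2) and 3) concrete, the following sketch counts the interrupts whose affinity contains no present CPU at all. The helper names `parse_cpu_list` and `stranded_irqs` are hypothetical; the affinity data is copied from the listings above, with CPUs 0-3 taken as present.

```python
def parse_cpu_list(text):
    """Parse a CPU list such as "3-4,6" into a set of CPU numbers."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def stranded_irqs(affinity, present):
    """Return the irqs whose CPU list has no present CPU in it."""
    present_cpus = parse_cpu_list(present)
    return [irq for irq, cpu_list in sorted(affinity.items())
            if not (parse_cpu_list(cpu_list) & present_cpus)]


# Listing 2): with 84676c1f
with_84676c1f = {39: "0-2", 40: "3-4,6", 41: "5", 42: "7"}
# Listing 3): with the refined vector spread
refined = {39: "0,4", 40: "1,6", 41: "2,5", 42: "3,7"}

if __name__ == "__main__":
    print(stranded_irqs(with_84676c1f, "0-3"))
    print(stranded_irqs(refined, "0-3"))
```

For listing 2) this reports irq 41 and irq 42, i.e. half of the four queues can never fire on a present CPU. For listing 3) it reports none.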
    
    On an 8 core system, where all CPUs are present the resulting interrupt
    affinity for the 4 queues is:
    
    	irq 39, cpu list 0,1
    	irq 40, cpu list 2,3
    	irq 41, cpu list 4,5
    	irq 42, cpu list 6,7
    
    This is independent of the number of CPUs which are online at the point of
    initialization because in such a system the offline CPUs can be easily
    onlined afterwards, while non-present CPUs need to be plugged physically
    or virtually which requires external interaction.
    
    The downside of this approach is that in case of physical hotplug the
    interrupt vector spreading might be suboptimal when CPUs 4-7 are physically
    plugged. It is suboptimal from a NUMA point of view, and due to the
    single-target nature of interrupt affinities the later-plugged CPUs might
    not be targeted by interrupts at all.
    
    However, physical hotplug systems are not the common case while the broken
    ACPI table disease is widespread. So it's preferred to have as many
    interrupts as possible utilized at the point where the device is
    initialized.
    
    Block multi-queue devices like NVME create a hardware queue per possible
    CPU, so the goal of commit 84676c1f to assign one interrupt vector per
    possible CPU is still achieved even with physical/virtual hotplug.
    
    [ tglx: Changed from online to present CPUs for the first spreading stage,
      	renamed variables for readability sake, added comments and massaged
      	changelog ]
    
    Reported-by: Laurence Oberman <loberman@redhat.com>
    Signed-off-by: Ming Lei <ming.lei@redhat.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: linux-block@vger.kernel.org
    Cc: Christoph Hellwig <hch@infradead.org>
    Link: https://lkml.kernel.org/r/20180308105358.1506-5-ming.lei@redhat.com