1. 26 May, 2017 3 commits
  2. 20 May, 2017 1 commit
  3. 02 May, 2017 1 commit
  4. 21 Apr, 2017 3 commits
    • nvme/pci: Poll CQ on timeout · 7776db1c
      Keith Busch authored
      If an IO timeout occurs, it's helpful to know if the controller did not
      post a completion or the driver missed an interrupt. While we never expect
      the latter, this patch will make it possible to tell the difference so
      we don't have to guess.
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Tested-by: Johannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
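      The distinction drawn in the commit above can be sketched in plain C. This is a minimal user-space model, not the driver's code: the types `cqe_sketch` and `cq_sketch` and the function `classify_timeout` are hypothetical names. It relies only on the NVMe convention that bit 0 of the completion status word is the phase tag, which flips each time the queue wraps.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified completion queue entry. */
struct cqe_sketch {
    uint16_t command_id;
    uint16_t status;      /* bit 0 is the phase tag */
};

/* Hypothetical view of the driver's completion queue state. */
struct cq_sketch {
    const struct cqe_sketch *entries;
    uint16_t depth;
    uint16_t head;        /* next entry the driver expects to consume */
    uint8_t phase;        /* phase value that marks a new entry */
};

enum timeout_cause { CONTROLLER_SILENT, MISSED_INTERRUPT };

/*
 * Poll the queue from the current head. If the timed-out command already
 * has a posted completion, the interrupt was missed; otherwise the
 * controller never answered.
 */
static enum timeout_cause classify_timeout(const struct cq_sketch *q,
                                           uint16_t command_id)
{
    assert(q->depth > 0);
    uint16_t head = q->head;
    uint8_t phase = q->phase;

    for (uint16_t n = 0; n < q->depth; n++) {
        const struct cqe_sketch *e = &q->entries[head];

        if ((e->status & 1) != phase)
            break;                      /* no further new entries */
        if (e->command_id == command_id)
            return MISSED_INTERRUPT;
        if (++head == q->depth) {       /* wrap and flip expected phase */
            head = 0;
            phase ^= 1;
        }
    }
    return CONTROLLER_SILENT;
}
```

      A timeout handler built along these lines could log which of the two cases occurred before deciding whether to abort the command or reset the controller.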
    • nvme: improve performance for virtual NVMe devices · f9f38e33
      Helen Koike authored
      This change provides a mechanism to reduce the number of MMIO doorbell
      writes for the NVMe driver. When running in a virtualized environment
      like QEMU, the cost of an MMIO is quite hefty here. The main idea for
      the patch is to provide the device with two memory locations:
       1) to store the doorbell values so they can be looked up without the
          doorbell MMIO write
       2) to store an event index.
      I believe the doorbell value is obvious, the event index not so much.
      Similar to the virtio specification, the virtual device can tell the
      driver (guest OS) not to write MMIO unless you are writing past this
      value.
      FYI: doorbell values are written by the nvme driver (guest OS) and the
      event index is written by the virtual device (host OS).
      The patch implements a new admin command that will communicate where
      these two memory locations reside. If the command fails, the nvme
      driver will work as before without any optimizations.
        Eric Northup <digitaleric@google.com>
        Frank Swiderski <fes@google.com>
        Ted Tso <tytso@mit.edu>
        Keith Busch <keith.busch@intel.com>
      Just to give an idea of the performance boost with the vendor
      extension: running fio [1] with a stock NVMe driver, I get about 200K
      read IOPS; with my vendor patch, I get about 1000K read IOPS. This was
      running with a null device i.e. the backing device simply returned
      success on every read IO request.
      [1] Running on a 4 core machine:
        fio --time_based --name=benchmark --runtime=30
        --filename=/dev/nvme0n1 --nrfiles=1 --ioengine=libaio --iodepth=32
        --direct=1 --invalidate=1 --verify=0 --verify_fatal=0 --numjobs=4
        --rw=randread --blocksize=4k --randrepeat=false
      Signed-off-by: Rob Nelson <rlnelson@google.com>
      [mlin: port for upstream]
      Signed-off-by: Ming Lin <mlin@kernel.org>
      [koike: updated for upstream]
      Signed-off-by: Helen Koike <helen.koike@collabora.co.uk>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Keith Busch <keith.busch@intel.com>
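      The event index idea in the commit above can be illustrated with a small user-space sketch. It assumes a virtio-style wrap-safe comparison on 16-bit indices; the names `need_mmio_doorbell`, `shadow_doorbell`, and `ring_doorbell` are hypothetical, and the counter stands in for a real MMIO write. A real driver would also need a memory barrier between updating the shadow value and reading the event index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True when the doorbell moved past event_idx, i.e. event_idx lies in
 * [old_idx, new_idx). All arithmetic is modulo 2^16, so queue index
 * wrap-around is handled.
 */
static bool need_mmio_doorbell(uint16_t event_idx, uint16_t new_idx,
                               uint16_t old_idx)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}

/* Hypothetical pair of shared memory locations for one queue. */
struct shadow_doorbell {
    uint32_t *shadow;           /* written by the driver (guest OS) */
    const uint32_t *event_idx;  /* written by the virtual device (host OS) */
    unsigned mmio_writes;       /* stand-in for actual MMIO doorbell writes */
};

static void ring_doorbell(struct shadow_doorbell *db, uint16_t new_idx)
{
    assert(db && db->shadow && db->event_idx);

    uint16_t old_idx = (uint16_t)*db->shadow;

    /* Always publish the new value in shared memory. */
    *db->shadow = new_idx;

    /* Only pay for the MMIO write when the device asked for it. */
    if (need_mmio_doorbell((uint16_t)*db->event_idx, new_idx, old_idx))
        db->mmio_writes++;
}
```

      With an event index far ahead of the current doorbell value, repeated calls to `ring_doorbell` only touch shared memory, which is where the saving in a virtualized environment would come from.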
    • nvme/pci: Don't set reserved SQ create flags · 81c1cd98
      Keith Busch authored
      The QPRIO field is only valid if weighted round robin arbitration is used,
      and this driver doesn't enable that controller configuration option.
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
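      As a rough illustration of the rule in the commit above, the sketch below builds the flags for a Create I/O Submission Queue command. The macro and function names are hypothetical, and the `weighted_round_robin` parameter is an assumption added for illustration; the commit itself simply stops setting the priority bits. The bit positions follow the NVMe layout, where bit 0 is "physically contiguous" and bits 2:1 hold QPRIO.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SQ_PHYS_CONTIG  (1u << 0)   /* queue memory is physically contiguous */
#define SQ_PRIO_MEDIUM  (2u << 1)   /* QPRIO = 10b, medium priority */

static_assert(SQ_PRIO_MEDIUM == 0x4, "QPRIO occupies bits 2:1");

/*
 * QPRIO is only meaningful when the controller is configured for weighted
 * round robin arbitration; otherwise the bits are reserved and stay zero.
 */
static uint16_t sq_create_flags(bool weighted_round_robin)
{
    uint16_t flags = SQ_PHYS_CONTIG;

    if (weighted_round_robin)
        flags |= SQ_PRIO_MEDIUM;
    return flags;
}
```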
  5. 20 Apr, 2017 2 commits
    • nvme: Adjust the Samsung APST quirk · ff5350a8
      Andy Lutomirski authored
      I got a couple more reports: the Samsung APST issue appears to
      affect multiple 950-series devices in Dell XPS 15 9550 and Precision
      5510 laptops.  Change the quirk: rather than blacklisting the
      firmware on the first problematic SSD that was reported, disable
      APST on all 144d:a802 devices if they're installed in the two
      affected Dell models.  While we're at it, disable only the deepest
      sleep state instead of all of them -- the reporters say that this is
      sufficient to fix the problem.
      (I have a device that appears to be entirely identical to one of the
      affected devices, but I have a different Dell laptop, so it's not
      the case that all Samsung devices with firmware BXW75D0Q are broken
      under all circumstances.)
      Samsung engineers have an affected system, and hopefully they'll
      give us a better workaround some time soon.  In the meantime, this
      should minimize regressions.
      See https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1678184
      Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Jens Axboe <axboe@fb.com>
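      The matching rule described in the commit above can be sketched as a small predicate. This is only an illustration: the function name `skip_deepest_power_state` is hypothetical, and passing the platform's product name as a plain string is an assumption about how the platform would be identified. The PCI ID 144d:a802 and the two Dell model names are taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Returns true when the deepest APST power state should be avoided:
 * the device is 144d:a802 and it sits in one of the two affected laptops.
 * The same SSD in any other machine is left alone.
 */
static bool skip_deepest_power_state(uint16_t vendor, uint16_t device,
                                     const char *product_name)
{
    assert(product_name != NULL);

    if (vendor != 0x144d || device != 0xa802)
        return false;

    return strcmp(product_name, "XPS 15 9550") == 0 ||
           strcmp(product_name, "Precision 5510") == 0;
}
```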
    • nvme: split nvme status from block req->errors · 27fa9bc5
      Christoph Hellwig authored
      We want our own clearly defined error field for NVMe passthrough commands,
      and the request errors field is going away in its current form.
      Just store the status and result field in the nvme_request field from
      hardirq completion context (using a new helper) and then generate a
      Linux errno for the block layer only when we actually need it.
      Because we can't overload the status value with a negative error code
      for cancelled commands, we now have a flags field in struct nvme_request
      that contains a bit for this condition.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
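      A minimal sketch of the split described in the commit above: the completion path only records the raw NVMe status, and an errno is derived later, on demand. All names here (`nvme_req_sketch`, `REQ_CANCELLED`, `req_end`, `req_errno`) are hypothetical, and the specific errno values chosen are assumptions for illustration rather than the driver's actual mapping.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define REQ_CANCELLED (1u << 0)   /* hypothetical flag bit */

/* Hypothetical per-request state kept by the NVMe layer. */
struct nvme_req_sketch {
    uint16_t status;   /* raw NVMe status, never a negative errno */
    uint8_t flags;     /* driver-side conditions such as cancellation */
};

/* Completion path: just store what the controller reported. */
static void req_end(struct nvme_req_sketch *req, uint16_t status)
{
    assert(req != NULL);
    req->status = status;
}

/* Called only when the block layer actually needs a Linux errno. */
static int req_errno(const struct nvme_req_sketch *req)
{
    assert(req != NULL);

    if (req->flags & REQ_CANCELLED)
        return -EINTR;     /* assumed mapping for cancelled commands */
    if (req->status == 0)
        return 0;
    return -EIO;           /* assumed catch-all for any NVMe error */
}
```

      Keeping the cancellation in a separate flag means passthrough users can still read the untouched NVMe status value.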
  6. 18 Apr, 2017 1 commit
  7. 08 Apr, 2017 1 commit
  8. 05 Apr, 2017 1 commit
  9. 04 Apr, 2017 1 commit
  10. 31 Mar, 2017 1 commit
  11. 02 Mar, 2017 2 commits
    • nvme: Complete all stuck requests · 302ad8cc
      Keith Busch authored
      If the nvme driver is shutting down its controller, the driver will not
      start the queues up again, preventing blk-mq's hot CPU notifier from
      making forward progress.
      To fix that, this patch starts a request_queue freeze when the driver
      resets a controller so no new requests may enter. The driver will wait
      for frozen after IO queues are restarted to ensure the queue reference
      can be reinitialized when nvme requests to unfreeze the queues.
      If the driver is doing a safe shutdown, the driver will wait for the
      controller to successfully complete all inflight requests so that we
      don't unnecessarily fail them. Once the controller has been disabled,
      the queues will be restarted to force remaining entered requests to end
      in failure so that blk-mq's hot cpu notifier may progress.
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • nvme: allocate nvme_queue in correct node · d3af3ecd
      Shaohua Li authored
      nvme_queue is (mostly) a per-CPU queue. Allocate it on the node where
      blk-mq will use it.
      Signed-off-by: Shaohua Li <shli@fb.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  12. 23 Feb, 2017 1 commit
  13. 22 Feb, 2017 3 commits
  14. 17 Feb, 2017 2 commits
  15. 06 Feb, 2017 1 commit
  16. 31 Jan, 2017 2 commits
  17. 17 Jan, 2017 1 commit
  18. 13 Jan, 2017 1 commit
  19. 21 Dec, 2016 2 commits
  20. 19 Dec, 2016 1 commit
  21. 14 Dec, 2016 1 commit
  22. 09 Dec, 2016 1 commit
    • block: improve handling of the magic discard payload · f9d03f96
      Christoph Hellwig authored
      Instead of allocating a single unused biovec for discard requests, send
      them down without any payload.  Instead we allow the driver to add a
      "special" payload using a biovec embedded into struct request (unioned
      over other fields never used while in the driver), and overloading
      the number of segments for this case.
      This has a couple of advantages:
       - we don't have to allocate the bio_vec
       - the amount of special casing for discard requests in the block
         layer is significantly reduced
       - using this same scheme for other request types is trivial,
         which will be important for implementing the new WRITE_ZEROES
         op on devices where it actually requires a payload (e.g. SCSI)
       - we can get rid of playing games with the request length, as
         we'll never touch it and completions will work just fine
       - it will allow us to support ranged discard operations in the
         future by merging non-contiguous discard bios into a single
         request
       - last but not least it removes a lot of code
      This patch is the common base for my WIP series for ranged discards and to
      remove discard_zeroes_data in favor of always using REQ_OP_WRITE_ZEROES,
      so it would be good to get it in quickly.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  23. 06 Dec, 2016 1 commit
  24. 05 Dec, 2016 1 commit
  25. 16 Nov, 2016 1 commit
  26. 15 Nov, 2016 1 commit
  27. 10 Nov, 2016 2 commits
    • nvme: don't pass the full CQE to nvme_complete_async_event · 7bf58533
      Christoph Hellwig authored
      We only need the status and result fields, and passing them explicitly
      makes life a lot easier for the Fibre Channel transport which doesn't
      have a full CQE for the fast path case.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
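      To show why the two fields are enough, here is a hedged sketch of an async event handler that takes only a status and a result value. The names `aen_sketch` and `decode_async_event` are hypothetical, and the sketch assumes the status has already had its phase bit stripped. The field positions follow the NVMe layout of the Asynchronous Event Request completion: event type in bits 2:0, event information in bits 15:8, and log page identifier in bits 23:16.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of an asynchronous event notification. */
struct aen_sketch {
    uint8_t type;       /* result bits 2:0 */
    uint8_t info;       /* result bits 15:8 */
    uint8_t log_page;   /* result bits 23:16 */
};

/*
 * Decode an event from just the status and the 32-bit result, so a
 * transport without a full CQE at hand can still report it.
 * Returns false if the command did not complete successfully.
 */
static bool decode_async_event(uint16_t status, uint32_t result,
                               struct aen_sketch *out)
{
    assert(out != NULL);

    if (status != 0)
        return false;

    out->type = result & 0x7;
    out->info = (result >> 8) & 0xff;
    out->log_page = (result >> 16) & 0xff;
    return true;
}
```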
    • nvme: introduce struct nvme_request · d49187e9
      Christoph Hellwig authored
      This adds a shared per-request structure for all NVMe I/O.  This structure
      is embedded as the first member in all NVMe transport drivers' request
      private data and allows implementing common functionality between the
      drivers.
      The first use is to replace the current abuse of the SCSI command
      passthrough fields in struct request for the NVMe command passthrough,
      but it will grow a few more fields to allow implementing things
      like common abort handlers in the future.
      The passthrough commands are handled by having a pointer to the SQE
      (struct nvme_command) in struct nvme_request, and the union of the
      possible result fields, which had to be turned from an anonymous
      into a named union for that purpose.  This avoids having to pass
      a reference to a full CQE around and thus makes checking the result
      a lot more lightweight.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
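      The "first member" layout from the commit above can be sketched as follows. Every type and function name here is hypothetical and heavily simplified; the point is only that C guarantees a pointer to a struct also points to its first member, so shared code can reach the common part without knowing which transport owns the request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the SQE and the named result union. */
struct command_sketch {
    uint8_t opcode;
    uint16_t command_id;
};

union result_sketch {
    uint16_t u16;
    uint32_t u32;
    uint64_t u64;
};

/* Shared per-request part, common to every transport. */
struct common_request {
    struct command_sketch *cmd;   /* pointer to the submitted command */
    union result_sketch result;   /* filled in on completion */
};

/* Two transports embed the common part as their first member. */
struct pci_request_sketch {
    struct common_request req;
    int nents;
};

struct rdma_request_sketch {
    struct common_request req;
    uint32_t num_sge;
};

static_assert(offsetof(struct pci_request_sketch, req) == 0,
              "common part must come first");
static_assert(offsetof(struct rdma_request_sketch, req) == 0,
              "common part must come first");

/* Transport-agnostic accessor: works on any request's private data. */
static struct common_request *common_req(void *transport_private)
{
    return transport_private;
}
```

      Code that checks a passthrough result only needs `common_req(pdu)->result`, with no full CQE passed around.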
  28. 28 Oct, 2016 1 commit