1. 15 Jul, 2019 1 commit
  2. 19 Jun, 2019 1 commit
  3. 14 Jun, 2019 3 commits
  4. 09 Jun, 2019 1 commit
  5. 05 Jun, 2019 3 commits
  6. 30 May, 2019 1 commit
  7. 22 May, 2019 1 commit
  8. 21 May, 2019 1 commit
  9. 13 May, 2019 1 commit
    • Chris Wilson's avatar
      drm/i915: Seal races between async GPU cancellation, retirement and signaling · c36beba6
      Chris Wilson authored
      Currently there is an underlying assumption that i915_request_unsubmit()
      is synchronous wrt the GPU -- that is the request is no longer in flight
      as we remove it. In the near future that may change, and this may upset
      our signaling as we can process an interrupt for that request while it
      is no longer in flight.
      
      CPU0					CPU1
      intel_engine_breadcrumbs_irq
      (queue request completion)
      					i915_request_cancel_signaling
      ...					...
      					i915_request_enable_signaling
      dma_fence_signal
      
      Hence in the time it took us to drop the lock to signal the request, a
      preemption event may have occurred and re-queued the request. In the
      process, that request would have seen I915_FENCE_FLAG_SIGNAL clear and
      so reused the rq->signal_link that was in use on CPU0, leading to bad
      pointer chasing in intel_engine_breadcrumbs_irq.
      
      A related issue was that if someone started listening for a signal on a
      completed but no longer in-flight request, we missed the opportunity to
      immediately signal that request.
      
      Furthermore, as intel_contexts may be immediately released during
      request retirement, in order to be entirely sure that
      intel_engine_breadcrumbs_irq may no longer dereference the intel_context
      (ce->signals and ce->signal_link), we must wait for irq spinlock.
      
      In order to prevent the race, we use a bit in the fence.flags to signal
      the transfer onto the signal list inside intel_engine_breadcrumbs_irq.
      For simplicity, we use the DMA_FENCE_FLAG_SIGNALED_BIT as it then
      quickly signals to any outside observer that the fence is indeed signaled.
      
      v2: Sketch out potential dma-fence API for manual signaling
      v3: And the test_and_set_bit()
      
      Fixes: 52c0fdb2 ("drm/i915: Replace global breadcrumbs with per-context interrupt tracking")
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190508112452.18942-1-chris@chris-wilson.co.uk
      (cherry picked from commit 0152b3b3)
      Signed-off-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      c36beba6
  10. 08 May, 2019 1 commit
    • Chris Wilson's avatar
      drm/i915: Seal races between async GPU cancellation, retirement and signaling · 0152b3b3
      Chris Wilson authored
      Currently there is an underlying assumption that i915_request_unsubmit()
      is synchronous wrt the GPU -- that is the request is no longer in flight
      as we remove it. In the near future that may change, and this may upset
      our signaling as we can process an interrupt for that request while it
      is no longer in flight.
      
      CPU0					CPU1
      intel_engine_breadcrumbs_irq
      (queue request completion)
      					i915_request_cancel_signaling
      ...					...
      					i915_request_enable_signaling
      dma_fence_signal
      
      Hence in the time it took us to drop the lock to signal the request, a
      preemption event may have occurred and re-queued the request. In the
      process, that request would have seen I915_FENCE_FLAG_SIGNAL clear and
      so reused the rq->signal_link that was in use on CPU0, leading to bad
      pointer chasing in intel_engine_breadcrumbs_irq.
      
      A related issue was that if someone started listening for a signal on a
      completed but no longer in-flight request, we missed the opportunity to
      immediately signal that request.
      
      Furthermore, as intel_contexts may be immediately released during
      request retirement, in order to be entirely sure that
      intel_engine_breadcrumbs_irq may no longer dereference the intel_context
      (ce->signals and ce->signal_link), we must wait for irq spinlock.
      
      In order to prevent the race, we use a bit in the fence.flags to signal
      the transfer onto the signal list inside intel_engine_breadcrumbs_irq.
      For simplicity, we use the DMA_FENCE_FLAG_SIGNALED_BIT as it then
      quickly signals to any outside observer that the fence is indeed signaled.
      
      v2: Sketch out potential dma-fence API for manual signaling
      v3: And the test_and_set_bit()
      
      Fixes: 52c0fdb2 ("drm/i915: Replace global breadcrumbs with per-context interrupt tracking")
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190508112452.18942-1-chris@chris-wilson.co.uk
      0152b3b3
  11. 24 Apr, 2019 2 commits
  12. 23 Apr, 2019 1 commit
  13. 19 Apr, 2019 1 commit
  14. 16 Apr, 2019 1 commit
  15. 01 Apr, 2019 1 commit
  16. 20 Mar, 2019 1 commit
  17. 27 Feb, 2019 1 commit
  18. 04 Jan, 2019 1 commit
  19. 24 Dec, 2018 1 commit
  20. 07 Dec, 2018 1 commit
  21. 03 Dec, 2018 1 commit
  22. 16 Nov, 2018 1 commit
  23. 26 Oct, 2018 1 commit
    • Chris Wilson's avatar
      dma-buf: Update reservation shared_count after adding the new fence · a590d0fd
      Chris Wilson authored
      We need to serialise the addition of a new fence into the shared list
      such that the fence is visible before we claim it is there. Otherwise a
      concurrent reader of the shared fence list will see an uninitialised
      fence slot before it is set.
      
        <4> [109.613162] general protection fault: 0000 [#1] PREEMPT SMP PTI
        <4> [109.613177] CPU: 1 PID: 1357 Comm: gem_busy Tainted: G     U            4.19.0-rc8-CI-CI_DRM_5035+ #1
        <4> [109.613189] Hardware name: Dell Inc. XPS 8300  /0Y2MRG, BIOS A06 10/17/2011
        <4> [109.613252] RIP: 0010:i915_gem_busy_ioctl+0x146/0x380 [i915]
        <4> [109.613261] Code: 0b 43 04 49 83 c6 08 4d 39 e6 89 43 04 74 6d 4d 8b 3e e8 5d 54 f4 e0 85 c0 74 0d 80 3d 08 71 1d 00 00
        0f 84 bb 00 00 00 31 c0 <49> 81 7f 08 20 3a 2c a0 75 cc 41 8b 97 50 02 00 00 49 8b 8f a8 00
        <4> [109.613283] RSP: 0018:ffffc9000044bcf8 EFLAGS: 00010246
        <4> [109.613292] RAX: 0000000000000000 RBX: ffffc9000044bdc0 RCX: 0000000000000001
        <4> [109.613302] RDX: 0000000000000000 RSI: 00000000ffffffff RDI: ffffffff822474a0
        <4> [109.613311] RBP: ffffc9000044bd28 R08: ffff88021e158680 R09: 0000000000000001
        <4> [109.613321] R10: 0000000000000040 R11: 0000000000000000 R12: ffff88021e1641b8
        <4> [109.613331] R13: 0000000000000003 R14: ffff88021e1641b0 R15: 6b6b6b6b6b6b6b6b
        <4> [109.613341] FS:  00007f9c9fc84980(0000) GS:ffff880227a40000(0000) knlGS:0000000000000000
        <4> [109.613352] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        <4> [109.613360] CR2: 00007f9c9fcb8000 CR3: 00000002247d4005 CR4: 00000000000606e0
      
      Fixes: 27836b64 ("dma-buf: remove shared fence staging in reservation object")
      Testcase: igt/gem_busy/close-race
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Christian König <christian.koenig@amd.com>
      Cc: Junwei Zhang <Jerry.Zhang@amd.com>
      Cc: Huang Rui <ray.huang@amd.com>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Reviewed-by: default avatarChristian König <christian.koenig@amd.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20181026080302.11507-1-chris@chris-wilson.co.uk
      a590d0fd
  24. 25 Oct, 2018 2 commits
  25. 14 Sep, 2018 1 commit
  26. 12 Sep, 2018 9 commits