    gtkcairoblur: Unroll inner loop for common radius values · d0dc1f52
    Alexander Larsson authored
    This unrolls the inner blur loop for radii 1-10, so the divisor is a
    compile-time constant and the compiler can use a divide-by-constant
    operation instead of a generic division.
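
The idea can be sketched roughly as follows. This is a minimal illustration, not the actual gtkcairoblur code: `box_blur_row`, `BLUR_ROW_KERNEL`, and the trailing-window blur are all hypothetical stand-ins, and only widths 1-4 are specialized to keep it short. Each `case` instantiates the same loop with a literal divisor, so the compiler sees `sum / 3` rather than `sum / d`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single-row box blur over 8-bit samples.
 * dst[i] is the sum of the last d samples ending at i (missing samples
 * before the row start count as 0), divided by d. */
static void
box_blur_row (const uint8_t *src, uint8_t *dst, size_t n, unsigned d)
{
  size_t i;
  unsigned sum;

  assert (d > 0);

  /* The loop body is written once. D is substituted textually, so for
   * the specialized cases the divisor is a literal constant. Each
   * iteration adds the sample entering the window and subtracts the
   * one leaving it. */
#define BLUR_ROW_KERNEL(D)                      \
  for (i = 0, sum = 0; i < n; i++)              \
    {                                           \
      sum += src[i];                            \
      if (i >= (D))                             \
        sum -= src[i - (D)];                    \
      dst[i] = (uint8_t) (sum / (D));           \
    }

  switch (d)
    {
    case 1: BLUR_ROW_KERNEL (1); break;
    case 2: BLUR_ROW_KERNEL (2); break;
    case 3: BLUR_ROW_KERNEL (3); break;
    case 4: BLUR_ROW_KERNEL (4); break;
    /* Fallback: same loop, but the divisor is only known at run time. */
    default: BLUR_ROW_KERNEL (d); break;
    }

#undef BLUR_ROW_KERNEL
}
```

Calling `box_blur_row (src, dst, n, 2)` takes the specialized path, while a width such as 5 falls through to the generic loop. Both produce the same kind of result, only the cost of the division differs.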
    
    Here is the blur-performance output before:
    
    Radius  1: 124.95 msec, 32.01 kpixels/msec:
    Radius  2: 117.27 msec, 34.11 kpixels/msec:
    Radius  3: 123.57 msec, 32.37 kpixels/msec:
    Radius  4: 118.17 msec, 33.85 kpixels/msec:
    Radius  5: 119.32 msec, 33.52 kpixels/msec:
    Radius  6: 124.17 msec, 32.21 kpixels/msec:
    Radius  7: 121.04 msec, 33.05 kpixels/msec:
    Radius  8: 130.64 msec, 30.62 kpixels/msec:
    Radius  9: 119.47 msec, 33.48 kpixels/msec:
    Radius 10: 117.95 msec, 33.91 kpixels/msec:
    Radius 11: 122.38 msec, 32.68 kpixels/msec:
    Radius 12: 121.92 msec, 32.81 kpixels/msec:
    Radius 13: 125.45 msec, 31.89 kpixels/msec:
    Radius 14: 121.63 msec, 32.89 kpixels/msec:
    Radius 15: 120.18 msec, 33.28 kpixels/msec:
    
    And after:
    
    Radius  1: 42.26 msec, 94.65 kpixels/msec:
    Radius  2: 59.15 msec, 67.62 kpixels/msec:
    Radius  3: 60.29 msec, 66.35 kpixels/msec:
    Radius  4: 64.53 msec, 61.99 kpixels/msec:
    Radius  5: 60.07 msec, 66.59 kpixels/msec:
    Radius  6: 62.43 msec, 64.07 kpixels/msec:
    Radius  7: 60.36 msec, 66.27 kpixels/msec:
    Radius  8: 59.59 msec, 67.13 kpixels/msec:
    Radius  9: 76.17 msec, 52.51 kpixels/msec:
    Radius 10: 79.41 msec, 50.37 kpixels/msec:
    Radius 11: 118.92 msec, 33.64 kpixels/msec:
    Radius 12: 121.31 msec, 32.97 kpixels/msec:
    Radius 13: 118.30 msec, 33.81 kpixels/msec:
    Radius 14: 116.82 msec, 34.24 kpixels/msec:
    Radius 15: 116.99 msec, 34.19 kpixels/msec:
    
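    I.e. throughput roughly doubles for radii 2-8 and nearly triples for
    radius 1; radii 9 and 10 gain about 1.5x, and radii 11-15, which still
    use the generic division, are essentially unchanged.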
    
    https://bugzilla.gnome.org/show_bug.cgi?id=746468