Improve rendering speed with ASpeed video
Rework the DRI rendering path to improve speed on ASpeed graphics:
- Find the first DRI device with at least one display. ASpeed devices often have an integrated graphics device as well, which has no outputs.
- The DRI device can be overridden with the environment variable FBWHIPTAIL_DRI_DEV (mainly for testing), or disabled by setting it empty (to test the Linux FB path).
- The Linux FB renderer can be forced into single-buffer mode by setting FBWHIPTAIL_LINUXFB_SINGLE nonempty (for testing).
- Use a shadow buffer for the DRI renderer instead of double buffering. This is preferred over drawing directly into video memory: a single-pass bulk copy to video memory is much faster.
- Use AVX2 for the bulk copy when available (needed to exceed 10 FPS on ASpeed).
- Additionally, superfluous keystrokes at the top/bottom of a menu are ignored without redrawing. Holding "down" to go to the bottom of a long menu no longer results in a long delay while the extra keystrokes are redrawn with no effect.
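
The three states of FBWHIPTAIL_DRI_DEV described in the list above (unset, empty, set to a device) can be told apart roughly as follows. This is only a sketch: the enum and the function name `classify_dri_env` are hypothetical and are not taken from the actual change.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type, for illustration only. */
enum dri_choice {
    DRI_AUTODETECT, /* probe for the first DRI device with a display */
    DRI_DISABLED,   /* skip DRI and use the Linux FB path */
    DRI_OVERRIDE    /* use the device named by the variable */
};

/* Classify the value returned by getenv("FBWHIPTAIL_DRI_DEV").
 * getenv() returns NULL when the variable is unset, which is distinct
 * from a variable that is set to the empty string. */
static enum dri_choice classify_dri_env(const char *value)
{
    if (value == NULL)
        return DRI_AUTODETECT;
    if (value[0] == '\0')
        return DRI_DISABLED;
    return DRI_OVERRIDE;
}
```

A caller would pass `getenv("FBWHIPTAIL_DRI_DEV")` and only use the string as a device when the result is `DRI_OVERRIDE`.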
Before, ASpeed graphics got about 2-5 FPS, depending on the size of the menu, and there were long initial delays of about 1 second on the first two frames. With these improvements, rendering is now fast enough to keep up with keyboard repeats (>10 FPS on ASpeed).
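
Keeping up with keyboard repeats also depends on not redrawing for keystrokes that change nothing, as noted in the list above. A minimal sketch of that idea, assuming a hypothetical `struct menu` and `menu_move` helper (neither name comes from the real code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical menu state; field names are illustrative only. */
struct menu {
    int selected; /* index of the highlighted item */
    int count;    /* number of items, must be > 0 */
};

/* Move the selection by delta (-1 for "up", +1 for "down").
 * Returns true only when the selection actually changed, so the caller
 * can skip the redraw for keystrokes at the top or bottom of the menu. */
static bool menu_move(struct menu *m, int delta)
{
    assert(m != NULL && m->count > 0);
    int next = m->selected + delta;
    if (next < 0)
        next = 0;
    if (next >= m->count)
        next = m->count - 1;
    if (next == m->selected)
        return false; /* already at the edge: nothing to redraw */
    m->selected = next;
    return true;
}
```

The event loop would call the redraw routine only when `menu_move` returns true, so a held "down" key at the last item costs no frames.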
This is also the rendering path used for i915. That was fast enough before (presumably because the framebuffer is in system memory), but the new method is just as fast, so I did not keep two rendering paths. DRI drivers can indicate whether they prefer drawing in a shadow buffer or a backbuffer, and i915 does say it prefers a shadow buffer.
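
The shadow-buffer approach amounts to rendering the whole frame in system memory and then flushing it to video memory in one pass. The sketch below illustrates that flush with an AVX2 path and a `memcpy` fallback. It assumes GCC or Clang (for `__builtin_cpu_supports` and the `target` attribute); the names `copy_avx2` and `flush_shadow` are hypothetical, and the real change may use different instructions or a different loop structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define HAVE_X86 1
#endif

#ifdef HAVE_X86
/* Copy using 32-byte unaligned AVX2 loads and stores.
 * Any tail shorter than 32 bytes is handled by memcpy. */
__attribute__((target("avx2")))
static void copy_avx2(uint8_t *dst, const uint8_t *src, size_t len)
{
    size_t i = 0;
    for (; i + 32 <= len; i += 32) {
        __m256i v = _mm256_loadu_si256((const __m256i *)(src + i));
        _mm256_storeu_si256((__m256i *)(dst + i), v);
    }
    memcpy(dst + i, src + i, len - i);
}
#endif

/* Flush a fully rendered shadow buffer (system memory) to the scanout
 * buffer (video memory) in a single pass. AVX2 is used only when the
 * CPU reports support at run time. */
static void flush_shadow(uint8_t *scanout, const uint8_t *shadow, size_t len)
{
    assert(scanout != NULL && shadow != NULL);
#ifdef HAVE_X86
    if (__builtin_cpu_supports("avx2")) {
        copy_avx2(scanout, shadow, len);
        return;
    }
#endif
    memcpy(scanout, shadow, len);
}
```

All drawing goes to the shadow buffer, and `flush_shadow` is called once per frame, so video memory is written sequentially and never read back.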