[edited based on feedback in replies]
Types of video acceleration:
- drawing 2D graphics (filling and blitting and GUI stuff)
- drawing 3D scenes (polygons and textures and transforms)
- shoving raw pixels at the display quickly (from some other source)
- video compression/decompression (H.264, etc)
- video editing and filtering (camera lens correction, virtual backgrounds, etc)
- compositing (overlaying and blending "layers" of all the above)
- non-video GPU stuff like neural nets (out of scope here)
- the ARM CPU (various generations) that runs Linux and general code
- the "VideoCore" GPU (VideoCore IV on Pi <=3, VideoCore VI on Pi 4) includes:
  - the "VPU", a general-ish processor that manages the system and runs proprietary "firmware" with its own OS
  - compression/decompression helpers (motion estimation, prediction, etc.)
  - video capture helpers (lens correction, pixel dematrixing, etc)
  - "QPU" units which run shaders for 3D rendering
  - the Hardware Video Scaler (HVS) that does output-time scaling and compositing
  - Pixel Valves and encoders that turn HVS output into video signals
- VideoCore firmware, proprietary code that runs on the VPU, manages the GPU overall and includes some support for OpenGL and video compression/decompression
- VideoCore APIs:
  - DispmanX, a VideoCore-specific C API to control framebuffer layering and compositing
  - Multi-Media Abstraction Layer (MMAL), a VideoCore-specific C API for audio/video processing
  - OpenMAX (OMX), a standard but deprecated C API for audio/video; the Pi's OMX Integration Layer (IL) wraps MMAL
  - EGL, a standard C API for OpenGL "backends"; the Pi has two implementations:
    - Broadcom libbrcmEGL (VideoCore IV / Pi <=3 only) proxies to proprietary VideoCore VPU firmware
    - Mesa libEGL, which is not VideoCore/Pi-specific (see Mesa below)
  - OpenVG, a standard but deprecated C API for 2D graphics; the Pi version (VideoCore IV / Pi <=3 only) wraps Broadcom's EGL
- the Linux kernel, with various subsystems:
  - the Direct Rendering Manager (DRM) gives access to video hardware
    - the "vc4" DRM driver does modesetting/2D for both VideoCore IV and VI (despite the name), and 3D for the VideoCore IV
    - the "v3d" DRM driver does 3D for the VideoCore VI (RPi 4) only
    - Kernel Mode Setting (KMS), part of DRM, manages video modes and output-time compositing ("overlay planes")
  - Video4Linux 2 (V4L2) manages video capture and also hardware video compression/decompression
  - DMA-buf shares memory between subsystems, e.g. between V4L2 and DRM ("DRM PRIME")
- Linux user frameworks and tools:
  - X windows and its whole ecosystem, including:
    - Direct Rendering Infrastructure (DRI), allows clients to share hardware for direct rendering (using DRM, OpenGL, etc)
    - GL Extension to X (GLX), allows clients to send GL commands ("indirect rendering") or access hardware ("direct rendering")
  - Mesa, an OpenGL implementation that can use a variety of hardware and software backends
    - the "vc4" Mesa driver uses the vc4 DRM driver to support OpenGL 2.1 on the VideoCore IV (RPi <= 3)
    - the "v3d" Mesa driver uses the v3d DRM driver (and also vc4 for modesetting) to support OpenGL 2.1 and OpenGL ES 3.1 on the VideoCore VI (RPi 4)
  - Video Acceleration API (VA-API), a standard interface to hardware accelerated video compression/decompression
  - FFmpeg, a set of tools and software libraries for audio/video processing
  - GStreamer, a system for audio/video pipelines using hardware and software building blocks
  - VLC (and libvlc), a popular video player
  - Kodi, popular home theater software with Raspberry Pi support
  - omxplayer, a Pi-specific video player that uses OMX
  - pngview, a Pi-specific image viewer that uses DispmanX
  - (tons of other libraries and tools and programs, obviously...)
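
To see which of the kernel pieces listed above are actually exposed on a given system, you can look for their device nodes. Here's a rough Python sketch, assuming the usual Linux conventions that DRM devices show up as /dev/dri/card* and /dev/dri/renderD* and V4L2 devices as /dev/video*. The function name `list_video_nodes` is made up for illustration, and the sketch only lists nodes; it does not tell you which driver (vc4, v3d, a camera, a codec) sits behind each one.

```python
from pathlib import Path


def _names(directory, pattern):
    """Sorted names of entries in `directory` matching `pattern` (empty if missing)."""
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob(pattern))


def list_video_nodes(dev_root="/dev"):
    """Return the DRM and V4L2 device node names found under dev_root.

    Assumption: standard Linux device naming. dev_root is a parameter only so
    the function can be pointed at a scratch directory for testing.
    """
    root = Path(dev_root)
    return {
        # DRM primary nodes (modesetting, KMS overlay planes)
        "drm_cards": _names(root / "dri", "card[0-9]*"),
        # DRM render nodes (rendering without modesetting rights)
        "drm_render": _names(root / "dri", "renderD[0-9]*"),
        # V4L2 nodes (capture devices and hardware codecs)
        "v4l2": _names(root, "video[0-9]*"),
    }


if __name__ == "__main__":
    for kind, nodes in list_video_nodes().items():
        print(f"{kind}: {', '.join(nodes) or '(none)'}")
```

If `drm_cards` comes back empty, DRM/KMS is probably not active, which would be consistent with running without a KMS overlay.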
Raspberry Pi OS offers several major ways to put these pieces together, with various pros and cons:
- "Legacy non-GL"
  - This mode is supported but deprecated.
  - In this mode, DRM and KMS are not available.
  - V4L2 provides hardware video compression/decompression using MMAL internally.
  - The kernel knows little about video hardware except how to make a simple (non accelerated) framebuffer and render console text.
  - The X server uses DispmanX to switch modes (??), and renders directly to the framebuffer with no hardware acceleration (but using the ARM-optimized "fbturbo" renderer??). (Does anyone use the "rpi" driver for accelerated 2D?)
  - Despite the "non-GL" name, OpenGL is available (except on the Pi 4), via Mesa on top of Broadcom's EGL library.
  - DispmanX, MMAL, and OMX are available. Programs like omxplayer work well, and can layer on top of X windows.
  - VLC uses MMAL for hardware decompression, and runs reasonably well.
  - GStreamer can use V4L2, MMAL, or OMX for compression/decompression/display.
  - Chrome uses MMAL for video playback (YouTube, etc) and runs reasonably well.
- GL with "Fake" (or "Firmware") KMS (FKMS) ("dtoverlay=vc4-fkms-v3d" in config.txt)
  - This is an intermediate legacy mode, the default on Buster.
  - In this mode, DRM and KMS are available, using DispmanX for mode setting and framebuffer management (hence "fake", though it's still in the kernel).
  - V4L2 wraps MMAL as above.
  - The X server renders to DRM and uses KMS to switch modes, using "fbturbo" as above.
  - OpenGL is available via the "vc4" (VideoCore IV / Pi <=3) and "v3d" (VideoCore VI / Pi 4) Mesa drivers, which use VideoCore resources via DRM without proprietary VPU firmware.
  - KMS/DRM can be used for hardware compositing (overlay planes), but only by one display client at a time.
  - DispmanX, MMAL, and OMX are available. Programs like omxplayer work (with some caveats?).
  - VLC uses MMAL for hardware decompression, and runs reasonably well.
  - GStreamer can use V4L2, MMAL, OMX, or DRM for compression/decompression/display.
  - Chrome can use MMAL for video playback but it needs to be enabled.
- GL with Full KMS ("dtoverlay=vc4-kms-v3d" in config.txt)
  - This is the recommended setting and the default for Bullseye.
  - In this mode, DRM and KMS directly manage the GPU without relying on proprietary VideoCore VPU firmware.
  - V4L2 still wraps MMAL as above.
  - The X server renders to DRM and uses KMS to switch modes, using "fbturbo" as above.
  - OpenGL is available via the "vc4" and "v3d" Mesa drivers as above.
  - KMS/DRM can be used for hardware compositing (overlay planes), but only by one display client at a time.
  - DispmanX and OMX are NOT available, as the kernel has taken over the GPU. Programs like omxplayer do not run.
  - MMAL is still available (?? somehow ??).
  - VLC uses MMAL for hardware decompression, and runs reasonably well.
  - Chrome can use MMAL for video playback but it needs to be enabled.
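
As a rough summary of the three modes above, here's a Python sketch that guesses the mode from the text of config.txt and looks up which interfaces the list says are available in it. The two overlay names are quoted from the list above; the names `MODES`, `OVERLAYS`, and `detect_mode` are made up for illustration. The parsing is deliberately naive: it assumes that no matching dtoverlay line means legacy mode, treats anything after `#` as a comment, and ignores `[pi4]`-style section filters and any overlay name variants, so take it as a sketch under those assumptions, not a reliable detector.

```python
# Interface availability per mode, transcribed from the list above.
# MMAL under Full KMS is marked True because the list says so, though the
# post itself is unsure how that works.
MODES = {
    "legacy": {"drm_kms": False, "dispmanx": True, "omx": True, "mmal": True},
    "fkms": {"drm_kms": True, "dispmanx": True, "omx": True, "mmal": True},
    "kms": {"drm_kms": True, "dispmanx": False, "omx": False, "mmal": True},
}

# Overlay names as they appear in config.txt.
OVERLAYS = {"vc4-fkms-v3d": "fkms", "vc4-kms-v3d": "kms"}


def detect_mode(config_text):
    """Guess the display mode from config.txt contents (naive sketch)."""
    mode = "legacy"  # assumption: no KMS overlay line means legacy mode
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line.startswith("dtoverlay="):
            continue
        # A line looks like: dtoverlay=name,param=value,...
        name = line[len("dtoverlay="):].split(",", 1)[0].strip()
        if name in OVERLAYS:
            mode = OVERLAYS[name]  # the last matching line wins
    return mode


if __name__ == "__main__":
    with open("/boot/config.txt") as f:  # usual location; adjust if needed
        mode = detect_mode(f.read())
    print(mode, MODES[mode])
```

For example, a program that depends on OMX could check `MODES[detect_mode(text)]["omx"]` and warn the user up front instead of failing at startup.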
- Prefer "GL with Full KMS". Fall back to "Fake KMS" or "Legacy non-GL" as needed to solve compatibility problems.
- For development, target general APIs (OpenGL, X11, DRI/DRM/KMS, V4L2) or libraries that use them (gstreamer, ffmpeg). Run under Full KMS, which is the most future-proof and open source interface.
- If you need Broadcom proprietary APIs, prefer MMAL to OMX. They're both deprecated and you're unlikely to use either one anywhere else, so you might as well use the most "native" library which is MMAL.
- For 3D, use OpenGL and prefer Mesa's vc4/v3d drivers (not the Broadcom EGL layer).
- For desktop video playback, prefer VLC (and report bugs) rather than omxplayer or other players; it's likely to get the most Pi-specific attention going forward.
- To use the hardware super efficiently (zero copy decoding/display/compositing), you may need to target DRM/KMS directly; see discussion below. Maybe people will make tools to do this. Desktop software tends to use OpenGL for compositing and pixel-pushing which works but isn't ideal for performance or power.
- Hardware H.264 de/compression up to 1080p (not 4K) is supported through MMAL, OMX, and "stateful" V4L2 APIs.
- Hardware H.265 (aka HEVC) decompression up to 4K/30Hz is supported through "stateless" V4L2 APIs only. (No legacy proprietary MMAL/OMX!) As of Nov 2021 this decoder needs to be enabled in Raspberry Pi OS (dtoverlay=rpivid-v4l2), but is on by default in the LibreELEC OS for the Kodi media player.
- Kernel interfaces and especially the stateless V4L2 interfaces are not for the faint of heart. Use a friendly framework like gstreamer or ffmpeg (RPi branch) if at all possible.
- See hello_drmprime for example code to decode H.264/H.265 (via ffmpeg using V4L2) with zero-copy display via DRM/KMS.
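
The codec notes above boil down to a small lookup, sketched here in Python. The function name `hw_decode_apis` is hypothetical, and the pixel limits are assumptions: "1080p" is read as 1920x1080 and "4K" as 3840x2160. The sketch ignores frame rate, Pi model, and whether the H.265 decoder overlay has been enabled, so it only says which API families the notes mention, not whether decoding will actually work.

```python
def hw_decode_apis(codec, width, height):
    """Return the API families the notes above list for hardware decoding.

    An empty list means the notes give no hardware path for that
    codec/resolution combination.
    """
    codec = codec.lower()
    if codec in ("h264", "h.264", "avc"):
        # H.264: up to 1080p via MMAL, OMX, or stateful V4L2
        if width <= 1920 and height <= 1080:
            return ["MMAL", "OMX", "V4L2 (stateful)"]
        return []
    if codec in ("h265", "h.265", "hevc"):
        # H.265/HEVC: up to 4K via stateless V4L2 only (no MMAL/OMX)
        if width <= 3840 and height <= 2160:
            return ["V4L2 (stateless)"]
        return []
    return []


if __name__ == "__main__":
    for args in (("h264", 1920, 1080), ("h264", 3840, 2160), ("hevc", 3840, 2160)):
        print(args, "->", hw_decode_apis(*args) or "software only")
```

In practice a framework like gstreamer or ffmpeg (RPi branch) makes this choice for you; the lookup is only meant to make the constraints easier to see at a glance.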
- Blog post: "VC4 and V3D OpenGL drivers for Raspberry Pi: an update" (Oct 2019)
- Forum thread: "Full KMS vs Fake KMS vs Legacy driver" (Oct 2019)
- Forum thread: "Questions about FKMS" (Jun 2019)
- Forum thread: "Questions on GL driver and omxplayer" (Dec 2018)
- Forum thread: "Video acceleration on the Raspberry Pi 4" (Mar 2020)
- Forum thread: "GPU theory" (Jun 2020)
- Linux kernel docs for VC4 (VideoCore IV / Pi <= 3) and V3D (VideoCore VI / Pi 4) GPU drivers
- VideoCore IV 3D Architecture Reference Guide (2013)
- Reverse engineered VideoCore IV documentation (2015)
- The RPI Open Firmware project, especially the hardware docs, such as the output pipeline summary (Oct 2020)
- Book: Raspberry Pi GPU Audio Video Programming (2015, but comprehensive at the time)
- Wiki page: "Raspberry Pi VideoCore APIs" (last updated 2015)