For those of you interested in our recent video offloading / zero-copy playback work: I quickly put together some #livi #flatpak builds to make testing easy already. Compositor offloading should work on all semi-recent Intel/AMD devices and a variety of ARM64 devices.

If you trust the sandbox you can get them here:
https://cloud.silentundo.org/s/r8733siTjP4yRJp

I expect quite a few people to hit driver bugs, so please help track those down :)

#LinuxMobile #gtk #GStreamer #Wayland #GNOME

Things should generally work on #gnome45, #kde6, #sway and #weston - not sure about other compositors.

I haven't tried #kde myself yet as NV12 support was only added recently. But tomorrow there's a local #kde release party where I hope to convince some people to try it on their devices.

Note that offloading currently requires HW decoding - something I hope we can change next cycle. #livi will display a little icon in the top-left if hardware decoding and thus offloading is not used.

Whenever HW decoding and offloading work, I'd expect the player to be competitive performance-wise with whatever other favorite player you have.

Everything in the Flatpak is either already upstream or close to it, the missing bits being

https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5890

and

https://gitlab.gnome.org/guidog/livi/-/merge_requests/35

I hope to have both shipping in the upcoming spring distro releases.

v4l2codecs: decoders: Add DMA_DRM caps support (!5890) · Merge requests · GStreamer / gstreamer · GitLab

In order to simplify caps negotiations for clients and, notably, be more compatible with va* decoders. Crucially this allows clients to know ahead of time whether buffers will...

For those that are confused by the pictures in the first post because they know about the hardware limitations: yes, correct, both of them actually don't show zero-copy playback :P
One semi-surprising finding of the offloading work was that compositor offloading often pays off even when not hitting a full zero-copy path / hardware plane scanout.

In theory that gap shouldn't exist and we more or less have all APIs in place to let clients scale and pre-rotate content in an optimal way - i.e. so the compositor doesn't have to do a copy and we keep things to one single copy wherever zero-copy is impossible.

In practice there's lots of room for improvement in various stacks. With compositor offload we now have a baseline of the minimal required GPU work (assuming compositors are fully optimized).
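
To get a rough feeling for what a single avoidable copy costs, here is a back-of-the-envelope sketch in Python. It only counts memory traffic: reading one NV12 frame (12 bits per pixel) and writing a 32-bit RGB result at the output size. The 4K output size and the XRGB8888 target format are assumptions for illustration, and stride padding, caches and compression are ignored, so treat the numbers as orders of magnitude only.

```python
def nv12_bytes(width, height):
    """Bytes in one NV12 frame: 8-bit luma plane plus a half-size interleaved chroma plane."""
    return width * height * 3 // 2


def xrgb_bytes(width, height):
    """Bytes in one 32-bit-per-pixel RGB frame (e.g. XRGB8888)."""
    return width * height * 4


def copy_traffic_mb_per_s(src, dst, fps):
    """Memory traffic of one GPU copy pass: read the NV12 source, write an RGB result."""
    return (nv12_bytes(*src) + xrgb_bytes(*dst)) * fps / 1e6


if __name__ == "__main__":
    video = (1920, 1080)  # 1080p30 source, as in the thread
    # Output sizes are hypothetical examples, not measurements.
    for name, target in (("1080p output", (1920, 1080)), ("4K output", (3840, 2160))):
        print(f"{name}: {copy_traffic_mb_per_s(video, target, 30):.0f} MB/s per copy")
```

Every extra pass (client scaling, compositor copy) adds roughly that much traffic again, which is why getting down to one copy, or none, matters most on bandwidth-limited devices.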

For those using HW that still uses the stateful V4L2 API for decoding - such as the #RaspberryPi4 - I uploaded another build to the link in the first post that includes a #GStreamer patch that is *not* close to landing, but works well enough to make playback work.

With that I can play 1080p30 videos (the decoder limit) smoothly on my big screen, which is otherwise not possible (apart from reducing the resolution).

https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6114

#RaspberryPi #Wayland #gtk

Draft: v4l2: videodec: DMA_DRM caps support PoC (!6114) · Merge requests · GStreamer / gstreamer · GitLab

This is a proof-of-concept and not yet ready for production This allows the Raspberry Pi 4 H264 decoder to negotiate caps with the new DMA_DRM...

P.S.: please ignore the glitches and missing elements, those are temporary issues :P

Edit: updated the build with a workaround for some of the glitches.

Edit2: in case you want to follow:
https://gitlab.gnome.org/GNOME/gtk/-/issues/6498
https://gitlab.gnome.org/GNOME/gtk/-/issues/6499

ngl: Missing elements on V3D / RPi4 (#6498) · Issues · GNOME / gtk · GitLab

On my Raspberry Pi 4 / V3D ngl unfortunate fails to render certain elements like close buttons. This happens on Mesa 23.3 and main (and has for...

@rmader That reminds me that this seems like a regression with the switch away from the proprietary Broadcom GL stack and the OpenMAX IL/MMAL-based decoders.

With those (even on the RPi 1/2/3!) it was possible to decode and render 1080p30 smoothly by getting EGLImages from the decoder and rendering them via GLES2.

With the new Mesa/VC4 GL stack and the v4l2-based stateful decoder this is only possible by directly rendering the dmabufs via KMS or Wayland instead of going via anything GL-based.

I didn't investigate this further.

@slomo I have to say that when I reduce the screen resolution to 1080p it would probably work if it weren't for https://gitlab.freedesktop.org/mesa/mesa/-/issues/10306 (direct scanout broken for GL apps).

But for anything above I'd assume things to get problematic without hardware plane upscaling. Maybe it would still manage 30fps, but likely with an extra frame latency on a 60fps screen.
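
The "still 30fps, but one frame late" case can be sketched with a bit of arithmetic. This is a simplified model, assuming the GPU starts working on a frame right at a display refresh and the result is shown at the first refresh after it finishes; the render times are made-up values, not measurements from the Pi.

```python
import math


def analyse(render_ms, video_fps=30, display_hz=60):
    """Return (keeps_up, extra_refreshes) for a given per-frame GPU time.

    keeps_up: the work fits into the time between two video frames.
    extra_refreshes: display refreshes missed before the frame can be shown.
    """
    refresh_ms = 1000.0 / display_hz
    frame_ms = 1000.0 / video_fps
    keeps_up = render_ms <= frame_ms
    extra_refreshes = max(0, math.ceil(render_ms / refresh_ms) - 1)
    return keeps_up, extra_refreshes


if __name__ == "__main__":
    # Hypothetical GPU times per frame in milliseconds.
    for render_ms in (10, 20, 40):
        keeps_up, extra = analyse(render_ms)
        print(f"{render_ms} ms: keeps 30fps={keeps_up}, extra refreshes of latency={extra}")
```

With a 60Hz display one refresh is about 16.7 ms, so anything between that and 33.3 ms per frame still sustains 30fps but lands one refresh late.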

Raspberry Pi4 performance regression with Mesa 23.3. May effect other V3D based Pi devices. (#10306) · Issues · Mesa / mesa · GitLab

System information OS: batocera.linux GPU: V3D Kernel version: 6.1.64-v8 Mesa version: 3.1 Mesa 23.3.1 Xserver...

@rmader Ah that's a good hint. Thanks!
@rmader what is compositor offloading?
Collabora (FLOSS.social):

#FOSDEM: While efficient video playback has long been possible in the embedded #Linux world, desktop applications have been lagging behind. Here's a look at the state of video offloading on the Linux desktop, by Robert Mader: https://www.youtube.com/watch?v=SMCMZwAiw2w&list=PLZjq3una5SrCAdJiHl9FyE6GLpekJ66Mx&index=2 #GStreamer #GTK #Chromium