While I'm finding JWZ's attempts to get XScreenSaver working with Wayland an interesting deep-dive into the complexities and deficiencies of that system, I can't help but think that he's approaching this problem from the wrong end.
So what we want out of this is:
1. Robust screen locking that fails safe - i.e. some random component can't crash and reveal the user's desktop or unlock the screen. If we do have a catastrophic error, we terminate the user's session.
2. An unlock window with a "new session" / "switch user" button that supports accessibility tools, fingerprints, smart cards, face unlock, etc.
3. Pretty things ("hacks") to look at after it locks and under the unlock window
4. Hacks can use a screenshot of the desktop if the user is ok with that
5. Robust idle detection and user presence detection
6. Display power management
7. Looks slick (fading between states at a minimum)
With my minimal understanding of how this should work, the logical solution is to implement locking in the compositor, with plugin executables for the "pretty things" and maybe the unlock window too.
I'm envisaging an architecture like this:
1. Compositor detects idle (and no apps keeping the display on, e.g. videos playing)
2. After a while, the compositor dims the screen (screen brightness or just dim it in software)
3. After a while, the compositor fades the screen to black, and flags it as locked
4. If configured, the compositor runs the plugin that implements the hacks and gives it a specific window handle to render to, placed on top of the black screen, and optionally a portal or similar mechanism that lets the hack grab a screenshot of the desktop as it was before locking
5. After a while, the compositor turns off the display, takes a screenshot of the hack and terminates / pauses the hack
(6. After a while, the compositor asks the system to suspend, hibernate, etc.)
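The idle timeline above could be sketched as a small state machine. This is only an illustration of the ordering, not how any real compositor does it: the `Stage` names, the `LockTimeline` class and the timeout values are all made up for the example, and `inhibited` stands in for "an app is keeping the display on".

```python
from enum import IntEnum


class Stage(IntEnum):
    ACTIVE = 0
    DIMMED = 1
    LOCKED = 2        # faded to black; a hack may be drawing on top
    DISPLAY_OFF = 3
    SUSPENDED = 4


# Hypothetical idle thresholds in seconds; real values would come from user config.
THRESHOLDS = [
    (Stage.DIMMED, 300),
    (Stage.LOCKED, 330),
    (Stage.DISPLAY_OFF, 900),
    (Stage.SUSPENDED, 1800),
]


class LockTimeline:
    """Tracks which stage the compositor should be in for a given idle time."""

    def __init__(self):
        self.stage = Stage.ACTIVE
        self.locked = False

    def on_idle(self, idle_seconds, inhibited=False):
        # An inhibitor (e.g. video playback) only matters before we lock.
        if inhibited and not self.locked:
            self.stage = Stage.ACTIVE
            return self.stage

        # Pick the deepest stage whose threshold has been reached.
        target = Stage.ACTIVE
        for stage, threshold in THRESHOLDS:
            if idle_seconds >= threshold:
                target = stage

        if target >= Stage.LOCKED:
            self.locked = True
        # Once locked, input alone never takes us back to the desktop.
        if self.locked and target < Stage.LOCKED:
            target = Stage.LOCKED

        self.stage = target
        return self.stage
```

The important property is that `locked` is sticky: user input after locking only brings the timeline back to `LOCKED` (where the unlock window would appear), never to `ACTIVE`. Clearing the flag after a successful authentication is deliberately left out of this sketch.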
Then when the user hits a key / moves the mouse / whatever:
1. The display comes to life
2. The unlock window plugin is run with a specific window handle that renders over (a screenshot of) the hack. This is given keyboard focus and can be interacted with using the mouse
3. This unlock window plugin sends user input to the compositor or produces some unforgeable or out-of-band signal that the user has authenticated themselves
4. The compositor fades back to the user's desktop
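One way the "unforgeable signal" in step 3 might work is a one-time challenge: the compositor generates a fresh key and nonce each time it launches the unlock plugin, hands them only to that child process, and accepts a single HMAC proof in return. The sketch below assumes that design; `UnlockChallenge` and `plugin_proof` are hypothetical names, and the actual authentication (PAM, fingerprint, etc.) is reduced to a boolean.

```python
import hashlib
import hmac
import secrets


class UnlockChallenge:
    """Compositor side: one challenge per launch of the unlock plugin."""

    def __init__(self):
        # Assumed to be passed to the plugin over a private channel
        # (e.g. an inherited pipe), never exposed to other clients.
        self.key = secrets.token_bytes(32)
        self.nonce = secrets.token_bytes(16)
        self.used = False

    def verify(self, proof):
        # Each challenge can be answered once, so a captured proof
        # cannot be replayed later.
        if self.used or proof is None:
            return False
        self.used = True
        expected = hmac.new(self.key, self.nonce, hashlib.sha256).digest()
        return hmac.compare_digest(expected, proof)


def plugin_proof(key, nonce, authenticated):
    """Plugin side: only produce a proof once the user has authenticated."""
    if not authenticated:
        return None
    return hmac.new(key, nonce, hashlib.sha256).digest()


challenge = UnlockChallenge()
proof = plugin_proof(challenge.key, challenge.nonce, authenticated=True)
unlocked = challenge.verify(proof)
replayed = challenge.verify(proof)
```

Here `unlocked` is `True` and `replayed` is `False`. If the plugin crashes or never answers, the compositor simply stays locked and can launch a new plugin with a new challenge.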
I feel this architecture has advantages over the traditional "single-ish user app" model that was used with X11 in that:
- The thing that controls whether the session is locked or not is also the parent process of the session / renderer of it - if it crashes then the user's session is terminated, so crashing bugs cannot accidentally unlock the user's session. Also, if the hack crashes, the user's desktop isn't revealed.
- Hacks are isolated in their own process away from the locking, so hacks crashing is a non-event (either we fade to black or restart it)
- The unlock window can also be in its own process away from the locking, so if there's some insane input-related crash, we don't unlock. Also, if it needs an IME or something, that can run in the same (isolated) space.
- All the hardware actions are being performed by the app that's managing the hardware
- All input snooping is done by a process that's already managing input
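To illustrate the "hacks crashing is a non-event" point, here is a rough sketch of how a compositor might supervise a hack process. The function names, the restart limit and the returned strings are invented for the example; the point is only that nothing in this loop ever touches the lock state.

```python
import subprocess


def spawn_and_wait(argv):
    """Run the hack as a child process and return its exit code."""
    return subprocess.run(list(argv)).returncode


def supervise_hack(argv, max_restarts=2, launch=spawn_and_wait):
    """Restart a crashing hack a few times, then fall back to a black screen.

    `launch` is injectable so the policy can be exercised without
    spawning real processes.
    """
    crashes = 0
    while True:
        code = launch(argv)
        if code == 0:
            return "hack exited cleanly"
        # Non-zero covers both error exits and death by signal
        # (subprocess reports the latter as a negative code on POSIX).
        crashes += 1
        if crashes > max_restarts:
            return "black screen"
```

Whichever branch is taken, the session stays locked: the worst outcome of a buggy hack is that the user looks at a black screen instead of something pretty.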
I don't think the whole "app" model that worked on X11 (a display server / renderer that didn't really have any support for this sort of thing) works in a "less trusting" environment like Wayland. Splitting the whole thing up into components managed by the compositor makes much more sense.