> Removing PulseAudio..continuing the shift to PipeWire
My #GoPiGo3 robot just shuddered in fear of becoming deaf and mute.
Working with TTS via espeak in Ubuntu has reminded me once again how, like, bad open source can be.
Believe it or not, I don't mean that as a negative. More just an observation on the ecosystem. Defaults build up, first-movers have huge advantage, and we more often than not end up in a place where the easiest solution is the worst solution.
The default espeak voice (at least what you get if you install espeak or espeak-ng-espeak from the package repo) is ear-bleedingly bad. It reminds me of state-of-the-art ca. late-90s / early-00s, except I don't think Apple ever put out anything that sounded quite so ass. No shame on the people who built the model originally: text-to-speech is a hard problem and if not for their work, there'd be nothing. But it's 2025 and there are much better speech-model options now (though I haven't yet found anything nearly as crisp or clean as whatever Google's special sauce is). Of course, we can't just switch to those out of the box because muh defaults.
So many processes in the Linux ecosystem are like this. "Here's the command to get started. And here's the six commands to make it not suck."
Apple and Microsoft, in competition with each other, are incentivized to either not ship experiences that are inferior to the alternative or to deprecate them when they fall behind the alternative. This sucks when your thing is the thing being deprecated, but it does mean that when I do the exact same experiment on Windows (install espeak and fire some text into it without selecting a voice), the default doesn't try to claw your eyeballs out through your ear canals. There's something to that, you know?
(Credit where it's due: projects like Ubuntu improved the status quo by being an alternate platform so they weren't beholden to other distros' defaults. They went and built their own distro with ~~blackjack and hookers~~ Wayland and systemd and it worked.)
All of that having been said: the thing I'm trying to do right now I wouldn't try on Windows, precisely because when I hit a wall there I won't be able to find an alternative. On Linux I will, even if it takes me six tries to find it. There's power in that flexibility, even when it makes some of the labor mandatory.
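For what it's worth, the "fire some text into it without selecting a voice" experiment above has an obvious counterpart: pick the voice explicitly. Here's a minimal sketch of that in Python, assuming `espeak-ng` is on the PATH and that the `en-us` voice is installed. The helper names (`build_espeak_command`, `speak`) and the default speed of 150 words per minute are my own choices for illustration, not anything the packages prescribe. `-v` selects the voice and `-s` sets the speed.

```python
import shutil
import subprocess


def build_espeak_command(text, voice="en-us", speed_wpm=150, binary="espeak-ng"):
    """Build the argument list for an espeak-ng call with an explicit voice.

    The voice name and speed are assumptions for illustration; list the
    voices actually installed with `espeak-ng --voices`.
    """
    return [binary, "-v", voice, "-s", str(speed_wpm), text]


def speak(text, **kwargs):
    """Speak `text` if the binary is available; return False if it is not."""
    cmd = build_espeak_command(text, **kwargs)
    if shutil.which(cmd[0]) is None:
        # Binary not installed: report it rather than raising.
        return False
    subprocess.run(cmd, check=True)
    return True
```

Calling `speak("Hello from the robot", voice="en-us")` pins the dialect instead of taking whatever the default happens to be. It won't make the voice itself any less robotic, but it does take one variable out of the experiment.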
Working on #adacompliance by trying to integrate a #texttospeech engine into #AOSP.
#espeak has been deemed too low quality, and #sherpaTTS too slow. #RHVoice silently fails on our hardware, and I'm running out of options. Does @GrapheneOS have a TTS engine that would be commercially friendly enough for us to push it via OTA update? Or would anyone recommend one?
We work with a sensitive population, and the lack of a screen reader can make things even more difficult for some.
You don't like the artificial sound of #espeak? Maybe you'll like this better?
https://f-droid.org/de/packages/org.woheller69.ttsengine/index.html
https://github.com/woheller69/ttsEngine
#sherpatts #tts #opensource #foss #speechsynthesis #ai #android
So, I finally hit the jackpot! Using the Tech Freedom TTS engine, set to use the eSpeak TTS engine, I have absolutely no lag where the TTS engine is concerned. Of course, TalkBack can still lag when swiping or scrolling, but I'm surprisingly getting used to that. The issue I have with using eSpeak by itself is that sometimes it'll switch dialects from US to UK English, and that is rather jarring to me. Also, I tried this other TTS engine on F-Droid, which uses Piper voices, and it's far more responsive than the other one. That TTS engine is called SherpaTTS.