Here's a demonstration of Richard, one of the two new Blastbay formant-based neural text-to-speech voices. In this demo, I've deliberately installed the trial version on one of my computers so you get to hear the degraded audio once the one-minute trial period ends. Transcript in alt text.
@jaybird110127 Why do we need these? Legit, honest question. Why is he selling these, there aren't enough voices out there? What's the benefit of this?
@remixman @jaybird110127 Because, as much as some refuse to admit it, Eloquence has been living on borrowed time for years, and it's only a matter of time before it goes completely belly-up. How long can it keep getting patched, repatched, etc. before it just doesn't work? Dectalk has had a hell of a revival, but it's about as much of a legal gray area as a fanfic. Besides, if we want any hope of mainstream accessibility in embedded devices - think kitchen appliances...
@jackf723 @jaybird110127 Sure. This is nothing like Eloquence, though, sadly.
@remixman @jackf723 See the original Audiogames thread. It's actually a lot like it in design, as I understand it, but thanks to modern technology, it sounds much more human.
@jaybird110127 @remixman It's also the principle of the matter. How long has it been since a speech synthesizer truly built for performance hit the commercial market, much less one developed entirely, end-to-end, by someone with more than two decades of screen-reader use, who has not only R&D but all that lived experience with screen readers to draw on in the build process? You had Dolphin Apollo/Juno/Orpheus as an example of a screen-reader company developing
@jaybird110127 @remixman their own end-to-end synthesis, and that was about it. Speakout/SoundingBoard worked off of existing end-to-end infrastructure. We almost had speechplayer, but if that used eSpeak's phonemizer, it wouldn't be embeddable in commercial devices without open-sourcing the firmware. With this newly released synth, we have, possibly for the first time, a speech synthesizer built for us, by us, that truly has a fair shot at mainstream acceptance.
@remixman @jaybird110127 UEFI, etc., then there needs to be a modern, licensable, significantly lighter-weight TTS that has the benefit of being formant-based while bearing the good parts of neural synthesis, so that you have a synth that's at Eloquence/Dectalk levels of responsiveness for screen readers and embedded devices, yet at the same time quite listenable for a general audience.
@jackf723 @remixman @jaybird110127 And just wait until NVDA goes all 64-bit after they end Windows 10 support. Unless somebody really goes the extra mile for it, IBM TTS will no longer be an option.
@sclower @remixman @jaybird110127 And even given someone with enough know-how to develop a 32-to-64-bit bridge, you'd need access to at least some of the Eloquence codebase to make it run well, and even then no one, not even at Cerence, has the inclination or will to truly refactor it. And if Assistive Technology ever were a Price is Right/Sale-of-the-Century category, my money is on the guess that the licensing costs Apple paid to Cerence would be in the tens of thousands. Lol.