TGSpeechBox v3.0-beta3 is out! 19 bug fixes, 8 new features, 5 language pack improvements.
The big ones: stop consonants now use research-based burst spectral templates from Stevens & Blumstein — alveolar, velar, and labial stops each get their own shape, so /d/ vs /g/ and /t/ vs /k/ are clearly distinct. Stop clusters in words like "locked" and "kept" now properly unrelease the first stop, the way natural speech does.
MOUTH diphthong onset was only 30 Hz from schwa — "outside" sounded like "ertside." Fixed with Hillenbrand GenAm data. Per-diphthong duration scaling replaces the old global knob, so PRICE gets the time it needs without bloating GOAT. Diphthong rate compensation keeps bare "I" and "Y" from losing identity at high speech rates.
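For anyone curious what per-diphthong scaling plus rate compensation could look like, here is a minimal Python sketch. It is not the TGSpeechBox implementation: the names `BASE_MS`, `DIPHTHONG_SCALE`, `MIN_GLIDE_MS` and `diphthong_duration_ms`, and all the numbers, are assumptions made up for illustration.

```python
# Hypothetical sketch: names and values are illustrative, not TGSpeechBox's real ones.
BASE_MS = 140.0  # assumed base diphthong duration at rate 1.0

# Assumed per-diphthong multipliers, keyed by lexical-set name.
DIPHTHONG_SCALE = {"PRICE": 1.25, "MOUTH": 1.2, "GOAT": 1.0}

MIN_GLIDE_MS = 90.0  # assumed floor below which the glide is no longer heard


def diphthong_duration_ms(name: str, rate: float) -> float:
    """Scale one diphthong by its own factor, then keep it above a floor."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    # Per-diphthong scaling: PRICE can be longer without touching GOAT.
    scaled = BASE_MS * DIPHTHONG_SCALE.get(name, 1.0) / rate
    # Rate compensation: fast speech never squeezes the glide below the floor.
    return max(scaled, MIN_GLIDE_MS)
```

With these assumed numbers, PRICE gets more time than GOAT at normal rate, and at very high rates both settle on the floor instead of collapsing into a single vowel.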
New Fujisaki clause-type overrides let language pack authors tune question/exclamation intonation in YAML. Spanish gets proper Castilian vs Latin American approximant splits. Australian English recovers its hand-tuned vowels.
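As a rough picture of what a clause-type override changes, here is a small Python sketch of the standard Fujisaki model, where log F0 is a baseline plus a phrase component and an accent component. The `CLAUSE_OVERRIDES` table, its key names, and every number in it are hypothetical; they do not reflect the actual YAML schema or tuning used by the language packs.

```python
import math


def phrase_response(t: float, alpha: float = 2.0) -> float:
    """Fujisaki phrase control: impulse response of a critically damped system."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0


def accent_response(t: float, beta: float = 20.0, gamma: float = 0.9) -> float:
    """Fujisaki accent control: step response, clipped at the ceiling gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


# Hypothetical per-clause-type overrides (illustrative values only).
CLAUSE_OVERRIDES = {
    "statement": {"phrase_amp": 0.5, "final_accent_amp": 0.0},
    "question": {"phrase_amp": 0.4, "final_accent_amp": 0.5},
    "exclamation": {"phrase_amp": 0.7, "final_accent_amp": 0.2},
}


def f0_hz(t: float, clause_type: str, base_hz: float = 110.0,
          clause_len: float = 1.5) -> float:
    """F0 at time t (seconds) for a single clause that starts at t = 0."""
    p = CLAUSE_OVERRIDES[clause_type]
    log_f0 = math.log(base_hz) + p["phrase_amp"] * phrase_response(t)
    # One accent command covering the last 0.3 s of the clause (assumed).
    onset = clause_len - 0.3
    log_f0 += p["final_accent_amp"] * (
        accent_response(t - onset) - accent_response(t - clause_len)
    )
    return math.exp(log_f0)
```

In this sketch the only thing that differs between a statement and a question is the two amplitudes, so a question ends with a raised pitch while a statement keeps declining.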
Clause-final sonorants no longer clip. Cascade resonator pops, gone. Tap timing, fixed three ways.
And yes — we know en-gb PRICE still sounds a bit Stewie Griffin. The glide doesn't curve down the way it should yet. We hear you, it's on the workbench.
https://github.com/tgeczy/TGSpeechBox/releases/download/v-300b3/TGSpeechBox-v300b3.nvda-addon
https://github.com/tgeczy/TGSpeechBox/releases/download/v-300b3/TGSBPhonemeEditor-v300b3.zip
https://github.com/tgeczy/TGSpeechBox/releases/download/v-300b3/TGSpeechSapiSetup-v300b3.exe
https://github.com/tgeczy/TGSpeechBox/releases/download/v-300b3/TGSpeechBox-v300b4.apk
https://github.com/tgeczy/TGSpeechBox/releases/download/v-300b3/tgspeechbox-linux-aarch64-v-300b3.tar.gz
https://github.com/tgeczy/TGSpeechBox/releases/download/v-300b3/tgspeechbox-linux-x86_64-v-300b3.tar.gz
https://testflight.apple.com/join/jvvGY6Fz
@Tamasg Rate 60, pitch mode impulse, US English: read the words "found out" or "thousand".
@tspivey Yeah, that's the glide retuning work. I feel like it's almost there, but a bit too flat. Hmm. I can't quite describe it, though, because in words like "about" it sounds better than in "sound". So it for sure needs more work. You have a sharp ear for noticing, I appreciate that. Ultimately I want it to have that same "aow" glide that Eloquence has.
@Tamasg @tspivey This segment:
"work. I feel like it's almost there, "
The "I" sounds like it's trying to say "uh-I", with an extremely short "uh" sound at the start.
@jackf723 @tspivey Yeah, that also sounds like some sort of diphthong collapse! I'm also noticing it happens more at faster rates. So both of these feel like a diphthong problem: if you listen to that phrase around rate 60-65, that thickening of the vowel doesn't happen. It's almost like durations aren't scaling well enough with rate.