TGSpeechBox v3.0-beta3 is out! 19 bug fixes, 8 new features, 5 language pack improvements.
The big ones: stop consonants now use research-based burst spectral templates from Stevens & Blumstein. Alveolar, velar, and labial stops each get their own shape, so /d/ vs /g/ and /t/ vs /k/ are clearly distinct. Stop clusters in words like "locked" and "kept" now leave the first stop unreleased, the way natural speech does.
MOUTH diphthong onset was only 30 Hz from schwa — "outside" sounded like "ertside." Fixed with Hillenbrand GenAm data. Per-diphthong duration scaling replaces the old global knob, so PRICE gets the time it needs without bloating GOAT. Diphthong rate compensation keeps bare "I" and "Y" from losing identity at high speech rates.
New Fujisaki clause-type overrides let language pack authors tune question/exclamation intonation in YAML. Spanish gets proper Castilian vs Latin American approximant splits. Australian English recovers its hand-tuned vowels.
Clause-final sonorants no longer clip. Cascade resonator pops, gone. Tap timing, fixed three ways.
And yes — we know en-gb PRICE still sounds a bit Stewie Griffin. The glide doesn't curve down the way it should yet. We hear you, it's on the workbench.
https://github.com/tgeczy/TGSpeechBox/releases/download/v-300b3/TGSpeechBox-v300b3.nvda-addon
https://github.com/tgeczy/TGSpeechBox/releases/download/v-300b3/TGSBPhonemeEditor-v300b3.zip
https://github.com/tgeczy/TGSpeechBox/releases/download/v-300b3/TGSpeechSapiSetup-v300b3.exe
https://github.com/tgeczy/TGSpeechBox/releases/download/v-300b3/TGSpeechBox-v300b4.apk
https://github.com/tgeczy/TGSpeechBox/releases/download/v-300b3/tgspeechbox-linux-aarch64-v-300b3.tar.gz
https://github.com/tgeczy/TGSpeechBox/releases/download/v-300b3/tgspeechbox-linux-x86_64-v-300b3.tar.gz
https://testflight.apple.com/join/jvvGY6Fz