Mastodawn

TGSpeechBox 3.0 is out. After seven betas, two release candidates, and more commits than I care to count — I'm calling it done.
I've been building this synthesizer for 5 months now, and 3.0 is the first release where I feel like it's genuinely a different piece of software from what came before. If you put 2.99 and 3.0 side by side, the difference isn't subtle. The vowels, the diphthongs, the stops, the prosody, the way connected speech actually flows: it's night and day. The Fujisaki pitch model no longer being a mechanical bull, the diphthong collapse system, the dictionary system fully done, the prominence and multiple-pitch pass pipeline: all of it came together in this cycle.
Every platform got real work. Linux is a first-class citizen now! A native Speech Dispatcher module, proper installer, PipeWire and ALSA auto-detection. No more pipes, no more shimmer. Android is on Google Play. iOS and macOS are on the App Store. Windows SAPI has a full settings UI. The pronunciation dictionary system ships on all of them.
None of this happened alone. This release belongs to the testers, the issue reporters, the dictionary contributors, and everyone who sent feedback across the betas. You shaped it.
3.0 is a milestone. I hope you hear it.
https://apps.apple.com/us/app/tgspeechbox/id6759512621
https://play.google.com/store/apps/details?id=com.tgspeechbox.tts
https://github.com/tgeczy/TGSpeechBox/releases/download/v-300/TGSpeechBox-300.nvda-addon
https://github.com/tgeczy/TGSpeechBox/releases/download/v-300/TGSBPhonemeEditor-v300.zip
https://github.com/tgeczy/TGSpeechBox/releases/download/v-300/TGSpeechSapiSetup-v300.exe
https://github.com/tgeczy/TGSpeechBox/releases/download/v-300/TGSpeechBox-v300.apk
https://github.com/tgeczy/TGSpeechBox/releases/download/v-300/tgspeechbox-linux-x86_64-v-300.tar.gz
https://github.com/tgeczy/TGSpeechBox/releases/download/v-300/tgspeechbox-linux-aarch64-v-300.tar.gz
https://play.google.com/apps/testing/com.tgspeechbox.tts
https://testflight.apple.com/join/Y8RBtGBY

TGSpeechBox App - App Store

Download TGSpeechBox by Tamas Geczy on the App Store. See screenshots, ratings and reviews, user tips, and more apps like TGSpeechBox.

App Store


Pratik Patel 2d ago

@Tamasg Congratulations. This is major.


Tamas G 2d ago

@ppatel ha thanks! I did say a week for the RC and sort of feature freeze stage, and was able to keep that promise!
Then I saw this 72-hour window where no new issues dripped in, which is generally a good sign people are happy with the place it is at. Once the big issue 72 Linux stuff got sorted out and I realized I needed our own proper speech-dispatcher module, that was it.
I also now have local testbeds for ARM64 and x86 Linux distros, so doing any kind of debugging on all of them is easy! Huge win! Proper workbench for engineering. Android on Windows, Linux on Pi and 2 VMs, M4 Mac for Xcode. I plan to really look into learning more test-driven development because I practice it a lot at work and there are a lot of language-specific edge cases testing would help me catch, so getting better and doing more TDD post-3.0 is on the books for sure.


Pratik Patel 2d ago

@Tamasg I've gotten to a point where I do nothing but TDD. If you're starting off with it, I suggest you tell the AI to do red-green TDD. I've developed a skill that reinforces it. Just adding /tdd gets me a long way. It's not going to catch everything, especially things that rely on external testing. But adding my own refactoring and elegant code skills catches most bugs and makes my own job much easier when reading code.


Tamas G

@ppatel oooh I'm super glad you're a proponent of it! At work one of my Android colleagues was very into it and he really got me doing a lot of tests, especially in the React component world, and Python tests for things like an API upload script that delivered payloads. Without it I might have uploaded buckets of issues wrong and run needless CI, so for sure it helps catch that early. When it's mission-critical I wouldn't engineer without it, but with C++ and speech synthesis it was all new to me. My goal is to have tests that thoroughly go through each pass and the pipelines with sample IPA and phrases, and do a lot of good runs during CI builds just like I do at my job. Seeing actual tests passing or failing on passes and language sets would be so sweet, and I have the GitHub Workflow + Actions knowledge partially there from work now.
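
(A minimal sketch of what a per-pass test with sample IPA might look like, in plain C++ so it runs without a framework. Everything here is an assumption for illustration: `strip_syllable_dots` is a hypothetical stand-in pass, not TGSpeechBox's real code, and `PassCase`, `sample_cases`, and `run_pass_cases` are invented names. In red-green style, the table of cases would be written first and fail until the pass does the right thing.)

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one pipeline pass; not TGSpeechBox's real code.
// It removes syllable-boundary dots from a UTF-8 IPA string.
std::string strip_syllable_dots(const std::string& ipa) {
    std::string out;
    out.reserve(ipa.size());
    for (char c : ipa) {
        if (c != '.') out.push_back(c);  // '.' is a single byte in UTF-8
    }
    return out;
}

// One row of the test table: a label, the input IPA, the expected output.
struct PassCase {
    std::string name;
    std::string input;
    std::string expected;
};

// Sample IPA cases (illustrative only).
std::vector<PassCase> sample_cases() {
    return {
        {"no dots", "kæt", "kæt"},
        {"one dot", "ˈhɛ.loʊ", "ˈhɛloʊ"},
        {"empty input", "", ""},
    };
}

// Returns the names of failing cases; an empty result means all green.
std::vector<std::string> run_pass_cases(const std::vector<PassCase>& cases) {
    std::vector<std::string> failures;
    for (const auto& c : cases) {
        if (strip_syllable_dots(c.input) != c.expected) {
            failures.push_back(c.name);
        }
    }
    return failures;
}

// Aborts if any sample case is red.
void check_all_green() {
    assert(run_pass_cases(sample_cases()).empty());
}
```

(Calling `check_all_green()` from `main` is enough to try it. With Catch2, each row could instead become a `SECTION` inside a `TEST_CASE`, with `REQUIRE` replacing the comparison, so CI reports which pass and which input went red.)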


Pratik Patel 2d ago

@Tamasg I even have a CI pipeline for one of my projects that does automated integration testing. I managed to run out of 2k CI minutes when I implemented that. Lol. In retrospect, it was not a good idea to let the tests run on their own all the time.


Pratik Patel 2d ago

@Tamasg If you want to go back to some of your code and go through designing tests, pick an area of the code you want to work on and tell the LLM to do a linear exploration while designing red-green tests for it. Do it for the code you don't feel confident about. You'll be amazed at what the AI gives you. It catches its own mistakes a lot more frequently.


Tamas G 2d ago

@ppatel Interesting! See, I know Jest and some Espresso tests for Android. But C++? I guess there's one called Catch2. I'm also looking at GoogleTest, since I know it can do assertion-based tests for C++ too. Funny though about the 2000 minute thing: similarly, I had a work test CI run for 200 minutes before realizing I should probably go stop it and see what was causing it to hang. AI (at least Claude here) has done really well at monitoring workflows in real time, so if something hangs I can nudge it along, even if it doesn't always know the proper signal to realize it's actually hanging. Ha.


Pratik Patel 2d ago

@Tamasg I've used Catch2 for C++ before. It's quite good. I made a mistake when writing my workflow. I didn't catch the mistake because I was sleepy when I did it. Apparently, it ran the integration tests every time I committed. Basically, it was supposed to download a bunch of PDF files from a corpus to test against my library. It was about five to six minutes of run time. Tests were a success. But when I was writing and committing, I didn't check my emails.


Tamas G 2d ago

@ppatel Ha! At least they passed, not failed! That's like the only optimism there.
I'll make the same mental note tonight! There's a 109,000-word stress dictionary and an eSpeak phonemizer. If I accidentally wired that into every-commit CI instead of tag-only, that's 4 hours of GitHub Actions burning through my free minutes. Catch2 for the fast unit tests on every push, heavy integration stuff on PR/tag only. Lesson learned vicariously through you! :D and some of my own work mistakes around that.
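
(One possible way to keep the fast/heavy split inside the test binary itself, sketched under assumptions: the heavy suite only runs when an environment variable is set, and only the tag/PR job would set it. `RUN_HEAVY_TESTS` is a hypothetical variable name and `run_suites` is a placeholder for real test registration. This complements, rather than replaces, trigger rules in the workflow file.)

```cpp
#include <cstdlib>
#include <cstring>

// Decides from a raw environment value whether heavy tests should run.
// Only the exact value "1" enables them; unset (nullptr) or anything
// else leaves them off, so a misconfigured job fails toward "cheap".
bool heavy_enabled_for(const char* value) {
    return value != nullptr && std::strcmp(value, "1") == 0;
}

// RUN_HEAVY_TESTS is a hypothetical name; a tag or PR job would export it.
bool heavy_tests_enabled() {
    return heavy_enabled_for(std::getenv("RUN_HEAVY_TESTS"));
}

// Placeholder runner: returns how many suites ran.
int run_suites(bool heavy) {
    int ran = 0;
    ++ran;             // fast unit tests always run
    if (heavy) ++ran;  // e.g. full dictionary and phonemizer integration tests
    return ran;
}
```

(A push build would call `run_suites(heavy_tests_enabled())` with the variable unset and run only the fast suite; the tag job exports `RUN_HEAVY_TESTS=1` to get both.)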


Pratik Patel 2d ago

@Tamasg Happy to serve as a guinea pig. The full suite of tests doesn't need to be run by GitHub. I'm running all of my tests on my local machine as it is. I've learned to assign background agents to run particular tests if I think I'll need them. In my case, Claude will run the full suite automatically with a major push. With commits, it will run tests for the session. You can obviously change this behavior.


Pratik Patel 2d ago

@Tamasg Oh and fuzz testing could be a good friend when hunting down weird input bugs. Here's a good but simple explanation if you don't know the technique.

https://about.gitlab.com/topics/devsecops/what-is-fuzz-testing/

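
(A minimal sketch of the fuzzing idea for text input, assuming a hypothetical function under test. `collapse_spaces` is an invented stand-in, not TGSpeechBox code. The loop feeds it random byte strings, including invalid UTF-8, and checks properties that must hold for any input instead of comparing against fixed expected outputs. A coverage-guided fuzzer would explore inputs far more cleverly; this only shows the shape.)

```cpp
#include <cstddef>
#include <random>
#include <string>

// Hypothetical function under test: collapses runs of spaces into one.
std::string collapse_spaces(const std::string& text) {
    std::string out;
    bool prev_space = false;
    for (char c : text) {
        const bool is_space = (c == ' ');
        if (!(is_space && prev_space)) out.push_back(c);
        prev_space = is_space;
    }
    return out;
}

// Builds a random string of length 0..max_len from a small alphabet that
// is biased toward spaces and includes bytes that form invalid UTF-8.
std::string random_input(std::mt19937& rng, std::size_t max_len) {
    static const char alphabet[] = {' ', ' ', '.', 'a', 'b',
                                    '\xCB', '\x88', '\xFF'};
    std::uniform_int_distribution<std::size_t> len_dist(0, max_len);
    std::uniform_int_distribution<std::size_t> pick(0, sizeof(alphabet) - 1);
    const std::size_t len = len_dist(rng);
    std::string s;
    s.reserve(len);
    for (std::size_t i = 0; i < len; ++i) s.push_back(alphabet[pick(rng)]);
    return s;
}

// Properties that must hold for every input, valid or not.
bool invariants_hold(const std::string& in, const std::string& out) {
    if (out.size() > in.size()) return false;                // never grows
    if (out.find("  ") != std::string::npos) return false;   // no double space
    return collapse_spaces(out) == out;                      // idempotent
}

// Runs `iterations` random inputs; returns the number of violations.
// A fixed seed keeps any failure reproducible.
std::size_t fuzz(unsigned seed, std::size_t iterations) {
    std::mt19937 rng(seed);
    std::size_t violations = 0;
    for (std::size_t i = 0; i < iterations; ++i) {
        const std::string in = random_input(rng, 64);
        if (!invariants_hold(in, collapse_spaces(in))) ++violations;
    }
    return violations;
}
```

(Calling `fuzz(12345u, 2000)` should return 0 for this stand-in. For a real pass, the invariants would be whatever the pass promises, such as "never crashes" or "output is still valid UTF-8".)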