So there's this TGSpeechbox thing, which is developed by someone from Hungary, or at least, to me as a Hungarian, the name sounds like one of us. I just don't really get why it doesn't sound right in Hungarian at all when it's being developed by a native speaker. Does anyone know how to tweak this thing so it sounds right? I studied linguistics for a long time, but it was mainly Russian linguistics and not at an expert level, so all the alveolar and other terms sound like Chinese to me 😂
@destranis You want to talk to @Tamasg. There is a phoneme editor, so making improvements is very much encouraged. The reason it sounds the way it does is that it started off based on Speech Player, a synth NV Access was working on for a while as an eSpeak alternative, and Speech Player in turn was based on the research of Dennis Klatt, the man who made DECtalk, which was more optimised for English. Now every language has to be tweaked into shape bit by bit.
@pitermach @Tamasg Oh, that sounds fine, got you. I really would like to help, but since I have no idea which tweak does what, I would just slow the project down 😊
@destranis @pitermach Oh, no worries at all about tuning the settings. It can be overwhelming looking at that list without much context on how they will impact the language in other ways. Many people have helped just by filing issues with very good word examples and their expectations of what it should sound like. It's really how Spanish has come together, and once we get more Polish tweaks, creating language-specific phonemes gets easier. The hardest part is getting all the data together for normalization rules. But even knowing "this word swallows the R" or "when I hear it, it sounds like he's saying the vowel with his mouth too open" is still useful. Then I can compare eSpeak's current pronunciation and work from there with the IPA of the word itself.
@Tamasg @pitermach Are there any ready-made tweaks that make Hungarian sound a bit more, ehm, Hungarian? Or what should we start with? It sounds really promising, and to be honest I've even asked AI to give me some tips and suggestions, but I don't think it got everything right.
@destranis @pitermach Hungarian has improved bit by bit. I speak it natively, so to me it is one of the easier ones to tune :D Geminates were the trickiest part to get right. Thankfully, much of what I improve in Hungarian can be cross-applied to Finnish. Polish, not so much: different language family entirely.
@Tamasg @pitermach If you need Hungarian testers, I'll be glad to help, I'm also a native speaker 😊
@destranis @pitermach oooh wait really! Ok that's super cool - we need more members for all the languages either way, Polish and Hungarian both. For Polish there are maybe 2-3 people who have given feedback or tuned phonemes. Hungarian? None. I don't even think there's much awareness of it over there yet. I got fed up with Profivox being the only option, along with the horrible Nuance Mariska, so my goal with Speechbox was also to give international blind communities a new engine if they are lacking one or there are no free, modern alternatives. All the TTS in Hungary got commercialized too much, and someone using NVDA either buys that Nuance voice or sticks to eSpeak.
@Tamasg @pitermach Exactly. I fully agree. A small team and I created some Hungarian voices for RHVoice: Katalin, Imre and Anna. Imre and Anna were created using the Piper TTS database, while Katalin uses a free database from Kaggle. So people can use these voices as well, both for NVDA and Android, but of course they also have some errors that we haven't been able to fix yet.
@destranis @pitermach oooh RHVoice! Yes! The new one. I actually like it quite a lot! Not as good at fast rates, but that team is doing God's work tuning languages. And while the synthesis methods are different between them, maybe there's even some cross-sharing of lexicon rules we could do. Thanks for pointing to it. I keep forgetting RHVoice exists, but I promise it's only because it recently got a huge revitalization push, so it feels like a newer one. It has existed for years, well over a decade, but it wasn't until 2023-2024 that I really heard rumblings about more languages being added. Glad that neural TTS is able to help the project like that!
@Tamasg @destranis @pitermach I'd love RHVoice to have an inflection parameter. The Portuguese voice is great, but it lacks such a feature to make it less monotonous.
@clv1 @Tamasg @pitermach That would be awesome though! Of course, right now intonation depends on the database you train on, so for example our Katalin has a rather hilarious intonation, while Anna sounds very unamused and robotic, but it would be cool to have a slider for it.