For any fans of retro speech synthesis, a legend has spoken at last - well, at least a legend for me. In the early 2000s, Wirtualna Polska, one of the leading web portals providing free mailboxes, daily news, TV schedules and many other things that made the Internet attractive at that time, released its own speech synthesizer dubbed, very creatively, Syntezator Mowy WP (literally "WP Speech Synthesizer"). Its main goal was to read out messages and contact status changes in their instant messaging app, WP Kontakt, later renamed Spik (pronounced like the English word "speak"). It was an exe running in the system tray, waiting to be sent text. It wasn't compatible with any SAPI version and, other than WP Kontakt, it was only ever used in other instant messaging software, mostly through third-party plugins. It's based on Festival, but the voice database belonged to WP.

Years passed. The instant messaging scene in Poland, once rife with local products, was taken over by international solutions such as Facebook Messenger and WhatsApp, and so development of Spik and the accompanying TTS ceased. The installers for both the male and female voices are hard to come by. The male one is still hosted by several general-purpose download sites, but I pulled it from one of the old copies of the original website in the Internet Archive. The link for the female voice is dead there, and the only lead I got by Googling the original installer's file name is someone's web folder on a Polish file hosting website - sadly, that one has expired too, so unless someone has a local copy, it may very well be lost media.

The original software can be installed but won't speak. Thanks to a friend tinkering a little, it now speaks through all sorts of modern things, and the attached recording is the male voice reading huge numbers - it can actually go up to heptillions.

#RetroTech #Accessibility #Blind #TextToSpeech

silero-tts v5 can now ask questions in Russian

We recently wrote about an update to our public speech synthesis, silero-tts. Last time we substantially improved speed and quality and added support for homographs. This time we want to please you with a special feature that, in most cases, does not work reliably even in synthesis models that need 3-4 orders of magnitude more compute and modern server GPUs (our synthesis runs even on weak CPUs). As you have guessed, this feature is asking questions. I want to hear the questions

https://habr.com/ru/articles/1015942/

#silero #синтез_речи #tts #texttospeech #нейросети #синтезатор_речи #русский_язык #ударение #омографы #вопросы

Stop Robotic AI🚀 Transform Any Text To Human Voice In Seconds!
Most AI voices sound like a GPS from 2010. 🤖

We’ve all been there. You’re watching a potentially great video, but the moment the voiceover starts, you cringe. It’s that robotic, stuttering, “GPS-style” voice that immediately screams low quality.

For years, creators were stuck in a catch-22: either pay $200+ per script for a professional voice actor on Fiverr or spend your entire weekend recording and re-recording your own voice, only to end up with background noise and “umms.”

But a quiet shift is happening in the industry. We just found the tool that’s changing everything for content creators. Imagine professional, studio-quality voiceovers in any language, generated in seconds. No more expensive freelancers, no more ‘umms’ and ‘ahhs’, and no more robotic monotone.

A new AI technology is finally crossing the “uncanny valley,” and the results are indistinguishable from a human in a professional studio.

https://www.nbloglinks.com/stop-robotic-ai-transform-any-text-to-human-voice-in-seconds/

#texttospeech #aitexttospeech #humanvoiceover #professionalvoiceover #software #AI #AIsoftware #AItools

Most AI voices sound like a GPS from 2010. 🤖 A new AI technology is finally crossing the "uncanny valley," and the results are indistinguishable from a human in a professional studio. www.nbloglinks.com/stop-robotic... #texttospeech #aitexttospeech #humanvoice #professionalvoiceover #software #AI

田中義弘 | taziku CEO / AI × Creative (@taziku_co)

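Kitten TTS V0.8, an open-source speech synthesis model that runs even without a GPU, has been introduced. It is a lightweight TTS with as few as 14M parameters and a size under 25MB, it can run on a CPU, and it is highly expressive. It is a noteworthy technology with strong potential for deployment on edge devices such as smartphones, toys, and in-vehicle systems.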

https://x.com/taziku_co/status/2035851486765408692

#texttospeech #opensource #edgeai #speechai #cpu

田中義弘 | taziku CEO / AI × Creative (@taziku_co) on X

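Voice AI without a GPU. Kitten TTS V0.8 is an open-source TTS with as few as 14M parameters, under 25MB. It runs on a CPU and is still highly expressive. TTS models that can realistically reach edge deployment are actually quite rare. From smartphones to toys to in-vehicle systems, the possibilities look set to expand widely. Link in the 🧵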

X (formerly Twitter)
GitHub - KittenML/KittenTTS: State-of-the-art TTS model under 25MB 😻

Remember when computer-generated voices and virtual pop idols were cool and cute and a completely new and exciting music genre and not the constant background noise of our horrifying computerized dystopia? Pepperidge Farm remembers. Man this is a bop, even a decade and a half later.

https://www.youtube.com/watch?v=duPJqfKiA78

#hatsunemiku #vocaloid #texttospeech #baka #triplebaka #music #synthesizer #electro #jpop

Fish Audio has open-sourced S2, a #texttospeech model that supports fine-grained inline control of prosody and emotion using natural-language tags like [laugh], [whispers], and [super happy]

https://github.com/fishaudio/fish-speech

#AI
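
To make the idea of inline tags concrete, here is a minimal sketch of how text with bracketed tags such as [whispers] or [super happy] could be split into tagged segments. This is not the fish-speech API: `split_inline_tags` and `TAG_PATTERN` are hypothetical names made up for illustration, and the assumption that a tag applies to the text that follows it is mine, not something the post states.

```python
import re

# Hypothetical helper for illustration only; NOT part of fish-speech.
# Matches one bracketed tag such as [laugh] or [super happy].
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def split_inline_tags(text):
    """Split text into (tag, segment) pairs.

    Assumption: a tag applies to the text that follows it, up to the next tag.
    Text before the first tag gets the tag None. A tag with no text after it
    is dropped in this simplified sketch.
    """
    segments = []
    current_tag = None
    position = 0
    for match in TAG_PATTERN.finditer(text):
        chunk = text[position:match.start()].strip()
        if chunk:
            segments.append((current_tag, chunk))
        current_tag = match.group(1).strip()
        position = match.end()
    tail = text[position:].strip()
    if tail:
        segments.append((current_tag, tail))
    return segments


example = "Hello there. [whispers] Don't tell anyone. [super happy] We shipped it!"
for tag, segment in split_inline_tags(example):
    print(tag, "->", segment)
```

The real model presumably consumes the tagged string directly; the sketch only shows what "inline control" means at the text level, where the markup travels inside the sentence itself rather than in separate parameters.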

#Business #Guides
Your browser can already speak a page · How to activate read-aloud features on web pages https://ilo.im/16b5hy

_____
#Reading #Audio #Accessibility #TextToSpeech #Text #Content #Webpages #Browsers

Your Browser Can Already Speak a Page

Users can customize the features built into the browser, something not often available from third-party approaches. Is an “AI” company offering to provide spoken versions of your pages for users? Is an overlay company promising to make your content more accessible by its overlay speaking it? Is some other vendor…

Adrian Roselli

Google AI Studio — The Only App Builder You’ll Ever Need

https://peertube.eqver.se/w/w7GqLAE9VKoEauQJZ6y2bA

red_027_en

PeerTube