@cwebber

Yeah. You're referring to Mozilla Common Voice ( https://commonvoice.mozilla.org )

I believe the HomeAssistant voice assistant uses it in a couple of ways, some of them non-obvious, i.e. not just for TTS. From memory, part of their voice recognition training comes from taking CommonVoice samples, adding all kinds of noise and distortion that would likely also occur in the real world (background noise, being muffled by objects, etc.), and then training on those distorted samples.
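To make the idea concrete, here is a minimal sketch of that kind of augmentation. It is not Home Assistant's actual pipeline: the function names, the SNR range, and the simple one-pole filter standing in for "muffled by objects" are all assumptions for illustration, and audio is represented as plain lists of floats.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a signal (list of floats)."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def add_noise(clean, noise, snr_db):
    """Mix noise into clean so the result has the requested SNR in dB."""
    # Scale the noise so that rms(clean) / rms(scaled noise) == 10^(snr_db / 20).
    target_noise_rms = rms(clean) / (10 ** (snr_db / 20))
    gain = target_noise_rms / rms(noise)
    return [c + gain * n for c, n in zip(clean, noise)]


def muffle(signal, alpha=0.2):
    """One-pole low-pass filter; a smaller alpha removes more high frequencies."""
    out, prev = [], 0.0
    for x in signal:
        prev = prev + alpha * (x - prev)
        out.append(prev)
    return out


def augment(clean, rng, snr_range=(5.0, 20.0)):
    """Return a distorted copy of clean: random-SNR noise, sometimes muffled."""
    # Hypothetical choice: white Gaussian noise stands in for real background recordings.
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    noisy = add_noise(clean, noise, rng.uniform(*snr_range))
    return muffle(noisy) if rng.random() < 0.5 else noisy
```

Calling `augment(sample, random.Random(0))` on each clean sample before training would give the model distorted variants to learn from; a real setup would more likely mix in recorded background noise and room effects.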

I contribute occasionally. There's an app on F-Droid that makes it quick and easy.

It's a great project and one I wish Mozilla would focus more on instead of all the other junk.

#Mozilla #CommonVoice #HomeAssistant #TTS #STT

Went to see this sound (/experience?) installation, ZYX by Anna Barham: https://www.mattsgallery.org/exhibitions/zyxv. The actual experience is an almost hallucinatory 15-minute journey.
I also appreciate the ideas behind it: the misunderstandings of speech-to-text synthesis, and text as a medium that changes you, similar to Stiegler's (taken from Aristotle) pharmakon of the techne of writing.
#art #london #pharmakon #STT

AI Speech Technologies

This page is a collection of notes and links related to AI speech technologies, including Text-to-Speech (TTS), Speech-to-Text (STT), voice synthesis, voice cloning, and other rela(...)

#ai #cloning #speech #stt #synthesis #tts #voice #whisper

https://taoofmac.com/space/ai/speech?utm_content=atom&utm_source=mastodon&utm_medium=social

Focus 1
Setting up local speech-to-text in Linux Mint: start and stop dictation at the press of a button (any shortcut key), in any application. No cloud, no eavesdropping data kraken, just 4-6 GB in RAM.
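A rough sketch of how such a one-key toggle could work, assuming the setup uses nerd-dictation with a Vosk model (as the post's hashtags suggest). The marker file, the `./model` path, and the function names are hypothetical; the linked blog post may do this differently.

```python
import subprocess
import tempfile
from pathlib import Path

# Hypothetical marker file that remembers whether dictation is currently running.
STATE_FILE = Path(tempfile.gettempdir()) / "dictation.active"


def next_command(state_file: Path, model_dir: str) -> list[str]:
    """Return the nerd-dictation command for this key press and flip the state."""
    if state_file.exists():
        state_file.unlink()
        return ["nerd-dictation", "end"]
    state_file.touch()
    return ["nerd-dictation", "begin", f"--vosk-model-dir={model_dir}"]


def main(model_dir: str = "./model") -> None:
    """Start dictation on the first key press, stop it on the next one."""
    cmd = next_command(STATE_FILE, model_dir)
    if cmd[1] == "begin":
        # `begin` keeps running until `end` is issued, so do not wait for it.
        subprocess.Popen(cmd)
    else:
        subprocess.run(cmd, check=False)
```

Binding a keyboard shortcut to a small script that calls `main()` would then give the same key both the start and the stop role.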

Focus 2
How AI (GPT) helped me get all of this working, including complete project documentation.

https://blog.derbrumme.de/lokales-speech-to-text-in-linux-mint-einrichten/

#Linux #OpenSource #SpeechToText #STT #Vosk #Privacy #SelfHosting #KI #AI #NerdDictation

Is there a speech-to-text converter for #Android?
I would like to convert the voice messages (#Sprachnachrichten) I receive into text.

#stt

OK, did I just get blown away by the voice input of #Outspoke?

Offline model, open-source app, support for French and other European languages, keyboard integration, cleanup of filler sounds ("um, uh...")

Shall I go on?
It's a great find! https://apt.izzysoft.de/fdroid/index/apk/dev.brgr.outspoke
#stt #opensource #keyboard #speechtotext

„Outspoke“ – IzzyOnDroid F-Droid Repository

On-device speech-to-text keyboard powered by Parakeet - no cloud, no tracking.

IzzyOnDroid Repo Browser

A voice agent is not a chatbot with a phone: 40 hours saved and $100 burned on bots

I once burned about $100 on a voice agent in roughly a day. Not on a big launch. Not on a huge contact base. Not on a clever ad campaign. Just on a small pool of cold contacts, where the agent periodically ran into voicemail, IVRs, secretaries, and other bots. At some point, two not-very-smart voice processes could spend quite a while politely telling each other something along the lines of:

https://habr.com/ru/articles/1031148/

#голосовые_агенты #voice_agents #LLM #Twilio #ElevenLabs #Retell #OpenClaw #STT #TTS #latency

Ever wanted #openclaw 🦞 to make phone calls? 📲
Now you can: https://codingjoe.dev/VoIP/mcp/

#voip #python #sip #tts #stt #vibe #vibecoding #voice #vibevoice

MCP Server - Python VoIP

Update:

Eleven Labs (Scribe v2): 20,251
Aqua (Avalon 1.5): 18,899
Cohere: 19,885
Grok: 19,611
AssemblyAI (Universal 3 Pro): 19,530
Apple: 10,907

Grok also comes out on top for overall output quality while being the cheapest (well, except for Apple's local model).

#AI #STT #Voice