GitHub - wealotwang/voice-inpu...
RE: https://bsky.app/profile/did:plc:daexpe52ebb4bwh3ybzyvmkz/post/3mofaimbuqv2k
RE: https://hear-me.social/@debby/115588019309116671
Thank you for this great overview of #speechrecognition tools! I am looking for an open source tool that I can train on my own voice and that runs locally and accurately on a modestly powered laptop. If anyone out there has experience with any of these tools, let me know!
Hello, I'm looking for a way to do continuous voice command in #js, ideally with the lightest library possible. SpeechRecognition has existed for years but isn't enabled by default in Firefox... A solution that needs a server could work if the server side is open source and I can host it myself (though I'm afraid it would be less responsive...)
Does anyone know of a solution?

Voice dictation became the rare AI tool that changed how people write. As the speech models commoditize, the fight has moved to the layer above them, where a free app built by a developer who cannot type competes with the $81 million leader and the operating systems closing in.
@cwebber #Mozilla working more on speech models could also help them finally release the web #SpeechRecognition API in #Firefox. I know it's a hard problem but now's a good time to get funding for machine learning, and Mozilla is promoting "AI" which this accessibility infrastructure could be called.
Right now, as far as I know and have tested, only proprietary browsers support that API out of the box, and Firefox has an in-progress but non-functional implementation. Thank you so much to everyone who has done some of the partial work on the API in a libre browser; your work is *so* appreciated!
Surprisingly, several free, libre, and open-source tools associated with FLOSS movements (#BigBlueButton's main live captioning plugin; MidCamp) currently rely on that API and state that they only support Chrome (not Chromium). In the default setup, that sends live audio to Google's servers, which I'm pretty sure then run proprietary models.
GrapheneOS Speech Services version 2 released
https://discuss.grapheneos.org/d/36001-grapheneos-speech-services-version-2-released
#HackerNews #GrapheneOS #Speech #Services #SpeechRecognition #TechNews #OpenSource #Privacy
#UnplugBigTech Tip 5: Open-source voice assistant
Say goodbye to Alexa and other voice assistants that listen in on your conversations and analyze them. Instead, use a privacy-friendly alternative such as OpenVoiceOS, an open-source voice assistant that is developed by an active community and runs on a Raspberry Pi. That way you keep control of your data.
#Alexa #OpenVoiceOS #Sprachassistent #VoiceControl #SpeechRecognition #datenschutz #privacy
Govorun PC: porting offline dictation from Android to Windows in one evening (with Claude)
On Android I use Govorun Lite, an offline dictation app for Russian. Press a button, speak, and the text is inserted. No clouds, no sending your voice to servers. It runs on GigaAM v2 from Sber. There is one problem: nothing like it exists on PC. The built-in Windows dictation is online. Whisper is either slow or requires a GPU. Third-party services mean the cloud again. I decided to port Govorun to Windows, and to speed things up I brought in Claude as a pair programmer. What came of it is in this article.
https://habr.com/ru/articles/1031240/
#python #speechrecognition #onnx #windows #llm #голосовой_ввод