GrapheneOS Speech Services version 2 released
https://discuss.grapheneos.org/d/36001-grapheneos-speech-services-version-2-released
#HackerNews #GrapheneOS #Speech #Services #SpeechRecognition #TechNews #OpenSource #Privacy
@cwebber If #Mozilla worked more on speech models, that could also help them finally ship the web #SpeechRecognition API in #Firefox. I know it's a hard problem, but now is a good time to get funding for machine learning, and Mozilla is promoting "AI", a label that could apply to this accessibility infrastructure.
Right now, as far as I know and have tested, only proprietary browsers support that API out of the box, and Firefox has an in-progress but non-functional implementation. Thank you so much to everyone who has done some of the partial work on the API in a libre browser; your work is *so* appreciated!
Surprisingly, several free, libre, and open-source tools associated with FLOSS movements (#BigBlueButton's main live captioning plugin; MidCamp) currently rely on that API and state they only support Chrome (not Chromium). In the default setup, that sends live audio to Google's servers, which I'm pretty sure then run proprietary models.
#UnplugBigTech Tip 5: Open-source voice assistant
Say goodbye to Alexa and other voice assistants that listen in on your conversations and analyze them. Use a privacy-friendly alternative instead, such as OpenVoiceOS, an open-source voice assistant that is developed by an active community and runs on a Raspberry Pi. That way you keep control of your data.
#Alexa #OpenVoiceOS #Sprachassistent #VoiceControl #SpeechRecognition #datenschutz #privacy
Govorun PC: porting offline dictation from Android to Windows in one evening (with Claude)
On Android I use Govorun Lite, an offline dictation app for Russian. Press a button, speak, and the text is inserted. No cloud, no voice sent to servers. It runs on GigaAM v2 from Sber. There is one problem: nothing like it exists on the PC. The built-in Windows dictation is online. Whisper is either slow or requires a GPU. Third-party services mean the cloud again. I decided to port Govorun to Windows and brought in Claude as a pair programmer to speed things up. This article covers what came of it.
https://habr.com/ru/articles/1031240/
#python #speechrecognition #onnx #windows #llm #голосовой_ввод
Amical - Open-source AI dictation app
Cossmology Profile: https://dub.sh/Vk7tPkn
Key People: Haritabh Singh, Naomi Chopra
Deepgram released Flux Multilingual, a speech recognition model that handles 10 languages with real-time switching during conversations. The system detects language changes mid-call and processes conversational turns in under 400ms. Available as cloud API or self-hosted at the same price as English-only versions. Could simplify multilingual voice applications that previously required separate detection and routing systems.

Deepgram launched Flux Multilingual, a conversational speech recognition model supporting 10 languages with real-time detection and mid-call code-switching. Uses conversational turn detection at under 400ms. Available as cloud API or self-hosted with EU endpoint support.
Non-lexical conversational sounds (NLCS) degrade ASR accuracy in clinical documentation.
🔊 NLCS: 2.4% of total words, yet they convey key clinical info
😷 Word error rate (WER) on all NLCS: Google 40.8%, Amazon 57.2%
❌ Error rates on clinically relevant NLCS: Google 94.7%, Amazon 98.7%
📝 Total words: 135,647; 3,284 NLCS; 76 conveyed critical data
🗣️ Discusses implications for documentation accuracy
#ASR #ClinicalDocumentation #SpeechRecognition #AI #NLPSolutions #Pub2Post https://tnyp.me/Npmiz0F4/m
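For readers unfamiliar with the metric, here is a minimal sketch of how a word error rate is commonly computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name and the example sentences are hypothetical, and the study's own scoring pipeline may differ. The last lines only re-check the post's arithmetic (3,284 of 135,647 words).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: the recognizer drops a non-lexical "mm-hm".
example_wer = word_error_rate("the patient said mm-hm", "the patient said")

# Sanity check of the figures quoted in the post.
total_words = 135_647
nlcs_words = 3_284
nlcs_share_percent = round(100 * nlcs_words / total_words, 1)
```

In this toy case one of four reference words is lost, so the overall WER stays low even though the only non-lexical sound was missed entirely, which is the kind of gap the post's per-category error rates point at.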
Learn the basics of neural networks and backpropagation: https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi
#video #tutorial #deepLearning #LLMs #recognition #speechRecognition #visualRecognition #neuralNetworks #machineLearning