Chrome extension adjusts video speed based on how fast the speaker is talking
https://github.com/ywong137/speech-speed
#HackerNews #ChromeExtension #VideoSpeed #SpeechRecognition #TechInnovation #OpenSource
Hands on with AI audio generation: GAI voice, music, and sound effects
This is the second post in a series exploring the multimodal possibilities of generative AI. This series will take a detailed, hype-free look at text, image, audio, video, and code generation and explore the creative potential as well as the ethical concerns of GAI. Although Generative AI isn't a new technology, it's definitely been having a hype moment since the release of ChatGPT in November 2022. Unfortunately, the focus has been squarely on the text-based chatbot to the exclusion of […]
https://winbuzzer.com/2026/03/16/ibm-granite-4-1b-speech-tops-openasr-leaderboard-xcxwbn/
IBM Granite 4.0 1B Speech Tops OpenASR Leaderboard
#AI #AIModels #IBM #SpeechRecognition #OpenSourceAI #EnterpriseAI #EdgeComputing #AITranslation #OpenASRLeaderboard
Nico Martin (@nic_o_martin)
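An announcement that combining MistralAI's Voxtral with Transformers.js and WebGPU now makes real-time speech transcription possible in the browser. It supports many languages and highlights the ability to recognize a language switch in the middle of a sentence, making it a notable example of low-latency, multilingual web-based ASR (automatic speech recognition).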
https://x.com/nic_o_martin/status/2032087412462022663
#mistralai #voxtral #transformersjs #webgpu #speechrecognition
ElevenLabs: Audio to Text. New Version

Y Combinator (@ycombinator)
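Dylan Fox recalls that AssemblyAI, which he founded in 2017, started before the AI boom and that it took five years for the market to catch up. AssemblyAI now powers voice features for thousands of companies and processes hundreds of millions of hours of audio every year. The post also notes that Dylan joined @snowmaker.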

In 2017, Dylan Fox started @AssemblyAI— years before the AI boom. It took 5 years for the market to catch up. Now, AssemblyAI powers voice features for thousands of companies and processes hundreds of millions of hours of audio every year. Dylan joined @snowmaker for a Founder
みゅみゅ (@miyumiyuna5)
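Announces the release of TeloPon v1.01b to address a problem in the speech recognition component. The update is expected to improve the issue where calls were ignored (no response), and the post includes a link to the related GitHub release.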
Speech Recognition Not Working in Windows 11? Fix the “Listening…” Error
🎙️ Learn the 5 Reddit-approved fixes to restore voice typing and Voice Access in Windows 11 fast. From microphone permissions to speech services, we’ve covered it all.