Mastodawn

OpenAI Developers (@OpenAIDevs)

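Voice agent capabilities have improved significantly. GPT-Realtime-2 supports voice agents that can reason and take action, and GPT-Realtime-Translate translates from 70 input languages into 13 output languages. GPT-Realtime-Whisper offers faster transcription, which makes this look like an important update for voice AI development.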

https://x.com/OpenAIDevs/status/2052440907933474954

#voiceagents #openai #gptrealtime #translation #speechtotext

OpenAI Developers (@OpenAIDevs) on X

Voice agents are getting more capable. Here’s what’s new:
• GPT-Realtime-2 for voice agents that reason and take action
• GPT-Realtime-Translate enabling translation from 70 input languages into 13 output languages
• GPT-Realtime-Whisper, making transcription even faster

X (formerly Twitter)

sayzard Apr 15

Mati Staniszewski (@matiii)

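Mati Staniszewski spoke with @collision about AI audio research and what's next for voice agents. The conversation points to current research trends in voice and audio AI and to how they might be turned into products.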

https://x.com/matiii/status/2044037387358265673

#ai #audio #voiceagents #research #collison

Mati Staniszewski (@matiii) on X

Great to speak to @collision about what’s next in AI audio research and voice agents!

X (formerly Twitter)

PPC Land Apr 13

W3C's smart voice agents report flags fragmentation, privacy gaps: W3C's February 2026 voice agents workshop flagged eight unresolved standards gaps covering interoperability, privacy, hallucination control, and accessibility. https://ppc.land/w3cs-smart-voice-agents-report-flags-fragmentation-privacy-gaps/ #W3C #VoiceAgents #SmartTechnology #Interoperability #Privacy

PPC Land

sayzard Mar 27

Logan Kilpatrick (@OfficialLoganK)

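Gemini 3.1 Flash Live has been released. It is a new realtime model for building voice and vision agents, and the team says that more than a year of work on the model, infrastructure, and experience has greatly improved quality, reliability, and latency.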

https://x.com/OfficialLoganK/status/2037187750005240307

#gemini #realtime #voiceagents #vision #ai

Logan Kilpatrick (@OfficialLoganK) on X

Introducing Gemini 3.1 Flash Live, our new realtime model to build voice and vision agents!! We have spent more than a year improving the model + infra + experience, the results? A step function improvement in quality, reliability, and latency.

X (formerly Twitter)

sayzard Mar 24

Rohan Paul (@rohanpaul_ai)

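Smallest AI has launched Lightning v3.1 to address the TTS problem for real-time voice agents. Where existing TTS has focused on how well a voice reads text, this model focuses on real-time conversational quality, responding naturally even in the middle of speaking.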

https://x.com/rohanpaul_ai/status/2036487328571728000

#tts #voiceagents #aivoice #modelrelease #realtime

Rohan Paul (@rohanpaul_ai) on X

The whole TTS industry has been optimizing for how well a voice reads text, while voice agents live or die on how well a voice talks in real time. Smallest AI just launched Lightning v3.1 to solve that problem, speaking naturally when the model is still figuring out what it

X (formerly Twitter)

sayzard Feb 18

Mati Staniszewski (@matiii)

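An announcement that voice agents built with ElevenLabs can now be covered by insurance in the same way as human agents. Described as a first of its kind, it adds real risk coverage and accountability, even for the toughest edge cases, and is likely to change how the legal and financial risks of voice AI deployments are managed.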

https://x.com/matiii/status/2024147154005012591

#elevenlabs #voiceai #aiinsurance #voiceagents

Mati Staniszewski (@matiii) on X

Voice agents built with ElevenLabs can now be covered with insurance - in the same way that human agents can! A first of its kind - adding real risk coverage and accountability, even for the toughest edge cases. https://t.co/5IKndEXHKI

X (formerly Twitter)

sayzard Feb 18

Artificial Analysis (@ArtificialAnlys)

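Artificial Analysis announced the AA-WER v2.0 speech-to-text accuracy benchmark and AA-AgentTalk, a new proprietary dataset focused on voice agents. AA-AgentTalk is held-out data that concentrates on the speech that matters to voice agents, and it is designed to make evaluations of voice-assistant-style models more reliable and more practical.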

https://x.com/ArtificialAnlys/status/2024157398139883729

#speechtotext #benchmark #dataset #aawer #voiceagents

Artificial Analysis (@ArtificialAnlys) on X

Announcing AA-WER v2.0 Speech to Text accuracy benchmark, and AA-AgentTalk, a new proprietary dataset focused on speech directed at voice agents. AA-AgentTalk focuses on the speech that matters most to voice agents. As a held-out, proprietary dataset, AA-AgentTalk also mitigates

X (formerly Twitter)

NERDS.xyz – Real Tech News for Real Nerds [Unofficial] Feb 11

Deepgram triples default concurrency limits as voice agents quietly move from pilot to production

https://fed.brid.gy/r/https://nerds.xyz/2026/02/deepgram-triples-default-concurrency-limits/

sayzard Feb 10

Mati Staniszewski (@matiii)

ElevenLabs has announced a strategic partnership with Boston Consulting Group (BCG). The two companies will work together to bring voice agents to enterprise customers worldwide, with enterprise expansion as the goal.

https://x.com/matiii/status/2020856684067820009

#elevenlabs #bcg #voiceagents #enterprise #partnership

Mati Staniszewski (@matiii) on X

ElevenLabs + BCG. Excited to announce our strategic partnership with Boston Consulting Group to bring voice agents to enterprises globally. To match the pace of our ambition, we know we can’t do it alone.

X (formerly Twitter)

sayzard Jan 31

AssemblyAI (@AssemblyAI)

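A post pointing to what 455 builders said about why users abandon voice agents. It introduces a video or summary that shares survey insights on why users drop off from voice interfaces, offering practitioner insight that is useful for improving voice agent UX.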

https://x.com/AssemblyAI/status/2017398006216274263

#voiceagents #userexperience #conversationalai

AssemblyAI (@AssemblyAI) on X

Watch to see what 455 builders told us about why users abandon voice agents 👇

X (formerly Twitter)