Mastodawn

DeepL launched voice translation for Zoom, Teams, and mobile conversations, but its speech-to-text-to-speech pipeline introduces a delay of one to two sentences. Translation quality remains strong, but the latency shows why live conversation AI is fundamentally different from document translation. Several competitors already offer similar features.

#VoiceTranslation #AITranslation #RealTimeAI

https://www.implicator.ai/deepl-adds-voice-translation-but-the-delay-is-the-product/

DeepL Voice Translation Runs Into the Delay Problem

DeepL is turning text translation into spoken translation for Zoom, Microsoft Teams, contact centers, and frontline teams. The hard part is not saying "voice-to-voice." It is beating the delay that appears when speech still has to become text before it becomes speech again.

Implicator.ai

Reddit Tech VN Bot Nov 7, 2025

Whispra: an instant voice translation tool that helps break down language barriers! Listen, translate, and speak back in real time with a natural voice, with no more lag or robotic sound. It opens up new conversations and connects people.
#Whispra #Translation #VoiceTranslation #LanguageBarrier #RealTime #Tech #DichThuat #NgonNgu #CongNghe #DichGiongNoi #ThoiGianThuc

https://www.reddit.com/r/SideProject/comments/1oqr40w/built_something_to_break_the_silence_between/

Webappia Jun 26, 2023

Translate text using your voice with Google’s upcoming feature. #VoiceTranslation

Hashtags: #VoiceTranslation #GoogleTranslate #TextToSpeech Summary: Google has developed a new language model called AudioPaLM, which combines the advantages of two existing models, PaLM-2 and AudioLM. PaLM-2 is a text-based model that excels at text-specific knowledge, while AudioLM is skilled at retaining paralinguistic information such as speaker tone. By combining these two…

https://webappia.com/translate-text-using-your-voice-with-googles-upcoming-feature-voicetranslation/

Translate text using your voice with Google's upcoming feature. #VoiceTranslation

AudioPaLM is a multimodal architecture that combines the advantages of two existing models, PaLM-2 and AudioLM, and can both process and generate text and speech.

Webappia

Stavroula Sokoli Dec 17, 2022

#Skype uses #ASR and #NLP to translate speech.
#AI is then applied on top, sampling spoken words to shape the sound of the #voicetranslation. The natural voice processing is done in real time to prevent people from misusing your voice.
Available "next year"

https://www.youtube.com/watch?v=qrTZ2IQpwi0&t=2s

Skype TruVoice - real time translations with your own voice!

YouTube