Design Arena (@Designarena)
The Audio Arena leaderboard has been updated, revealing the top 3 speech-to-speech models. Ultravox v0.7 ranks #1, Gemini 2.5 Flash Audio #2, and Grok Realtime #3. Design Arena says the models were evaluated on an open-source suite of 6 multi-turn benchmarks.
https://x.com/Designarena/status/2041334891854565743
#audiomodels #benchmark #speechtospeech #opensource #leaderboard

Design Arena (@Designarena) on X
Audio Arena Leaderboard Update!
Congrats to the top 3 speech-to-speech models:
- #1 Ultravox v0.7 by @ultravox_dot_ai
- #2 Gemini 2.5 Flash Audio by @GoogleDeepMind
- #3 Grok Realtime by @xai
We evaluated each model on our open source suite of 6 static multi-turn benchmarks
X (formerly Twitter)
ElevenLabs is expanding its platform with a Music Marketplace for AI-generated music tracks. Creators share directly in the sales through a licensing model. In parallel, the Music Finetunes feature enables fine-grained control over the AI models for stylistically consistent results.
#ElevenLabs #KI #MusicGeneration #AudioModels #News
https://www.all-ai.de/news/news26top/elevenlabs-music-marketplace

You can now earn money with AI music
ElevenLabs launches the Music Marketplace for audio tracks. Users receive a direct share of the sales.
All-AI.de
What's new in Microsoft Foundry | Dec 2025 & Jan 2026 | Microsoft Foundry Blog
Microsoft Foundry Dec 2025-Jan 2026 update: GPT-5.2 & Codex Max now GA, new reasoning models, agent memory in preview, MCP server, and major SDK consolidation.
Microsoft Foundry Blog
Introducing GPT-4o Audio Models in Microsoft Foundry: A Practical Guide for Developers | Microsoft Foundry Blog
How to get started with Azure OpenAI's next-generation GPT-4o audio models for transcription and text-to-speech applications.
Microsoft Foundry Blog
A user is looking for Text-to-Speech (TTS) models and tools that run on 8GB of VRAM. Some audio models mentioned: Kokoro, coqui-XTTS, Chatterbox, Dia, VibeVoice. Which models and tools are you using? #TextToSpeech #TTS #AudioModels #GGUF #8GBVRAM #Windows11 #MôHìnhTextToSpeech #CôngCụTextToSpeech
https://www.reddit.com/r/LocalLLaMA/comments/1opxb1r/texttospeech_tts_models_tools_for_8gb_vram/