Design Arena (@Designarena)

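The Audio Arena leaderboard has been updated, revealing the top three speech-to-speech models. Ultravox v0.7 ranks first, Gemini 2.5 Flash Audio second, and Grok Realtime third. The team says the models were evaluated on an open-source suite of six multi-turn benchmarks.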

https://x.com/Designarena/status/2041334891854565743

#audiomodels #benchmark #speechtospeech #opensource #leaderboard

Design Arena (@Designarena) on X

Audio Arena Leaderboard Update! Congrats to the top 3 speech-to-speech models:
- #1 Ultravox v0.7 by @ultravox_dot_ai
- #2 Gemini 2.5 Flash Audio by @GoogleDeepMind
- #3 Grok Realtime by @xai
We evaluated each model on our open source suite of 6 static multi-turn benchmarks

X (formerly Twitter)

ElevenLabs is expanding its platform with a Music Marketplace for AI-generated music tracks. Creators share directly in the sales through a licensing model. In parallel, the Music Finetunes feature allows detailed control over the AI models for stylistically consistent results.

#ElevenLabs #KI #MusicGeneration #AudioModels #News
https://www.all-ai.de/news/news26top/elevenlabs-music-marketplace

AI music can now earn you money

ElevenLabs launches the Music Marketplace for audio tracks. Users receive a direct share of the sales.

All-AI.de

What's new in Microsoft Foundry | Dec 2025 & Jan 2026 | Microsoft Foundry Blog

Microsoft Foundry Dec 2025-Jan 2026 update: GPT-5.2 & Codex Max now GA, new reasoning models, agent memory in preview, MCP server, and major SDK consolidation.

Microsoft Foundry Blog

Introducing GPT-4o Audio Models in Microsoft Foundry: A Practical Guide for Developers | Microsoft Foundry Blog

How to get started with Azure OpenAI's next-generation GPT-4o audio models for transcription and text-to-speech applications.

Microsoft Foundry Blog

A user is looking for Text-to-Speech (TTS) models and tools that run on 8GB of VRAM. Audio models mentioned include Kokoro, coqui-XTTS, Chatterbox, Dia, and VibeVoice. Which models and tools are you using? #TextToSpeech #TTS #AudioModels #GGUF #8GBVRAM #Windows11 #MôHìnhTextToSpeech #CôngCụTextToSpeech

https://www.reddit.com/r/LocalLLaMA/comments/1opxb1r/texttospeech_tts_models_tools_for_8gb_vram/