Mastodawn

Fish Audio launches S2-Pro, a new TTS model with "absurdly controllable" emotion. Its Dual-AR architecture pairs a 4B-parameter language model with a 400M acoustic model to produce high-fidelity 44.1 kHz audio. It supports zero-shot voice cloning from 10 to 30 second clips and inline emotional tags, and achieves sub-150 ms latency on an NVIDIA H200. https://www.marktechpost.com/2026/03/10/fish-audio-releases-fish-audio-s2-a-new-generation-of-expressive-text-to-speech-tts-with-absurdly-controllable-emotion/ #AIagent #AI #GenAI #VoiceAI #FishAudio