Four open source models exist right now that do something the previous generation struggled with. They do not just generate speech. They clone a voice from a short audio sample and produce output that is genuinely difficult to compare from the original speaker.
The gap between open source and commercial TTS has been closing for a while. These four models suggest it has effectively closed for voice cloning specifically.
https://firethering.com/open-source-tts-voice-cloning/

4 Open-Source TTS Models That Can Clone Voices and Actually Sound Human
Voice cloning used to mean expensive studio software, proprietary APIs with per-character pricing, or models so heavy they needed server infrastructure just to run. That changed quietly over the last few months. Four open source models exist right now that do something the previous generation struggled with. They do not just generate speech. They clone a voice from a short audio sample and produce output that is genuinely difficult to compare from the original speaker. The gap between open source and commercial TTS has been closing for a while. These four models suggest it has effectively closed for voice cloning specifically. Here is what each one actually does and who it is for.




