Mastodawn

michabbb Nov 6, 2024

🎯 #OuteTTS introduces a novel approach to text-to-speech synthesis using pure #languagemodeling
🔧 Built on #LLaMa architecture with just 350M parameters, featuring:

Zero-shot #voicecloning capability
Integration with #WavTokenizer (75 tokens/sec)
Local deployment via #llamacpp
#GGUF format compatibility

🔍 Technical Implementation:

Audio tokenization process
CTC forced alignment
Structured prompt system
Temperature-adjustable outputs

⚠️ Current Limitations:

Limited vocabulary range
String-only input support
Best performance with shorter sentences
Variable temperature sensitivity

https://github.com/edwko/OuteTTS
https://huggingface.co/OuteAI/OuteTTS-0.1-350M

GitHub - edwko/OuteTTS: Interface for OuteTTS models.

Interface for OuteTTS models. Contribute to edwko/OuteTTS development by creating an account on GitHub.

GitHub