Started a side project, Speak, an education-oriented TTS inference server. It currently uses the Spark-TTS model and supports voice cloning, streaming, and long-text input.
Project Conscious has added a flashcard feature based on the FSRS algorithm and integrated the Speak API to support multi-speaker speech. You can now listen to your flashcards in as many types of voices as you need.
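
For readers unfamiliar with FSRS, here is a minimal sketch of the scheduling idea behind it. It assumes the power-law forgetting curve published for FSRS-4.5; the FSRS version and parameters Conscious actually uses are not stated here, and the function names below are hypothetical, not taken from Conscious.

```python
# Sketch of the FSRS-4.5 forgetting curve (an assumption; Conscious may use
# a different FSRS version or different constants).
DECAY = -0.5
FACTOR = 19 / 81  # chosen so that retrievability is 0.9 when elapsed == stability


def retrievability(elapsed_days: float, stability: float) -> float:
    """Estimated probability of recalling a card after `elapsed_days`."""
    return (1 + FACTOR * elapsed_days / stability) ** DECAY


def next_interval(stability: float, desired_retention: float = 0.9) -> float:
    """Days to wait until estimated recall drops to `desired_retention`."""
    return stability / FACTOR * (desired_retention ** (1 / DECAY) - 1)


if __name__ == "__main__":
    # Hypothetical card with a stability of 10 days.
    print(retrievability(elapsed_days=10, stability=10))
    print(next_interval(stability=10, desired_retention=0.9))
```

With these constants, a card's stability is the interval at which its estimated recall is 90%, so asking for a higher desired retention shortens the interval and a lower one lengthens it.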