The ultimate goal of the Citrus Project is to develop a model that approaches Kokoro's performance while remaining reproducible, fine-tunable, and ready for anime-style speakers.
But first let's take things step by step...
TODO
TODO
As is well known, the fine-tuning scripts for Kokoro have not been released, and the author has also closed the Community section on Hugging Face.
As for other models, if your hardware is decent, you can use Chatterbox Multilingual; its zero-shot performance is good enough.
That said, once I find a model suitable for fine-tuning, I may also release a Project Citrus fine-tuned version.
I plan to start with StyleTTS2-lite and kokoro_training, modify them to more closely resemble the original Kokoro structure, and then begin training.
Licensed under the Apache License, Version 2.0: https://www.apache.org/licenses/LICENSE-2.0