F5-TTS
Open-source zero-shot voice cloning using flow matching
About
F5-TTS is an open-source text-to-speech model that performs zero-shot voice cloning from a short reference audio sample. It is trained with flow matching and uses a diffusion transformer (DiT) backbone to generate fluent and faithful speech.
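
To give a rough idea of what flow matching means, here is a toy sketch in plain Python. It is not F5-TTS code and does not use its API. The function names (interpolate, target_velocity, euler_sample), the straight-line noise-to-data path, and the three-number "feature vector" are all assumptions chosen for illustration. In a real system a trained network would predict the velocity; here an oracle that already knows the answer stands in for it.

```python
import random


def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Time derivative of interpolate(); the quantity a model would be trained to predict."""
    return [b - a for a, b in zip(x0, x1)]


def euler_sample(velocity_fn, x0, steps=8):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


rng = random.Random(0)
x1 = [0.5, -1.0, 2.0]                    # hypothetical stand-in for one frame of speech features
x0 = [rng.gauss(0.0, 1.0) for _ in x1]   # Gaussian noise starting point


def oracle(x, t):
    # Assumption: replaces a trained network with the exact velocity for this (x0, x1) pair.
    return target_velocity(x0, x1)


sample = euler_sample(oracle, x0)
```

Because the oracle velocity is constant along the path, the Euler integration lands on x1 up to floating-point error. With a learned model the velocity depends on the current point, the time step, and the conditioning inputs, so the number of integration steps becomes a trade-off between speed and quality.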