E2/F5 TTS

This is an online demo for F5-TTS with advanced batch processing support. This app supports the following TTS models:

  • F5-TTS (A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching)
  • E2 TTS (Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS)

The checkpoints currently support English and Chinese.

If you're having issues, try converting your reference audio to WAV or MP3 and clipping it to 12s with the ✂ button in the bottom right corner; otherwise the automatic trimming may give a non-optimal result.

NOTE: Reference text will be automatically transcribed with Whisper if not provided. For best results, keep your reference clips short (<12s). Ensure the audio is fully uploaded before generating.
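If you prefer to clip the reference audio before uploading, the following is a minimal sketch using only Python's standard wave module. It assumes an uncompressed WAV input; the function name clip_wav and the file paths are hypothetical and not part of this app.

```python
import wave

MAX_SECONDS = 12.0  # clip length recommended in the note above


def clip_wav(src_path, dst_path, max_seconds=MAX_SECONDS):
    """Copy at most max_seconds of audio from src_path to dst_path.

    Returns the duration of the written clip in seconds.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        # Keep the whole file if it is already shorter than the limit.
        n_keep = min(src.getnframes(), int(max_seconds * rate))
        frames = src.readframes(n_keep)

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)    # same channels, sample width, and rate
        dst.writeframes(frames)  # the frame count in the header is fixed on close
    return n_keep / rate
```

For example, clip_wav("reference.wav", "reference_12s.wav") would write the first 12 seconds of a longer recording. This keeps the start of the file only; choosing a cleaner segment by hand with ✂ may still give better results.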

Choose TTS Model

Batched TTS

Check to use a random seed for each generation. Uncheck to use the seed specified.
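The checkbox behaviour described above can be sketched as follows. This is an illustration only, not the app's code: the name resolve_seed and the upper bound MAX_SEED are assumptions.

```python
import random

MAX_SEED = 2**31 - 1  # assumed upper bound; the app's actual range is not stated


def resolve_seed(randomize, seed):
    """Return a fresh random seed if randomize is set, else the given seed."""
    if randomize:
        return random.randint(0, MAX_SEED)
    return seed
```

Reusing a fixed seed with the same inputs is what makes a generation repeatable; a random seed gives a different take each time.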

If the output contains undesired long silences, turn this on to detect and crop them automatically.
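How the app detects silence is not described here. As a rough sketch of the general idea, assuming a simple amplitude threshold on samples normalized to the range -1.0 to 1.0, long quiet runs could be shortened like this. The function name and both default values are hypothetical.

```python
def crop_long_silences(samples, rate, threshold=0.01, max_silence_s=0.5):
    """Shorten every quiet run longer than max_silence_s to that length.

    samples: sequence of floats in [-1.0, 1.0]; rate: samples per second.
    """
    max_run = int(max_silence_s * rate)
    out = []
    run = 0  # length of the current run of quiet samples
    for s in samples:
        if abs(s) < threshold:
            run += 1
            if run > max_run:
                continue  # drop quiet samples beyond the allowed length
        else:
            run = 0
        out.append(s)
    return out
```

Short pauses are left untouched, so natural phrasing is preserved; only the excess part of a long gap is removed.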
