Cantonese Text to Speech with CosyVoice
Turn text into natural Cantonese speech, plus other Chinese dialects including Sichuanese, Shanghainese, Dongbei and Tianjin (18 dialects in total). CosyVoice captures authentic dialect pronunciation and tone.
Authentic dialect synthesis
Native Cantonese
Generate fluent Cantonese with correct tones and natural rhythm.
18 Chinese dialects
Beyond Cantonese: Sichuanese, Shanghainese, Dongbei, Tianjin, Chongqing, Xi’an and more.
Dialect voice cloning
Clone a speaker's voice and have it speak a chosen dialect while keeping the same vocal identity.
Pronunciation hotfix
Correct ambiguous characters and homophones for accurate dialect output.
Dialect TTS use cases
Localized media
Dub videos and ads for regional Chinese audiences.
Culture & education
Preserve and teach dialects with spoken examples.
Regional assistants
Build voice assistants that speak a user’s local dialect.
Entertainment
Create dialect characters and comedic content.
Cantonese & dialect TTS FAQ
Does CosyVoice support Cantonese text to speech?
Yes. CosyVoice generates natural Cantonese speech with correct tones, and you can try it free in the playground above.
Which Chinese dialects are supported?
CosyVoice supports 18 Chinese dialects, including Cantonese, Sichuanese, Shanghainese, Dongbei, Tianjin, Chongqing and Xi’an.
Can I clone a voice that speaks a dialect?
Yes. CosyVoice can combine zero-shot voice cloning with dialect synthesis, so a cloned voice can speak a chosen dialect.
Is the dialect TTS free?
CosyVoice is open source under Apache-2.0. You can self-host it for free or try dialect synthesis online here.