CMsmartvoice/One-Shot-Voice-Cloning

One Shot Voice Cloning based on Unet-TTS

38 / 100 (Emerging)

Combines a U-Net architecture with Adaptive Instance Normalization (AdaIN) layers to transfer speaker identity and speaking style from a single reference audio sample, estimating duration statistics automatically rather than from manual annotation. Built on TensorFlowTTS, it uses a three-stage pipeline (duration model, acoustic model, vocoder) trained exclusively on a neutral speech corpus to synthesize arbitrary text in cloned voices. It supports Python inference and Google Colab notebooks, and pre-trained models are available for immediate use.
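As a rough illustration of the AdaIN operation mentioned above, the sketch below re-normalizes each channel of a content feature sequence so that its mean and standard deviation match those of a reference (style) sequence. It is a generic, pure-Python rendering of the published AdaIN formula, not code from this repository; the function names `adain_channel` and `adain`, the list-of-channels layout, and the `eps` value are assumptions made for the example.

```python
from statistics import fmean, pstdev


def adain_channel(content, style, eps=1e-5):
    """Apply AdaIN to one channel (a list of floats over time)."""
    c_mean, c_std = fmean(content), pstdev(content)
    s_mean, s_std = fmean(style), pstdev(style)
    # Normalize the content statistics away, then impose the style statistics.
    return [(x - c_mean) / (c_std + eps) * s_std + s_mean for x in content]


def adain(content, style, eps=1e-5):
    """content, style: lists of channels; lengths over time may differ."""
    return [adain_channel(c, s, eps) for c, s in zip(content, style)]


# Toy features: 2 channels, 4 content frames, 3 reference frames.
content = [[0.0, 1.0, 2.0, 3.0], [10.0, 10.5, 11.0, 11.5]]
style = [[5.0, 7.0, 9.0], [-1.0, 0.0, 1.0]]
out = adain(content, style)
```

After the call, each channel of `out` keeps the temporal shape of the content but carries the reference channel's mean and (up to `eps`) its standard deviation. That is the sense in which a single reference sample can supply the speaker and style statistics.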

245 stars. No commits in the last 6 months.

No License · Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 8 / 25
Community 20 / 25


Stars: 245
Forks: 43
Language: Jupyter Notebook
License: none listed
Last pushed: Mar 22, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/CMsmartvoice/One-Shot-Voice-Cloning"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
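The same request can be made from Python with only the standard library. This is a minimal sketch: the helper names `quality_url` and `fetch_quality` are hypothetical, the URL layout (category, owner, repository) is inferred from the single curl example above, and the response is returned as raw text because the payload format is not documented here. Decoding it as UTF-8 is likewise an assumption.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example; the rest of the API is not assumed.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the raw response body as text (UTF-8 decoding is assumed)."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


url = quality_url("voice-ai", "CMsmartvoice", "One-Shot-Voice-Cloning")
print(url)
```

Calling `fetch_quality("voice-ai", "CMsmartvoice", "One-Shot-Voice-Cloning")` performs the network request. It is kept out of the module-level code so the snippet can be imported without counting against the daily request limit.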