met4citizen/HeadTTS

HeadTTS: Free neural text-to-speech (Kokoro) with timestamps and visemes for lip-sync. Runs in-browser (WebGPU/WASM) or on a local Node.js WebSocket/REST server (CPU).

Score: 60 / 100 (Established)

Leverages transformers.js with ONNX Runtime for client-side model execution, supporting both WebGPU acceleration and WASM fallback with configurable quantization levels (fp32/fp16/q8/q4). Provides phoneme-level timing data and Oculus-compatible visemes for precise lip-sync animation, with adjustable timing offsets for integration with 3D avatar frameworks like TalkingHead. Supports flexible endpoint configuration with automatic fallback between in-browser and Node.js server backends, enabling graceful degradation across browsers and deployment scenarios.
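The automatic fallback between backends described above can be sketched in plain JavaScript. This is an illustrative sketch only, not HeadTTS's actual API: the names `tryEndpoints`, `synthesize`, and the backend objects are hypothetical, standing in for whatever endpoint interface the library exposes.

```javascript
// Hypothetical sketch of ordered-endpoint fallback: try each backend in
// turn and return the first successful result. Names are illustrative,
// not the real HeadTTS API.
async function tryEndpoints(endpoints, input) {
  let lastError;
  for (const endpoint of endpoints) {
    try {
      return await endpoint.synthesize(input);
    } catch (err) {
      lastError = err; // fall through to the next backend
    }
  }
  throw lastError ?? new Error("No endpoints configured");
}

// Example: a WebGPU backend that fails, then a WASM fallback that succeeds.
const webgpu = {
  synthesize: async () => { throw new Error("WebGPU unavailable"); },
};
const wasm = {
  synthesize: async (text) => ({ backend: "wasm", text }),
};

tryEndpoints([webgpu, wasm], "Hello").then((r) => console.log(r.backend));
```

The same ordered-list pattern generalizes to the in-browser vs. server choice: list the preferred backend first and the degradation order falls out of the loop.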

112 stars and 375 monthly downloads. Available on npm.

Maintenance: 6 / 25
Adoption: 15 / 25
Maturity: 24 / 25
Community: 15 / 25


Stars: 112
Forks: 16
Language: JavaScript
License: MIT
Last pushed: Dec 08, 2025
Monthly downloads: 375
Commits (30d): 0
Dependencies: 2

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/met4citizen/HeadTTS"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.