sdsds222/Unitale

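An AI audiobook production tool built on IndexTTS and Qwen3TTS. It uses an LLM to automatically break down scripts and identify emotions, and integrates multi-character TTS speech synthesis (it can analyze voice characteristics and use the Qwen3TTS voice design model to generate a voice from a text description of that voice). It supports automatic insertion and matching of sound effects (SFX), background music (BGM) mixing, and real-time audio filters on dialogue, and it can export the finished WAV file directly in the browser. The tool itself runs cross-platform in the browser with no environment setup required.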

/ 100

Emerging

Implements an LLM-driven pipeline that automatically decomposes scripts into dialogue segments, infers emotional context, and generates character voice descriptions for Qwen3TTS voice design synthesis. The architecture chains several models and components: an LLM for narrative analysis and emotional inference, IndexTTS/Qwen3TTS for multi-character speech synthesis, and custom heuristics for SFX/BGM insertion timing and scene-aware audio filter application (phone calls, internal monologue, etc.). The tool itself runs client-side in the browser with no environment setup required, and it supports project serialization via JSON to preserve audio libraries, filter chains, and script state.
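To make the segment and project-serialization ideas above concrete, here is a minimal sketch in Python. It is not taken from the Unitale source: the `Segment` and `Project` classes, their field names, the `FILTER_KEYWORDS` table, and the `pick_filter` heuristic are all hypothetical, and the real tool (which runs in the browser) may organize its data quite differently.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Dict, List, Optional


@dataclass
class Segment:
    """One line of the decomposed script (hypothetical shape)."""
    speaker: str
    text: str
    emotion: str = "neutral"
    audio_filter: Optional[str] = None  # e.g. "phone" or "monologue"
    sfx: Optional[str] = None           # name of a sound effect, if any


@dataclass
class Project:
    """Script state plus voice descriptions, serializable to JSON."""
    title: str
    voices: Dict[str, str] = field(default_factory=dict)  # speaker -> voice description
    segments: List[Segment] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into the nested Segment dataclasses.
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)

    @classmethod
    def from_json(cls, raw: str) -> "Project":
        data = json.loads(raw)
        segments = [Segment(**s) for s in data.get("segments", [])]
        return cls(title=data["title"],
                   voices=data.get("voices", {}),
                   segments=segments)


# Assumed keyword table for a toy scene-aware filter heuristic.
FILTER_KEYWORDS = {
    "phone": "phone",
    "call": "phone",
    "thinks": "monologue",
    "inner": "monologue",
}


def pick_filter(stage_direction: str) -> Optional[str]:
    """Return a filter name if the stage direction mentions a known scene cue."""
    lowered = stage_direction.lower()
    for keyword, filter_name in FILTER_KEYWORDS.items():
        if keyword in lowered:
            return filter_name
    return None


project = Project(
    title="Demo",
    voices={"Anna": "warm, low-pitched female voice"},
    segments=[
        Segment("Anna", "Hello? Can you hear me?", emotion="anxious",
                audio_filter=pick_filter("(on the phone)")),
    ],
)
restored = Project.from_json(project.to_json())
```

Round-tripping through `to_json` and `from_json` should give back an equal `Project`, which is the property a save/load feature relies on. A real implementation would presumably also store audio library references and filter chain parameters, which this sketch omits.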

No Package No Dependents

Maintenance 10 / 25

Adoption 9 / 25

Maturity 9 / 25

Community 20 / 25


Language: HTML

License: MIT

Related tools

EmZod/Speak-Turbo

Ultra-fast local TTS for AI agents. ~90ms to first sound.

HCID274/JianYan

A local speech-to-text tool for Windows based on SenseVoice, with support for polishing transcripts through OpenAI-format APIs. Low latency, high accuracy.
