SEPIA-Framework/sepia-stt-server

SEPIA server to support open-source speech recognition via WebSocket connection.

/ 100

Emerging

# Technical Summary

Full-duplex Python FastAPI server supporting multiple pluggable open-source ASR engines (Vosk, Coqui, DeepSpeech, Scribosermo) through a standardized WebSocket API for streaming audio and receiving real-time partial and final transcriptions. Its modular architecture allows per-engine configuration, optional post-processing, speaker identification, grammar constraints, confidence scores, and word timestamps, all configurable on the fly via HTTP REST and WebSocket events. Includes Docker multi-architecture support (x86-64, ARM 32/64-bit) optimized for resource-constrained devices such as the Raspberry Pi 4, along with token-based user authentication and tight integration with SEPIA Framework clients.
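The streaming pattern summarized above (send audio in small chunks, receive partial and final transcriptions) can be illustrated with a minimal client-side sketch. This is not the server's actual protocol: the helper names `chunk_pcm` and `TranscriptCollector`, and the message fields `type`, `transcript`, and `isFinal`, are hypothetical placeholders chosen for illustration. The sketch assumes 16 kHz, 16-bit mono PCM and performs no network I/O.

```python
import json
from typing import Iterator, List


def chunk_pcm(pcm: bytes, sample_rate: int = 16000, sample_width: int = 2,
              chunk_ms: int = 100) -> Iterator[bytes]:
    """Split raw mono PCM into fixed-duration chunks suitable for streaming."""
    chunk_bytes = sample_rate * sample_width * chunk_ms // 1000
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


class TranscriptCollector:
    """Tracks the latest partial result and accumulates final results."""

    def __init__(self) -> None:
        self.partial: str = ""
        self.finals: List[str] = []

    def handle(self, raw_message: str) -> None:
        # Hypothetical event shape (an assumption, not the real API):
        # {"type": "result", "transcript": "<text>", "isFinal": <bool>}
        msg = json.loads(raw_message)
        if msg.get("type") != "result":
            return  # ignore any non-result events
        if msg.get("isFinal"):
            self.finals.append(msg["transcript"])
            self.partial = ""  # a final result supersedes the partial
        else:
            self.partial = msg["transcript"]


if __name__ == "__main__":
    one_second = bytes(16000 * 2)  # 1 s of 16 kHz, 16-bit silence
    chunks = list(chunk_pcm(one_second))
    print(f"{len(chunks)} chunks of {len(chunks[0])} bytes")

    collector = TranscriptCollector()
    collector.handle('{"type": "result", "transcript": "hello", "isFinal": false}')
    collector.handle('{"type": "result", "transcript": "hello world", "isFinal": true}')
    print(collector.finals)
```

In a real client, each chunk would be sent as a binary WebSocket frame while a receive loop passes incoming text frames to `handle`; the actual event names and fields should be taken from the project's own API documentation.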

136 stars. No commits in the last 6 months.

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 18 / 25


Stars

136

Forks

Language

Python

License

MIT

Higher-rated alternatives

shibing624/parrots

Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine. Chinese and English speech recognition, multi-speaker speech synthesis, multilingual support, high accuracy

MainRo/deepspeech-server

A testing server for a speech to text service based on coqui.ai

altunenes/parakeet-rs

very fast speech-to-text, diarization, streaming (even in CPU) with NVIDIA Parakeet in Rust

thewh1teagle/pyannote-rs

pyannote audio diarization in rust

PaddlePaddle/Parakeet

PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer TTS,...
