SEPIA-Framework/sepia-stt-server
SEPIA server to support open-source speech recognition via WebSocket connection.
# Technical Summary
Full-duplex Python FastAPI server supporting multiple pluggable open-source ASR engines (Vosk, Coqui, DeepSpeech, Scribosermo) with a standardized WebSocket API for streaming audio and receiving real-time partial and final transcriptions. Its modular architecture enables per-engine configuration, optional post-processing, speaker identification, grammar constraints, confidence scores, and word timestamps, all configurable on the fly via HTTP REST and WebSocket events. It includes Docker multi-architecture support (x86-64, ARM 32/64-bit) optimized for resource-constrained devices such as the Raspberry Pi 4, along with token-based user authentication and tight integration with SEPIA Framework clients.
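Streaming audio over a WebSocket generally means the client cuts its audio into small chunks and sends them one after another while transcriptions come back. The sketch below shows only that client-side chunking step. It assumes 16 kHz, 16-bit mono PCM, which the summary does not state, and `pcm_chunks` is a hypothetical helper name. It does not open a connection or reproduce the server's actual message schema.

```python
from typing import Iterator

# Assumed audio format (not stated in the summary): 16 kHz, 16-bit mono PCM.
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2


def pcm_chunks(pcm: bytes, chunk_ms: int = 100) -> Iterator[bytes]:
    """Yield consecutive slices of raw PCM, each covering about chunk_ms of audio.

    Hypothetical helper: a streaming client would send each slice as one
    binary WebSocket frame. The final slice may be shorter than the rest.
    """
    chunk_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * chunk_ms // 1000
    if chunk_bytes <= 0:
        raise ValueError("chunk_ms must be large enough to hold at least one byte")
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


# One second of silence split into 100 ms chunks.
one_second = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
chunks = list(pcm_chunks(one_second, chunk_ms=100))
```

Smaller chunks reduce the delay before partial results can arrive but increase the number of frames sent; 100 ms is an illustrative value, not one taken from the project.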
136 stars. No commits in the last 6 months.
Stars: 136
Forks: 23
Language: Python
License: MIT
Category:
Last pushed: Nov 07, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/SEPIA-Framework/sepia-stt-server"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
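The same request can be made from Python with only the standard library. This is a minimal sketch: the URL layout is copied from the curl command, while the helper names `quality_url` and `fetch_quality` are hypothetical and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    parts = (quote(p, safe="") for p in (category, owner, repo))
    return f"{BASE_URL}/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


url = quality_url("voice-ai", "SEPIA-Framework", "sepia-stt-server")
```

Calling `fetch_quality("voice-ai", "SEPIA-Framework", "sepia-stt-server")` performs the network request; it counts against the daily request limit described above.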
Higher-rated alternatives
shibing624/parrots
Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine. Chinese and English speech recognition, multi-speaker speech synthesis, multilingual support, high accuracy
MainRo/deepspeech-server
A testing server for a speech to text service based on coqui.ai
altunenes/parakeet-rs
Very fast speech-to-text, diarization, and streaming (even on CPU) with NVIDIA Parakeet in Rust
thewh1teagle/pyannote-rs
pyannote audio diarization in rust
PaddlePaddle/Parakeet
PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer TTS,...