linto-stt and linto-diarization
The two services are complementary: linto-stt transcribes speech to text, while linto-diarization identifies which speaker is talking at which time. Their outputs can likely be combined to attribute spoken text to individual speakers.
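The attribution step described above can be sketched as follows. This is a minimal illustration, not code from either project: it assumes the transcription yields words with start/end timestamps and the diarization yields speaker turns with start/end times. The names `Word`, `Turn`, `attribute_words`, and `group_by_speaker` are hypothetical, and the actual output formats of linto-stt and linto-diarization may differ.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    """A transcribed word with timestamps in seconds (assumed STT output shape)."""
    text: str
    start: float
    end: float


@dataclass
class Turn:
    """A speaker turn with timestamps in seconds (assumed diarization output shape)."""
    speaker: str
    start: float
    end: float


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length of the intersection of two time intervals, or 0.0 if disjoint."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def attribute_words(words: List[Word], turns: List[Turn]) -> List[Tuple[str, str]]:
    """Label each word with the speaker whose turn overlaps it the most."""
    labeled = []
    for w in words:
        best_speaker, best_overlap = "unknown", 0.0
        for t in turns:
            o = overlap(w.start, w.end, t.start, t.end)
            if o > best_overlap:
                best_speaker, best_overlap = t.speaker, o
        labeled.append((best_speaker, w.text))
    return labeled


def group_by_speaker(labeled: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Merge consecutive words from the same speaker into one line."""
    lines: List[Tuple[str, str]] = []
    for speaker, text in labeled:
        if lines and lines[-1][0] == speaker:
            lines[-1] = (speaker, lines[-1][1] + " " + text)
        else:
            lines.append((speaker, text))
    return lines
```

Choosing the turn with the largest overlap is one simple policy; words that fall in a gap between turns are labeled `unknown` here rather than being guessed.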
About linto-stt
linto-ai/linto-stt
An automatic speech recognition API
Supports multiple interchangeable STT engines (NeMo, Whisper, Kaldi, Kyutai) and three operational modes: HTTP for batch file processing, WebSocket for real-time streaming, and Celery task queues for asynchronous microservice architectures. The engine architecture is pluggable, with optional post-processing via recasepunc models to add punctuation and capitalization to outputs that lack them. Containerized through a single parameterized Dockerfile, with GPU acceleration support for compute-intensive backends.
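To illustrate what a pluggable engine architecture with optional post-processing can look like, here is a minimal sketch. It is not linto-stt's actual code: the registry, the `register_engine` decorator, the `dummy-lower` engine, and `naive_recase` are all hypothetical stand-ins, and the real project wires its engines and recasepunc models differently.

```python
from typing import Callable, Dict, Optional

# An engine maps raw audio bytes to text; a post-processor maps text to text.
Engine = Callable[[bytes], str]
PostProcessor = Callable[[str], str]

# Hypothetical registry of available engines, keyed by name.
_ENGINES: Dict[str, Engine] = {}


def register_engine(name: str) -> Callable[[Engine], Engine]:
    """Decorator that makes an engine selectable by name."""
    def decorator(fn: Engine) -> Engine:
        _ENGINES[name] = fn
        return fn
    return decorator


@register_engine("dummy-lower")
def dummy_engine(audio: bytes) -> str:
    """Stand-in engine: ignores the audio and returns unpunctuated lowercase text."""
    return "hello world how are you"


def naive_recase(text: str) -> str:
    """Stand-in for a punctuation/capitalization model (not recasepunc itself)."""
    return text[:1].upper() + text[1:] + "."


def transcribe(audio: bytes, engine_name: str,
               postprocess: Optional[PostProcessor] = None) -> str:
    """Run the selected engine, then the optional post-processing step."""
    try:
        engine = _ENGINES[engine_name]
    except KeyError:
        raise ValueError(f"unknown engine: {engine_name!r}") from None
    text = engine(audio)
    return postprocess(text) if postprocess is not None else text
```

Because callers only depend on the engine name and the `bytes -> str` contract, a backend can be swapped without touching the HTTP, WebSocket, or task-queue layer that calls `transcribe`.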
About linto-diarization
linto-ai/linto-diarization
Speaker diarization service