whisperX-FastAPI and whisper.api
These projects are competing API wrappers around Whisper-based speech-to-text models. The first is built on WhisperX (optimized for alignment and speaker diarization); the second wraps a fine-tuned Whisper variant. Both serve the same use case: exposing ASR functionality via HTTP endpoints.
About whisperX-FastAPI
pavelzbornik/whisperX-FastAPI
FastAPI service on top of WhisperX
Provides modular speech processing services including transcription, speaker diarization, and transcript alignment via individual endpoints, with async SQLAlchemy task persistence supporting SQLite or PostgreSQL backends. Configurable Whisper model selection and compute precision (float16/int8) enables deployment across CUDA and CPU environments. Includes Kubernetes-ready health probes and Swagger UI documentation for integration into broader audio/video processing pipelines.
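The device-dependent precision choice described above (float16 on CUDA, int8 on CPU) can be sketched as a small configuration helper. This is an illustrative sketch, not the project's actual code: the environment variable names `WHISPER_MODEL` and `COMPUTE_TYPE` and the function `resolve_whisper_config` are hypothetical.

```python
import os

def resolve_whisper_config(cuda_available: bool) -> dict:
    """Pick a Whisper model and compute precision for the current host.

    Hypothetical helper illustrating the kind of logic such a service
    needs: float16 inference requires a GPU, so CPU-only hosts fall
    back to int8 quantized inference.
    """
    # WHISPER_MODEL / COMPUTE_TYPE are illustrative variable names.
    model = os.environ.get("WHISPER_MODEL", "large-v2")
    requested = os.environ.get("COMPUTE_TYPE",
                               "float16" if cuda_available else "int8")
    # Downgrade gracefully when float16 is requested without a GPU.
    compute_type = requested if cuda_available else "int8"
    device = "cuda" if cuda_available else "cpu"
    return {"model": model, "device": device, "compute_type": compute_type}
```

A deployment script could call this once at startup and pass the result to the WhisperX model loader.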
About whisper.api
innovatorved/whisper.api
This project provides an API with user-level access support for transcribing speech to text using a fine-tuned and processed Whisper ASR model.
Implements asynchronous transcription with built-in concurrency control and request queuing via a FastAPI-based HTTP API, supporting quantized model variants (tiny.en.q5, base.en.q5) for efficient inference. Includes ffmpeg audio processing, token-based authentication for user access management, and Docker containerization for self-hosted deployment. The architecture uses uvicorn as the ASGI server with configurable parallel job limits via environment variables.