salute-developers/GigaAM

Foundational Model for Speech Recognition Tasks

Score: 74 / 100 (Verified)

Built on the Conformer architecture with 220-240M parameters, GigaAM provides SSL-pretrained encoders (via Wav2vec 2.0 and HuBERT-CTC) fine-tuned with CTC and RNN-T decoders for Russian ASR, plus end-to-end variants that support punctuation and text normalization. The v3 variant scales pretraining to 700K hours with integrated emotion recognition, and is reported to achieve a 30% WER reduction on new domains and a 70:30 win rate against Whisper in independent evaluations. Inference is supported through a native Python API, HuggingFace transformers integration, and ONNX export for edge deployment, with optional long-form transcription via pyannote VAD.
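For readers unfamiliar with the metric, the snippet below is a minimal sketch of how word error rate (WER) is commonly computed and what a 30% reduction would mean if it is read as a relative reduction. That reading is an assumption, since the description does not say whether the figure is relative or absolute. The function names and sample numbers are illustrative and are not part of GigaAM.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Dynamic-programming Levenshtein distance over words, one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer improves on baseline_wer."""
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical numbers: dropping from 10% WER to 7% WER is a 30% relative reduction.
improvement = relative_reduction(0.10, 0.07)
```

Under this reading, a model that moves from 10% to 7% WER on a new domain has improved by 30%, even though the absolute change is only three percentage points.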

504 stars and 2,236 monthly downloads. Actively maintained with 7 commits in the last 30 days. Available on PyPI.

Maintenance 17 / 25
Adoption 18 / 25
Maturity 18 / 25
Community 21 / 25


Stars: 504
Forks: 76
Language: Python
License: MIT
Last pushed: Feb 12, 2026
Monthly downloads: 2,236
Commits (30d): 7
Dependencies: 10

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/salute-developers/GigaAM"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
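The same request can be made from Python. The sketch below uses only the standard library and mirrors the curl URL above. It assumes the endpoint returns a JSON body, which this page does not document, and the helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality endpoint URL for one repository."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record; assumes the response body is JSON."""
    request = Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Performs a live network call, so it counts against the daily request limit.
    record = fetch_quality("voice-ai", "salute-developers", "GigaAM")
    print(json.dumps(record, indent=2))
```

Because the anonymous tier is limited to 100 requests/day, caching the returned record locally is a sensible precaution when polling several repositories.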