salute-developers/GigaAM
Foundational Model for Speech Recognition Tasks
Built on the Conformer architecture with 220-240M parameters, GigaAM provides self-supervised (SSL) pretrained encoders (via Wav2vec 2.0 and HuBERT-CTC objectives) fine-tuned with CTC and RNN-T decoders for Russian ASR, plus end-to-end variants that output punctuation and normalized text. The v3 variant scales pretraining to 700K hours and integrates emotion recognition; it reports a 30% WER reduction on new domains and a 70:30 win rate against Whisper in independent evaluations. Inference is supported through native Python APIs, Hugging Face Transformers integration, and ONNX export for edge deployment, with optional long-form transcription via pyannote VAD.
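To make the WER figure above concrete, here is a minimal sketch of how word error rate and a reduction between two systems are commonly computed. It is not taken from the GigaAM codebase: the function names are hypothetical, the sample numbers are invented for illustration, and it assumes the quoted 30% is a relative reduction, which the description does not state.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer improves on baseline_wer (assumed relative)."""
    return (baseline_wer - new_wer) / baseline_wer


# Illustrative values only, not measurements from GigaAM.
print(word_error_rate("a b c d", "a x c"))  # one substitution plus one deletion over 4 words
print(relative_reduction(0.10, 0.07))       # hypothetical baseline 10% WER vs. new 7% WER
```

Under the relative-reduction assumption, a hypothetical baseline of 10% WER would drop to about 7%; under an absolute reading the arithmetic would differ, so the original benchmark report should be consulted for the exact definition.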
504 stars and 2,236 monthly downloads. Actively maintained with 7 commits in the last 30 days. Available on PyPI.
Stars: 504
Forks: 76
Language: Python
License: MIT
Category:
Last pushed: Feb 12, 2026
Monthly downloads: 2,236
Commits (30d): 7
Dependencies: 10
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/salute-developers/GigaAM"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
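The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library. The helper names are hypothetical, the split of the URL path into a category segment and an owner/name segment is inferred from the single example URL, and the response is assumed to be JSON, which the page does not confirm.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL; `repo` is an "owner/name" string.

    The category/repo path layout is an assumption based on the one
    example URL shown on the page.
    """
    return f"{BASE_URL}/{quote(category, safe='')}/{quote(repo, safe='/')}"


def fetch_quality(url: str, timeout: float = 10.0):
    """Fetch the endpoint and parse the body, assuming it is JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = build_quality_url("voice-ai", "salute-developers/GigaAM")
print(url)
```

Calling `fetch_quality(url)` performs the network request; it counts against the daily request limit, so cache the result if you poll it repeatedly.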
Related tools
SuyashMore/MevonAI-Speech-Emotion-Recognition
Identify the emotion of multiple speakers in an Audio Segment
NotAbhinavGamerz/emotion-aware-automatic-speech-recognition
🎤 Enhance speech recognition by detecting emotions in spoken language, combining OpenAI's...
jsugg/ser
The AI-powered ser Python package is a tool for recognizing and analyzing emotions in speech....
saky-semicolon/Emotion-Aware-AI-Support-System
A smart AI-powered platform that detects emotions from student voice input, classifies their...
AkishinoShiame/Chinese-Speech-Emotion-Datasets
Datasets of A Deep Convolutional Neural Network Based Virtual Elderly Companion Agent.