alphacep/vosk

VOSK Speech Recognition Toolkit

61 / 100 (Established)

Vosk trains on massive speech datasets (100k+ hours) without neural networks, using audio fingerprinting and indexing based on locality-sensitive hashing (LSH), and it improves a model incrementally by adding samples directly. The system segments audio into chunks, stores them in a hash-indexed database for fast lookup during decoding, and integrates with Kaldi for phoneme alignment and segmentation. It supports lifelong learning and includes verification tools for finding and correcting recognition gaps.
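
The chunk-and-hash idea described above can be illustrated with a minimal sketch. This is not Vosk's code or API: the names `split_into_chunks` and `HyperplaneLSH` are hypothetical, raw sample values stand in for real audio fingerprints, and random-hyperplane hashing is used only as one common LSH family, since the description does not say which scheme or features the project uses.

```python
import random
from collections import defaultdict


def split_into_chunks(samples, chunk_size):
    """Split a sample sequence into fixed-size, non-overlapping chunks.

    A trailing remainder shorter than chunk_size is dropped.
    """
    return [samples[i:i + chunk_size]
            for i in range(0, len(samples) - chunk_size + 1, chunk_size)]


class HyperplaneLSH:
    """Toy hash index: vectors on the same side of every random
    hyperplane receive the same signature and share a bucket."""

    def __init__(self, dim, n_bits=16, seed=0):
        rng = random.Random(seed)
        # One random Gaussian hyperplane per signature bit.
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                       for _ in range(n_bits)]
        self.buckets = defaultdict(list)

    def signature(self, vec):
        """Pack the sign of each hyperplane projection into an integer."""
        bits = 0
        for plane in self.planes:
            dot = sum(p * v for p, v in zip(plane, vec))
            bits = (bits << 1) | (1 if dot >= 0.0 else 0)
        return bits

    def add(self, vec, label):
        """Store a label in the bucket selected by the vector's signature."""
        self.buckets[self.signature(vec)].append(label)

    def lookup(self, vec):
        """Return all labels stored in the query vector's bucket."""
        return list(self.buckets.get(self.signature(vec), []))


# Index two chunks, then look the first one up again.
chunks = split_into_chunks([0.1, 0.9, -0.3, 0.5, -0.7, 0.2, 0.4, -0.1], 4)
index = HyperplaneLSH(dim=4)
for i, chunk in enumerate(chunks):
    index.add(chunk, f"chunk-{i}")
print(index.lookup(chunks[0]))
```

Adding a sample is a single bucket append, which is the property that makes incremental improvement cheap in this kind of design. A lookup returns every label in the matching bucket, so a real system would still need to rank or verify the candidates.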

493 stars and 335,415 monthly downloads. Used by 6 other packages. No commits in the last 6 months. Available on PyPI.

Stale 6m
Maintenance 0 / 25
Adoption 25 / 25
Maturity 18 / 25
Community 18 / 25


Stars 493
Forks 56
Language C
License Apache-2.0
Last pushed Jul 13, 2022
Monthly downloads 335,415
Commits (30d) 0
Dependencies 5
Reverse dependents 6

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/alphacep/vosk"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
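
The same request can be made from Python with the standard library. This is a sketch under assumptions: the helper names `quality_url` and `fetch_quality` are hypothetical, the category/owner/repo path layout is inferred from the single URL shown above, and the body is returned as raw text because the response schema is not documented here.

```python
import urllib.request

# Base path taken from the curl example; the segment layout is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL for one repository."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the endpoint and return the raw response body as text."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `fetch_quality("voice-ai", "alphacep", "vosk")` issues the same GET request as the curl command. Keyless use is limited to 100 requests/day, so callers that poll should cache the result.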