k2-fsa/sherpa-ncnn

Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn, without an Internet connection. Supports iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, LicheePi4A, etc.

/ 100

Established

Built on ncnn for lightweight neural network inference, sherpa-ncnn runs streaming ASR, text-to-speech synthesis, and VAD entirely on-device, with no dependency on external frameworks such as PyTorch. It provides bindings for eight languages and targets (C++, Python, JavaScript, Go, Swift, Kotlin, C#, WebAssembly) and can be compiled statically to keep system requirements minimal, which makes it suitable for embedded and resource-constrained environments, from Raspberry Pi boards to RISC-V platforms.

1,648 stars and 9,383 monthly downloads. Available on PyPI.

Maintenance 6 / 25

Adoption 19 / 25

Maturity 18 / 25

Community 21 / 25


Stars: 1,648

Forks: 210

Language: C++

License: Apache-2.0

Related tools

FluidInference/FluidAudio

Frontier CoreML audio models in your apps — text-to-speech, speech-to-text, voice activity...

phuc-nt/my-translator

Real-time speech translation — macOS & Windows, free TTS, no server, your API keys only

pot-app/pot-desktop

🌈 A cross-platform software for select-to-translate text translation and OCR.

Blaizzy/mlx-audio-swift

A modular Swift SDK for audio processing with MLX on Apple Silicon

soniqo/speech-swift

AI speech toolkit for Apple Silicon — ASR, TTS, speech-to-speech, VAD, and diarization powered...
