kulsoom-abdullah/Qwen2-VL-Audio-Adapter

Architecture grafting: extending Qwen2-VL with a Whisper encoder for speech recognition (WER 3.6%, 18 GPU-hours).
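For readers unfamiliar with the metric quoted above, word error rate (WER) is the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The sketch below is a minimal illustration of that standard definition. The function name `word_error_rate` is hypothetical, and it assumes plain whitespace tokenization with no text normalization, which may differ from how this repository computed its 3.6% figure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a WER of 3.6% corresponds to roughly 36 word-level errors per 1,000 reference words.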

/ 100

Experimental

No License · No Package · No Dependents

Maintenance 10 / 25

Adoption 0 / 25

Maturity 1 / 25

Community 0 / 25

Stars

—

Forks

—

Language

Jupyter Notebook

License

—

Category

Last pushed

Mar 05, 2026

Commits (30d)

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/kulsoom-abdullah/Qwen2-VL-Audio-Adapter"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
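The same request can be made from Python using only the standard library. This is a sketch under assumptions: the helper names `quality_url` and `fetch_quality` are hypothetical, the path layout (category, then owner, then repository) is inferred from the single URL shown above, and the response is assumed to be JSON, since its schema is not documented here.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the segment order is inferred from the curl example."""
    parts = (quote(part, safe="") for part in (category, owner, repo))
    return f"{BASE_URL}/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the endpoint returns JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Unauthenticated access is limited to 100 requests/day, so call sparingly.
    data = fetch_quality("transformers", "kulsoom-abdullah", "Qwen2-VL-Audio-Adapter")
    print(json.dumps(data, indent=2))
```

Keeping URL construction separate from the network call makes the path logic easy to check without spending any of the daily request quota.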

Higher-rated alternatives

biodatlab/thonburian-whisper

Thonburian Whisper: Open models for fine-tuned Whisper in Thai.

Arkapravo-Ghosh/speech-to-text

Speech to Text Transcription using OpenAI Whisper v3 and FastAPI

haiodo/oaitt

An OpenAI compatible transcriber using transformers and whisperx.

purvanshjoshi/IndiVoice-DeepASR

Deep Learning framework for Indian-accented Speech-to-Text using Whisper and LoRA. Includes...

boned-fruitwood759/whisperx-asr-with-fastapi

🎤 Enable real-time speech recognition with WhisperX using FastAPI for efficient, scalable audio...