vllm-project/semantic-router
System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge
Implements signal-driven routing using semantic classification models to intelligently dispatch requests across heterogeneous model ensembles, capturing missing signals from requests, responses, and context. Built on a modular architecture with extensible LoRA-based classifiers and real-time hallucination detection, integrating with vLLM's production stack for optimized inference across cloud, datacenter, and edge deployments.
3,373 stars. Actively maintained with 159 commits in the last 30 days.
Stars
3,373
Forks
568
Language
Go
License
Apache-2.0
Category
Last pushed
Mar 13, 2026
Commits (30d)
159
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/vllm-project/semantic-router"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.