openvinotoolkit/model_server

A scalable inference server for models optimized with OpenVINO™

Score: 67/100 (Established)

A C++ implementation optimized for Intel hardware, exposing models over gRPC or REST with OpenAI-compatible APIs for text generation, embeddings, image generation, and speech processing. Supports model composition through directed acyclic graph (DAG) pipelines with custom nodes, dynamic batching, and multi-framework model loading (TensorFlow, ONNX, PaddlePaddle). Implements the KServe and TensorFlow Serving protocols, loads models from local storage, object storage, or Hugging Face, and deploys on Docker, bare metal, Kubernetes, and Windows.
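As a sketch of what the OpenAI-compatible REST interface mentioned above accepts, the snippet below builds a chat-completion request in the OpenAI wire format. The host, port, and model name are illustrative assumptions, not values from this page; the `/v3/chat/completions` path follows the server's documented OpenAI-compatible API, but check your deployment's docs before relying on it.

```python
import json

def build_chat_request(base_url: str, model: str, prompt: str) -> tuple[str, dict]:
    """Return the endpoint URL and JSON payload for an OpenAI-style
    chat completion call to a model server's REST interface.

    base_url and model are assumptions for illustration.
    """
    url = f"{base_url}/v3/chat/completions"  # OpenAI-compatible endpoint path
    payload = {
        "model": model,  # name the model was served under
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
        "stream": False,  # request a single JSON response, not SSE chunks
    }
    return url, payload

if __name__ == "__main__":
    # Hypothetical local deployment on port 8000 serving a model named "llama".
    url, payload = build_chat_request("http://localhost:8000", "llama", "Hello!")
    print(url)
    print(json.dumps(payload, indent=2))
```

The same payload can be sent with `curl -d @payload.json` or any OpenAI client library pointed at the server's base URL.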

836 stars. Actively maintained with 35 commits in the last 30 days.

Package: none · Dependents: none
Maintenance: 23/25
Adoption: 10/25
Maturity: 9/25
Community: 25/25


Stars: 836
Forks: 241
Language: C++
License: Apache-2.0
Last pushed: Mar 13, 2026
Commits (30d): 35

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/openvinotoolkit/model_server"

Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000/day.