tensorflow/serving
A flexible, high-performance serving system for machine learning models
Supports multi-model and multi-version serving with zero-downtime model updates, canary deployments, and A/B testing. Exposes gRPC and REST APIs, and includes a request batching scheduler that groups inference calls for efficient GPU execution under configurable latency bounds. Natively integrates TensorFlow SavedModels, and extends to non-TensorFlow models, embeddings, and feature transformations through a modular architecture.
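As a minimal sketch of the REST API mentioned above: TensorFlow Serving's predict endpoint follows the pattern `/v1/models/<name>:predict` and accepts a JSON body with an `instances` list. The host, port, model name, and input shape below are assumptions for illustration, not values from this page.

```python
import json
import urllib.request

# Hypothetical server address and model name (assumptions);
# 8501 is TensorFlow Serving's conventional REST port.
host = "localhost:8501"
model = "my_model"
url = f"http://{host}/v1/models/{model}:predict"

# The REST predict API takes a JSON object with an "instances" list,
# one entry per input example. The feature vector here is made up.
payload = json.dumps({"instances": [[1.0, 2.0, 5.0]]}).encode("utf-8")

request = urllib.request.Request(
    url,
    data=payload,
    headers={"Content-Type": "application/json"},
)

# Uncomment once a server is actually running with a loaded model:
# with urllib.request.urlopen(request) as response:
#     predictions = json.loads(response.read())["predictions"]
```

The batching scheduler operates server-side and is transparent to callers: individual requests like this one are grouped into batches within the configured latency bound before hitting the GPU.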
Stars: 6,349
Forks: 2,200
Language: C++
License: Apache-2.0
Last pushed: Dec 18, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/tensorflow/serving"
Open to everyone: 100 requests/day with no key required; a free key raises the limit to 1,000/day.
Related frameworks
modelscope/modelscope
ModelScope: bring the notion of Model-as-a-Service to life.
basetenlabs/truss
The simplest way to serve AI/ML models in production
Lightning-AI/LitServe
A minimal Python framework for building custom AI inference servers with full control over...
labmlai/labml
🔎 Monitor deep learning model training and hardware usage from your mobile phone 📱
deepjavalibrary/djl-serving
A universal scalable machine learning model deployment solution