sql-machine-learning/elasticdl

Kubernetes-native Deep Learning Framework

48
/ 100
Emerging

Leverages Kubernetes API calls to dynamically manage worker and parameter server pods, enabling fault-tolerant training where job failures don't interrupt execution—workers can be preempted or killed without requiring checkpoint recovery. Supports TensorFlow (Estimator and Keras) and PyTorch with a minimal CLI interface that automatically distributes training across clusters while integrating with Kubernetes priority-based preemption for elastic resource scheduling.

746 stars. No commits in the last 6 months.

Stale 6m No Package No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 22 / 25

How are scores calculated?

Stars

746

Forks

116

Language

Python

License

MIT

Last pushed

Jan 26, 2024

Commits (30d)

0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/sql-machine-learning/elasticdl"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.