microsoft/batch-inference
Dynamic batching library for Deep Learning inference. Tutorials for LLM, GPT scenarios.
106 stars. No commits in the last 6 months. Available on PyPI.
Stars: 106
Forks: 5
Language: Python
License: MIT
Last pushed: Aug 14, 2024
Commits (30d): 0
Dependencies: 6
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/microsoft/batch-inference"
Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000/day.
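The same endpoint can be called from Python instead of curl. The sketch below is a minimal example, assuming only the URL layout shown in the curl command above; the shape of the JSON response is not documented here, so it is returned as a plain dict.

```python
import json
import urllib.request

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    # Build the per-repository endpoint URL.
    return f"{API_BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    # Anonymous access is limited to 100 requests/day per the listing;
    # the response schema is an assumption (parsed JSON, returned as-is).
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

print(quality_url("microsoft", "batch-inference"))
```

Calling `fetch_quality("microsoft", "batch-inference")` performs the same request as the curl line above.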
Higher-rated alternatives
Blaizzy/mlx-vlm
MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac...
b4rtaz/distributed-llama
Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM...
armbues/SiLLM
SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple...
kolinko/effort
An implementation of bucketMul LLM inference
mrdbourke/mac-ml-speed-test
A few quick scripts focused on testing TensorFlow/PyTorch/Llama 2 on macOS.