microsoft/batch-inference
Dynamic batching library for Deep Learning inference. Tutorials for LLM, GPT scenarios.
106 stars. No commits in the last 6 months. Available on PyPI.
Stars: 106
Forks: 5
Language: Python
License: MIT
Last pushed: Aug 14, 2024
Commits (30d): 0
Dependencies: 6
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/microsoft/batch-inference"
Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000/day.
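The same endpoint can be called from Python instead of curl. The sketch below is a minimal example, assuming only the URL layout shown in the curl command above; the shape of the JSON response is not documented here, so it is returned as a plain dict.

```python
import json
import urllib.request

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    # Build the per-repository endpoint URL.
    return f"{API_BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    # Anonymous access is limited to 100 requests/day per the listing;
    # the response schema is an assumption (parsed JSON, returned as-is).
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

print(quality_url("microsoft", "batch-inference"))
```

Calling `fetch_quality("microsoft", "batch-inference")` performs the same request as the curl line above.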
Higher-rated alternatives
Blaizzy/mlx-vlm
MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac...
b4rtaz/distributed-llama
Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM...
armbues/SiLLM
SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple...
kolinko/effort
An implementation of bucketMul LLM inference
mrdbourke/mac-ml-speed-test
A few quick scripts focused on testing TensorFlow/PyTorch/Llama 2 on macOS.