kolinko/effort
An implementation of bucketMul LLM inference
227 stars. No commits in the last 6 months.
Stars: 227
Forks: 16
Language: Swift
License: MIT
Category:
Last pushed: Jul 01, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/kolinko/effort"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000 requests/day.
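The same request can be made from a script. The sketch below uses only the Python standard library and makes several assumptions that this page does not confirm: that the URL path follows a category/owner/repo pattern (inferred from the single curl example above), that `transformers` is the right category segment for other repositories too, and that the response is UTF-8 text. The function names `build_quality_url` and `fetch_quality` are hypothetical helpers, not part of any published client.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner, repo, category="transformers"):
    """Build the endpoint URL for one repository.

    Assumption: the path is <category>/<owner>/<repo>, inferred from the
    one curl example shown here; other categories are not documented.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(owner, repo, timeout=10.0):
    """Return the raw response body as text.

    Assumption: the body is UTF-8 text. Its schema is not described on
    this page, so it is returned unparsed.
    """
    with urlopen(build_quality_url(owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_quality("kolinko", "effort")` issues the same GET request as the curl command and counts against the 100 requests/day unauthenticated limit.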
Higher-rated alternatives
Blaizzy/mlx-vlm
MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac...
b4rtaz/distributed-llama
Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM...
armbues/SiLLM
SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple...
microsoft/batch-inference
Dynamic batching library for Deep Learning inference. Tutorials for LLM, GPT scenarios.
mrdbourke/mac-ml-speed-test
A few quick scripts focused on testing TensorFlow/PyTorch/Llama 2 on macOS.