akivasolutions/tightwad
Pool your CUDA + ROCm GPUs into one OpenAI-compatible API. The speculative decoding proxy gives you 2-3x faster inference — for free, using hardware you already own. Stop renting GPU clouds. Be a tightwad.
Available on PyPI.
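Because tightwad exposes an OpenAI-compatible API, any standard OpenAI client should be able to talk to the pooled GPUs. A minimal sketch, assuming the proxy runs locally on port 8000 and serves a model named "llama-3.1-8b" (the endpoint address and model name are illustrative assumptions, not documented values):

# Minimal sketch: chat completion against an assumed local tightwad endpoint.
# base_url, port, and model name are assumptions for illustration only.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed tightwad proxy address
    api_key="unused",                     # local proxies often ignore the key
)

response = client.chat.completions.create(
    model="llama-3.1-8b",  # hypothetical model name
    messages=[{"role": "user", "content": "Summarize speculative decoding in one sentence."}],
)
print(response.choices[0].message.content)

For context, speculative decoding pairs a small draft model with the large target model: the draft proposes several tokens cheaply and the target verifies them in one batched pass, which is where the claimed 2-3x speedup comes from.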
Stars: 4
Forks: 1
Language: Python
License: MIT
Category:
Last pushed: Mar 28, 2026
Monthly downloads: 330
Commits (30d): 0
Dependencies: 6
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/akivasolutions/tightwad"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
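The same stats can be fetched from Python; a minimal sketch using requests (the payload schema is not documented here, so the example just prints whatever JSON comes back):

# Minimal sketch: fetching repo quality data from the public endpoint.
# No key needed under the 100 requests/day free tier described above.
import requests

url = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/akivasolutions/tightwad"
resp = requests.get(url, timeout=10)
resp.raise_for_status()
print(resp.json())  # field names are undocumented here, so dump the full payload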
Related tools
jundot/omlx
LLM inference server with continuous batching & SSD caching for Apple Silicon — managed from the...
jordanhubbard/nanolang
A tiny experimental language designed to be targeted by coding LLMs
waybarrios/vllm-mlx
OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models...
josStorer/RWKV-Runner
An RWKV management and startup tool, fully automated, only 8MB. Also provides an interface...
petrukha-ivan/mlx-swift-structured
Structured output generation in Swift