JoelHJames1/Nexus-Inference-Engine-

NEXUS: Production C++ inference engine for Apple Silicon. Run 400B+ LLMs on your Mac via layer streaming, Metal GPU compute, TurboQuant KV compression, NXF format, MoE routing, and Neural Engine speculative decoding. Faster than AirLLM, more capable than llama.cpp.

29 / 100
Experimental
No License, No Package, No Dependents
Maintenance 13 / 25
Adoption 3 / 25
Maturity 1 / 25
Community 12 / 25


Stars: 3
Forks: 1
Language: C++
License: None
Last pushed: Apr 09, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-inference/JoelHJames1/Nexus-Inference-Engine-"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
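
The same request can be made from a script. The following is a minimal Python sketch, not an official client. It assumes the URL path follows the pattern category/owner/repository seen in the curl command above, and that the endpoint returns JSON; the response fields are not documented here, so the code returns the parsed body without interpreting it. The function names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example; everything after it is an
# assumed pattern: <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record without an API key (100 requests/day limit).

    Assumes a JSON response body; the schema is unknown, so the parsed
    object is returned as-is.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("llm-inference", "JoelHJames1", "Nexus-Inference-Engine-")` issues the same GET request as the curl command. Each call counts against the daily request limit, so caching the result locally is advisable when polling.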