JoelHJames1/Nexus-Inference-Engine-

NEXUS: Production C++ inference engine for Apple Silicon. Run 400B+ LLMs on your Mac via layer streaming, Metal GPU compute, TurboQuant KV compression, NXF format, MoE routing, and Neural Engine speculative decoding. Faster than AirLLM, more capable than llama.cpp.

/ 100

Experimental

No License, No Package, No Dependents

Maintenance 13 / 25

Adoption 3 / 25

Maturity 1 / 25

Community 12 / 25

Language: C++

License: —

Last pushed: Apr 09, 2026

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-inference/JoelHJames1/Nexus-Inference-Engine-"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
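
The same request can be made from a script. The sketch below is a minimal Python equivalent of the curl call above. It assumes the endpoint path follows the pattern category/owner/repository visible in that URL and that the response body is JSON; neither is confirmed here. The names `quality_url` and `fetch_quality` are hypothetical helpers, not part of any published client.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each segment separately so a stray "/" or space
    # in a name cannot change the shape of the path.
    segments = [quote(part, safe="") for part in (category, owner, repo)]
    return API_BASE + "/" + "/".join(segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response.

    Assumption: the body is JSON. Its field names are not documented
    here, so the decoded object is returned without further parsing.
    """
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Each call counts against the keyless limit of 100 requests/day.
    print(fetch_quality("llm-inference", "JoelHJames1", "Nexus-Inference-Engine-"))
```

Building the URL in its own function keeps the network call out of anything that only needs the address, which helps when staying under the daily request limit.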

Higher-rated alternatives

nordwestt/ollama-ai-provider-v2

Vercel AI Provider for running LLMs locally using Ollama

OEvortex/Webscout

Webscout is the all-in-one search and AI toolkit you need. Discover insights with Yep.com,...

superagentxai/superagentx

Move from idea to production in hours with policy-driven autonomous AI agents. Unified Control...

ArvinLovegood/go-stock

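🦄🦄🦄 AI-empowered stock analysis: an AI-powered stock analysis and stock-picking tool. Stock quote retrieval, AI analysis of trending news, AI capital-flow and financial analysis, and price rise/fall alerts. Supports A-shares, Hong Kong stocks, and US stocks. Supports overall-market and individual-stock sentiment analysis, AI-assisted stock picking, and more. All data...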

kubernetes-sigs/lws

LeaderWorkerSet: An API for deploying a group of pods as a unit of replication
