madroidmaq/mlx-omni-server
MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. It implements OpenAI-compatible API endpoints, enabling seamless integration with existing OpenAI SDK clients while leveraging the power of local ML inference.
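Because the server exposes OpenAI-compatible endpoints, a client can talk to it with a plain HTTP POST. The sketch below builds a chat-completion request using only the Python standard library; the port (10240), base URL, and model name are assumptions for illustration — check the project's README for the actual defaults and the models available on your machine.

```python
import json
import urllib.request

# Assumed local endpoint -- the port is an assumption, not confirmed
# from the project's documentation.
BASE_URL = "http://localhost:10240/v1"

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for the local server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical model identifier for illustration.
req = build_chat_request("mlx-community/Llama-3.2-1B-Instruct-4bit", "Hello!")

# To actually send it (requires the server to be running locally):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Existing OpenAI SDK clients can be pointed at the same server by overriding their base URL, which is the integration path the project advertises.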
678 stars and 2,273 monthly downloads. Actively maintained with 17 commits in the last 30 days. Available on PyPI.
Stars: 678
Forks: 84
Language: Python
License: MIT
Category: generative-ai
Last pushed: Mar 10, 2026
Monthly downloads: 2,273
Commits (30d): 17
Dependencies: 15
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/madroidmaq/mlx-omni-server"
Open to everyone — 100 requests/day with no key needed, or get a free key for 1,000 requests/day.
Related tools
openvinotoolkit/model_server
A scalable inference server for models optimized with OpenVINO™
NVIDIA-NeMo/Guardrails
NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based...
rhesis-ai/rhesis
Open-source platform & SDK for testing LLM and agentic apps. Define expected behavior, generate...
taco-group/OpenEMMA
OpenEMMA, a permissively licensed open-source "reproduction" of Waymo's EMMA model.
generative-computing/mellea
Mellea is a library for writing generative programs.