mlx-vlm and vllm-mlx
MLX-VLM is the core inference and fine-tuning library for vision-language models on Apple Silicon, while vllm-mlx wraps that functionality (or similar MLX-based inference) in an OpenAI-compatible server interface. The two are complementary and can be used together in a single stack.
About mlx-vlm
Blaizzy/mlx-vlm
MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
This project describes, or answers questions about, images, audio, and video. You provide a visual, audio, or multimodal input together with a question or prompt, and the tool generates a textual response. It's designed for anyone working with multimedia content on a Mac who needs to extract information or generate descriptions.
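A minimal sketch of that flow in Python, following the load/generate pattern from the project's README; the model name is one example from the mlx-community collection, and exact function signatures may differ across mlx-vlm versions:

```python
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

# Load a quantized VLM from the mlx-community Hugging Face collection
model_path = "mlx-community/Qwen2-VL-2B-Instruct-4bit"
model, processor = load(model_path)
config = load_config(model_path)

# One image plus a text prompt
images = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
prompt = "Describe this image."

# Format the prompt with the model's chat template
formatted_prompt = apply_chat_template(
    processor, config, prompt, num_images=len(images)
)

# Generate a textual response about the image
output = generate(model, processor, formatted_prompt, images, verbose=False)
print(output)
```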
About vllm-mlx
waybarrios/vllm-mlx
OpenAI- and Anthropic-compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend, 400+ tok/s. Works with Claude Code.
This project lets developers and engineers run large language models and vision-language models on Apple Silicon Macs at high throughput. It accepts text, images, video, or audio, processes them with different models, and returns generated text, image descriptions, audio transcriptions, or embeddings. It's designed for anyone building or experimenting with AI applications who needs to deploy models locally on Apple hardware.
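Because the server speaks the OpenAI API, any standard OpenAI client can talk to it. A sketch assuming the server is already running locally on port 8000; the port and the model id below are placeholders, so check the project's README for the actual launch command and model names:

```python
from openai import OpenAI

# Point the official OpenAI client at the local vllm-mlx server;
# the API key is unused locally, but the client requires a value.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

# Standard OpenAI multimodal chat request: text plus an image URL
response = client.chat.completions.create(
    model="mlx-community/Qwen2-VL-2B-Instruct-4bit",  # placeholder model id
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {
                "type": "image_url",
                "image_url": {
                    "url": "http://images.cocodataset.org/val2017/000000039769.jpg"
                },
            },
        ],
    }],
)
print(response.choices[0].message.content)
```

The same client works for text-only requests by passing a plain string as the message content, which is what makes the server a drop-in local backend for existing OpenAI-based tooling.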