mlx-vlm and vllm-mlx

MLX-VLM provides the core inference and fine-tuning library for vision language models on Apple Silicon, while vllm-mlx wraps that functionality (or similar MLX-based inference) in an OpenAI-compatible server interface. The two are complementary and can be used together in a stack.

| Metric          | mlx-vlm       | vllm-mlx                            |
|-----------------|---------------|-------------------------------------|
| Overall score   | 81 (Verified) | 58 (Established)                    |
| Maintenance     | 20/25         | 22/25                               |
| Adoption        | 15/25         | 10/25                               |
| Maturity        | 25/25         | 5/25                                |
| Community       | 21/25         | 21/25                               |
| Stars           | 2,287         | 579                                 |
| Forks           | 293           | 87                                  |
| Downloads       |               |                                     |
| Commits (30d)   | 41            | 113                                 |
| Language        | Python        | Python                              |
| License         | MIT           |                                     |
| Risk flags      | None          | No License, No Package, No Dependents |

About mlx-vlm

Blaizzy/mlx-vlm

MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.

This project helps you understand images, audio, and video content by describing or answering questions about them. You provide a visual, audio, or multi-modal input and a question or prompt, and the tool generates a textual response. It's designed for anyone working with multimedia content on a Mac who needs to extract information or generate descriptions.
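As a rough sketch of what running it looks like (the model id, flags, and output below are assumptions based on mlx-vlm's documented command-line entry point, and may differ across versions; it requires an Apple Silicon Mac):

```shell
# Install the package (assumes Python with pip on Apple Silicon).
pip install mlx-vlm

# Hypothetical example: describe a local image with a quantized VLM.
# The model id and flag names are illustrative, not guaranteed.
python -m mlx_vlm.generate \
  --model mlx-community/Qwen2-VL-2B-Instruct-4bit \
  --max-tokens 100 \
  --prompt "Describe this image." \
  --image path/to/photo.jpg
```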

Tags: multimedia-analysis, content-understanding, image-description, audio-analysis, document-processing

About vllm-mlx

waybarrios/vllm-mlx

OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend, 400+ tok/s. Works with Claude Code.

This project lets developers and engineers run large language models and vision-language models on Apple Silicon Macs at high speed. It accepts text, images, video, or audio, processes them with MLX-backed models, and returns generated text, image descriptions, audio transcriptions, or embeddings. It's designed for anyone building or experimenting with AI applications who needs to serve models locally on Apple hardware.
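Because the server speaks the OpenAI chat-completions protocol, a client talks to it with a standard request body. The sketch below builds such a multimodal request in plain Python; the port, endpoint path, and model id are assumptions for illustration, and sending the request is left commented out since it needs a running server:

```python
import json

# Assumed local endpoint for a vllm-mlx server; adjust host/port to your setup.
URL = "http://localhost:8000/v1/chat/completions"

# OpenAI-style chat request mixing text and an image_url content part.
payload = {
    "model": "mlx-community/Qwen2-VL-2B-Instruct-4bit",  # illustrative model id
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
    "max_tokens": 128,
}

body = json.dumps(payload)

# To actually send it (requires the server to be running):
# import urllib.request
# req = urllib.request.Request(URL, data=body.encode(),
#                              headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read().decode())
```

The same request shape works with any OpenAI-compatible client library by pointing its base URL at the local server.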

Tags: AI-development, machine-learning-engineering, LLM-deployment, multimodal-AI, Apple-Silicon-optimization

Scores updated daily from GitHub, PyPI, and npm data. How scores work