mcp-image-extractor and luma-mcp
These two MCP servers for image analysis are direct competitors, offering similar image-analysis capabilities for LLMs but with different underlying vision models and feature sets.
About mcp-image-extractor
ifmelate/mcp-image-extractor
MCP server that allows an LLM in agent mode to analyze images whenever it needs to
Implements three extraction tools—from local files, URLs, and base64 data—with automatic image resizing to 512x512 pixels to optimize LLM context usage. Built as an MCP server that integrates with Cursor and Claude via stdio transport, enabling AI agents to dynamically fetch and analyze images during multi-step tasks like test result review. Handles playwright screenshots and other image-based workflows where agents need on-demand visual context without pre-loading.
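The 512x512 resizing step described above can be sketched as dimension math: scale the image down so neither side exceeds the cap while preserving aspect ratio, then base64-encode the result for the tool response. This is a hypothetical illustration of the behavior the README describes, not the server's actual code; the function names `fit_within` and `to_base64_payload` are assumptions.

```python
import base64

MAX_SIDE = 512  # context-saving cap described in the project README (assumption: applied per side)

def fit_within(width: int, height: int, max_side: int = MAX_SIDE) -> tuple[int, int]:
    """Compute dimensions so the longer side is at most max_side, aspect ratio preserved."""
    scale = min(1.0, max_side / max(width, height))  # never upscale small images
    return max(1, round(width * scale)), max(1, round(height * scale))

def to_base64_payload(image_bytes: bytes) -> str:
    """Encode (already-resized) image bytes as base64 text, as an MCP tool result might."""
    return base64.b64encode(image_bytes).decode("ascii")
```

For example, a 1024x768 Playwright screenshot would come back as 512x384, roughly quartering the pixel data the agent has to carry in context.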
About luma-mcp
JochenYang/luma-mcp
Multi-model visual understanding MCP server supporting GLM-4.6V, DeepSeek-OCR (free), and Qwen3-VL-Flash. Provides visual processing capabilities for AI coding models that do not support image understanding.
Implements an MCP server with pluggable vision model backends (Zhipu, SiliconFlow, Qwen, Volcengine, Hunyuan), automatically handling image preprocessing including compression, multi-crop tiling for dense text scenes, and format normalization across local files, URLs, and Data URIs. Exposes a single `image_understand` tool that integrates with Claude Desktop, Cline, and Claude Code, with configurable thinking mode and adaptive cropping strategies optimized for code/UI/OCR screenshots.
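The format normalization across local files, URLs, and Data URIs can be sketched as a single dispatch function that reduces every input form to raw bytes before preprocessing. This is a minimal sketch under assumptions, not luma-mcp's actual implementation; the name `load_image_bytes` and the dispatch order are hypothetical, and a real server would likely use an async HTTP client rather than `urlopen`.

```python
import base64
import re
from pathlib import Path
from urllib.request import urlopen

# Matches "data:<mime>;base64,<payload>" Data URIs
DATA_URI_RE = re.compile(r"^data:(?P<mime>[\w/+.-]+);base64,(?P<payload>.+)$", re.S)

def load_image_bytes(source: str) -> bytes:
    """Normalize the three accepted input forms into raw image bytes."""
    m = DATA_URI_RE.match(source)
    if m:  # Data URI: decode the embedded base64 payload
        return base64.b64decode(m.group("payload"))
    if source.startswith(("http://", "https://")):  # remote URL
        with urlopen(source) as resp:
            return resp.read()
    return Path(source).read_bytes()  # otherwise treat as a local file path
```

With the bytes normalized, the compression and multi-crop tiling steps can run identically regardless of where the image came from, which is presumably why the server exposes only the single `image_understand` tool.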