JochenYang/luma-mcp

A multi-model visual understanding MCP server supporting GLM-4.6V, DeepSeek-OCR (free), Qwen3-VL-Flash, and others. It provides visual processing capabilities for AI coding models that do not support image understanding.

Score: 50 / 100 (Established)

Implements an MCP server with pluggable vision model backends (Zhipu, SiliconFlow, Qwen, Volcengine, Hunyuan), automatically handling image preprocessing including compression, multi-crop tiling for dense text scenes, and format normalization across local files, URLs, and Data URIs. Exposes a single `image_understand` tool that integrates with Claude Desktop, Cline, and Claude Code, with configurable thinking mode and adaptive cropping strategies optimized for code/UI/OCR screenshots.
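To make the "multi-crop tiling for dense text scenes" idea concrete, here is an illustrative TypeScript sketch (not the server's actual code) that splits a large screenshot into overlapping tiles no larger than a maximum size, so a vision model can read dense text at full resolution. The function name, tile size, and overlap are assumptions chosen for the example:

```typescript
// Axis-aligned crop rectangle within the source image.
interface Tile {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Illustrative multi-crop tiling: cover a width x height image with tiles
// of at most tileSize pixels per side, overlapping by `overlap` pixels so
// text on tile boundaries is not cut in half.
function tileImage(
  width: number,
  height: number,
  tileSize = 1024,
  overlap = 64,
): Tile[] {
  const tiles: Tile[] = [];
  const step = tileSize - overlap;
  for (let y = 0; y < height; y += step) {
    for (let x = 0; x < width; x += step) {
      tiles.push({
        x,
        y,
        width: Math.min(tileSize, width - x),
        height: Math.min(tileSize, height - y),
      });
      if (x + tileSize >= width) break; // last column reached
    }
    if (y + tileSize >= height) break; // last row reached
  }
  return tiles;
}
```

A 1920x1080 screenshot yields a small grid of overlapping crops, each of which can be sent to the vision backend separately and the OCR results merged.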

Available on npm.
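A minimal Claude Desktop configuration sketch for running the server from npm. The package name `luma-mcp` and the environment-variable name are assumptions for illustration; check the repository README for the actual values:

```json
{
  "mcpServers": {
    "luma-mcp": {
      "command": "npx",
      "args": ["-y", "luma-mcp"],
      "env": {
        "ZHIPU_API_KEY": "your-api-key"
      }
    }
  }
}
```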

Maintenance: 10 / 25
Adoption: 8 / 25
Maturity: 22 / 25
Community: 10 / 25


Stars: 48
Forks: 5
Language: TypeScript
License: MIT
Last pushed: Mar 06, 2026
Commits (30d): 0
Dependencies: 4

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/mcp/JochenYang/luma-mcp"

Open to everyone: 100 requests/day with no key needed, or get a free key for 1,000 requests/day.