modelscope/MCPBench
The evaluation benchmark on MCP servers
Evaluates MCP servers on Web Search, Database Query, and GAIA tasks, measuring task completion accuracy, latency, and token consumption under a consistent LLM/Agent configuration. Supports both local stdio-based servers (launched via npx) and remote servers connected over SSE, and detects each server's tools automatically, so no manual configuration is needed. Includes curated datasets (600 WebSearch QA pairs, database query benchmarks) and standardized evaluation scripts for comparing implementations such as Brave Search and DuckDuckGo.
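To make the three measurements concrete, here is a minimal sketch of how per-task results could be rolled up into accuracy, latency, and token figures for each server. The `TaskResult` record, the `summarize` helper, and the sample numbers are all hypothetical illustrations; they are not MCPBench's actual data schema or scripts.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskResult:
    """One benchmark task outcome (hypothetical record, not MCPBench's schema)."""
    correct: bool      # did the agent's answer match the reference answer?
    latency_s: float   # wall-clock seconds spent on the task
    tokens: int        # total LLM tokens consumed for the task


def summarize(results):
    """Aggregate per-task results into the three headline metrics."""
    if not results:
        raise ValueError("no results to summarize")
    return {
        "accuracy": sum(r.correct for r in results) / len(results),
        "mean_latency_s": mean(r.latency_s for r in results),
        "mean_tokens": mean(r.tokens for r in results),
    }


# Made-up sample data for two servers run on the same tasks.
runs = {
    "server_a": [TaskResult(True, 1.0, 100), TaskResult(False, 3.0, 300)],
    "server_b": [TaskResult(True, 2.0, 150), TaskResult(True, 4.0, 250)],
}

for name, results in runs.items():
    print(name, summarize(results))
```

Holding the LLM/Agent configuration fixed across servers is what makes such summaries comparable: differences in the numbers can then be attributed to the server rather than to the model.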
241 stars. No commits in the last 6 months.
Stars: 241
Forks: 15
Language: Python
License: Apache-2.0
Category:
Last pushed: Sep 03, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/mcp/modelscope/MCPBench"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
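The same request can be sketched in Python using only the standard library. This is a hedged example: it assumes the endpoint returns a JSON body and that other repositories follow the same `/owner/repo` path pattern, neither of which is stated on this page. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; the owner/repo suffix pattern
# for other repositories is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/mcp"


def quality_url(owner, repo):
    """Build the request URL for one repository."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner, repo, timeout=10.0):
    """Fetch and decode the response (assumes the body is JSON)."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("modelscope", "MCPBench")` issues the keyless request shown in the curl command, which counts against the 100 requests/day limit.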
Higher-rated alternatives
mitulgarg/env-doctor
Debug your GPU, CUDA, and AI stacks across local, Docker, and CI/CD (CLI and MCP server)
SonarSource/sonarqube-mcp-server
SonarQube MCP Server
benzsevern/goldenmatch
Entity resolution toolkit — deduplicate, match, and create golden records. 27 MCP tools on...
cqfn/aibolit-mcp-server
MCP Server for Aibolit Java Static Analyzer: Helping Your AI Agent Identify Hotspots for Refactoring
kevinlin/spec-coding-mcp
An MCP server that brings AI spec-driven development workflow to any AI-powered IDE besides Kiro