mcp-tool-bench/MCPToolBenchPP
MCPToolBench++: a Model Context Protocol (MCP) tool-use benchmark for AI agents and models
A benchmark for evaluating LLM tool-use capabilities across 45+ MCP server categories (browser automation, file systems, search, maps, payments, finance), with 4k+ instances covering single-step and multi-step tool calls. Evaluation uses standardized metrics (AST and Pass@K) with an LLM-as-judge approach and supports major models such as GPT-4o, Qwen, and Claude across multilingual scenarios. It integrates with MCP ecosystem servers and with the OneKey MCP Router, which simplifies API access to commercial services such as Google Maps and Perplexity.
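Pass@K is named above as one of the metrics but not defined on this page. As a rough illustration, the sketch below uses the unbiased pass@k estimator commonly used in code-generation evaluation; this is an assumption, and the benchmark's own definition may differ. The function name `pass_at_k` and its parameters are hypothetical and do not come from the repository.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k sampled attempts passes.

    n: total attempts generated for one task (hypothetical parameter)
    c: number of those attempts that passed
    k: sample size being scored
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # Fewer than k failures exist, so any k-subset contains a pass.
    if n - c < k:
        return 1.0
    # 1 minus the chance that all k sampled attempts are failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # 3 passing attempts out of 10, scored at k = 1 and k = 5.
    print(pass_at_k(10, 3, 1))
    print(pass_at_k(10, 3, 5))
```

With k = 1 the estimate reduces to the plain success rate c / n, which is a quick sanity check on the formula.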
Stars: 41
Forks: 8
Language: Python
License: not listed
Last pushed: Dec 17, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/mcp/mcp-tool-bench/MCPToolBenchPP"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
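The same request can be made from Python using only the standard library. This is a minimal sketch: the endpoint path is taken from the curl command above, while the helper names `build_quality_url` and `fetch_quality` are hypothetical. The response format is not documented on this page, so the body is returned as raw text rather than parsed.

```python
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/mcp"


def build_quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters stay inside it.
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the record and return the raw response body as text."""
    url = build_quality_url(owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Performs a real network request; it counts against the daily limit.
    print(fetch_quality("mcp-tool-bench", "MCPToolBenchPP"))
```

Keeping URL construction separate from the network call makes it possible to check the URL without spending any of the daily request quota.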
Higher-rated alternatives
toolsdk-ai/toolsdk-mcp-registry
MCPSDK.dev(ToolSDK.ai)'s Awesome MCP Servers and Packages Registry and Database with Structured...
Dicklesworthstone/mcp_agent_mail
Asynchronous coordination layer for AI coding agents: identities, inboxes, searchable threads,...
last9/last9-mcp-server
Last9 MCP Server
burugo/one-mcp
A centralized reverse-proxy platform for MCP servers — manage, group, and export as Skills from...
LSTM-Kirigaya/openmcp-client
All in one vscode plugin for mcp developer