agentscope-ai/OpenJudge
OpenJudge: A Unified Framework for Holistic Evaluation and Quality Rewards
```python
# Initialize grader
grader = Grader.load("Relevance")

# Evaluate response
response = {
    "query": "What is machine learning?",
    "response": "Machine learning is a subset of AI that enables systems to learn from data.",
}
score = grader.evaluate(response)
print(score)

# Output:
# {
#   "name": "Relevance",
#   "score": 0.95,
#   "reasoning": "The response directly addresses the query with an accurate definition of machine learning."
# }
```

### Batch Evaluation Example

Evaluate multiple responses at scale:

```python
# Load test data
test_data = [
    {"query": "What is AI?", "response": "AI is artificial intelligence."},
    {"query": "What is ML?", "response": "Machine learning is
```
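The batch excerpt above is cut off, but the single-item `evaluate` call it builds on is shown in full. Below is a minimal, runnable sketch of the batch loop pattern. `StubGrader` is a stand-in written for this page (not the real OpenJudge `Grader`, which is loaded via `Grader.load(...)` and needs the library installed); it uses a toy keyword-overlap heuristic so the loop can execute here.

```python
class StubGrader:
    """Stand-in for OpenJudge's Grader; returns a dict shaped like the
    single-item example above ("name" and "score" keys)."""

    def evaluate(self, item: dict) -> dict:
        # Toy relevance heuristic: fraction of query words echoed in the response.
        query_words = set(item["query"].lower().split())
        response_words = set(item["response"].lower().split())
        overlap = len(query_words & response_words) / max(len(query_words), 1)
        return {"name": "Relevance", "score": round(overlap, 2)}


def evaluate_batch(grader, test_data: list) -> list:
    """Run the grader over each query/response pair and collect scores."""
    return [grader.evaluate(item) for item in test_data]


test_data = [
    {"query": "What is AI?", "response": "AI is artificial intelligence."},
    {"query": "What is ML?", "response": "ML is machine learning."},
]
results = evaluate_batch(StubGrader(), test_data)
for r in results:
    print(r["name"], r["score"])
```

With the real library, the same loop works by swapping `StubGrader()` for `Grader.load("Relevance")`.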
Stars: 459
Forks: 37
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 12, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/agents/agentscope-ai/OpenJudge"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
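The endpoint follows a simple `owner/repo` pattern, so the same data can be requested for any listed agent. A small helper (hypothetical, not part of any official client) that builds the URL:

```python
# Build the quality-data API URL for a given GitHub owner/repo,
# following the endpoint pattern shown in the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/agents"


def quality_api_url(owner: str, repo: str) -> str:
    return f"{BASE_URL}/{owner}/{repo}"


print(quality_api_url("agentscope-ai", "OpenJudge"))
```

Pass the resulting URL to any HTTP client; no key is needed below 100 requests/day.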
Related agents
StonyBrookNLP/appworld
🌍 AppWorld: A Controllable World of Apps and People for Benchmarking Function Calling and...
qualifire-dev/rogue
AI Agent Evaluator & Red Team Platform
future-agi/ai-evaluation
Evaluation Framework for all your AI related Workflows
microsoft/WindowsAgentArena
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of...
SparkBeyond/agentune
Tune your AI Agent to best meet its KPI with a cyclic process of analyze, improve and simulate