devxiongmao/llm-scorecaster
LLM-Scorecaster is a Python-based system designed to evaluate and analyze LLM-generated responses. It calculates a variety of metric scores (either synchronously or asynchronously) for LLM responses against user-persisted inputs, then emits the results. Ideal for NLP researchers and developers looking to assess LLM accuracy and performance with precision.
Stars
—
Forks
—
Language
Python
License
MIT
Category
—
Last pushed
Mar 16, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/devxiongmao/llm-scorecaster"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
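The endpoint above follows a simple path scheme (`/{owner}/{repo}`), so it can be queried programmatically as well as with curl. The sketch below is a minimal, hedged example: the URL is taken from the listing, but the shape of the JSON response is not documented here and is left as an opaque dict.

```python
import json
import urllib.request

# Base endpoint taken from the curl example in the listing above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/nlp"

def scorecard_url(owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{BASE}/{owner}/{repo}"

def fetch_scorecard(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON scorecard.

    Without a key the API allows 100 requests/day; the response
    schema is not documented here, so it is returned as a raw dict.
    """
    with urllib.request.urlopen(scorecard_url(owner, repo)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    print(scorecard_url("devxiongmao", "llm-scorecaster"))
```

For one-off checks the curl command above is simpler; a helper like this is only worth it when polling several repositories.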
Higher-rated alternatives
google/langfun
OO for LLMs
tanaos/artifex
Small Language Model Inference, Fine-Tuning and Observability. No GPU, no labeled data needed.
vulnerability-lookup/VulnTrain
A tool to generate datasets and models based on vulnerabilities descriptions from @Vulnerability-Lookup.
DataScienceUIBK/HintEval
HintEval: A Comprehensive Framework for Hint Generation and Evaluation for Questions
microsoft/LMChallenge
A library & tools to evaluate predictive language models.