Maluuba/nlg-eval
Evaluation code for various unsupervised automated metrics for Natural Language Generation.
Implements nine metrics spanning n-gram overlap (BLEU, METEOR, ROUGE, CIDEr, SPICE) and semantic similarity approaches (SkipThought, GloVe embeddings, greedy matching). The toolkit provides both CLI and Python APIs (functional and object-oriented) for corpus-level and sentence-level evaluation, with pre-trained models and embeddings downloaded automatically during setup.
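To illustrate one of the semantic-similarity metrics above, here is a minimal, self-contained sketch of greedy matching over word embeddings. It uses tiny hypothetical 2-D vectors in place of real GloVe embeddings and is not the library's implementation, just the idea: each hypothesis token is greedily paired with its most similar reference token by cosine similarity, the scores are averaged, and the result is symmetrized.

```python
from math import sqrt

# Hypothetical toy word vectors standing in for GloVe embeddings
# (real GloVe vectors are 50-300 dimensional).
VECS = {
    "the": (1.0, 0.0), "cat": (0.8, 0.6), "sat": (0.0, 1.0),
    "dog": (0.7, 0.7), "ran": (0.1, 0.9),
}

def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def greedy_match(hyp, ref):
    """Symmetric greedy matching score between two token lists."""
    def directed(a, b):
        # For each token in a, take its best cosine match in b, then average.
        scores = [max(cosine(VECS[w], VECS[x]) for x in b) for w in a]
        return sum(scores) / len(scores)
    return (directed(hyp, ref) + directed(ref, hyp)) / 2

score = greedy_match(["the", "cat", "sat"], ["the", "dog", "ran"])
```

Identical sentences score 1.0, and unrelated vocabularies score near 0, which is the intuition the embedding-based metrics in the toolkit build on.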
1,391 stars. No commits in the last 6 months.
Stars
1,391
Forks
227
Language
Python
License
—
Category
—
Last pushed
Aug 20, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/Maluuba/nlg-eval"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000/day.
Higher-rated alternatives
google/langfun
OO for LLMs
tanaos/artifex
Small Language Model Inference, Fine-Tuning and Observability. No GPU, no labeled data needed.
vulnerability-lookup/VulnTrain
A tool to generate datasets and models based on vulnerabilities descriptions from @Vulnerability-Lookup.
DataScienceUIBK/HintEval
HintEval: A Comprehensive Framework for Hint Generation and Evaluation for Questions
microsoft/LMChallenge
A library & tools to evaluate predictive language models.