pyladiesams/eval-llm-based-apps-jan2025

Create an evaluation framework for your LLM-based app. Incorporate it into your test suite. Lay the monitoring foundation.

38 / 100 (Emerging)

This project helps developers build reliable LLM-based applications by providing a framework for continuous evaluation. It takes your LLM application's code and test data as input and produces performance metrics plus insights for improvement. It is aimed at developers, ML engineers, and data scientists working on production AI systems.
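
Since the repository's code lives in Jupyter notebooks rather than a published package, the sketch below is only an illustration of the continuous-evaluation loop described above; call_llm_app, the JSONL test-data format, and the exact_match metric are hypothetical stand-ins, not this project's actual API.

# Hypothetical sketch of a continuous-evaluation loop for an LLM app.
# call_llm_app and the test-case format are assumptions, not code from
# this repository.
import json
from statistics import mean


def call_llm_app(prompt: str) -> str:
    """Stand-in for your LLM application's entry point."""
    raise NotImplementedError("wire this to your app")


def exact_match(expected: str, actual: str) -> float:
    """Deliberately simple metric; real suites often use semantic scoring."""
    return 1.0 if expected.strip().lower() == actual.strip().lower() else 0.0


def run_eval(test_data_path: str) -> dict:
    # Each line of the file: {"prompt": "...", "expected": "..."}
    with open(test_data_path) as f:
        cases = [json.loads(line) for line in f]
    scores = [exact_match(c["expected"], call_llm_app(c["prompt"])) for c in cases]
    return {"n_cases": len(scores), "exact_match": mean(scores)}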

No commits in the last 6 months.

Use this if you are developing an LLM-based application and need a robust way to test, evaluate, and monitor its performance throughout its lifecycle.
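
One common way to wire such an evaluation into a test suite is a pytest check that fails the build when quality drops below a threshold; a minimal sketch, assuming the hypothetical run_eval helper from the previous snippet, an illustrative data path, and an illustrative 0.8 threshold:

# Hypothetical pytest integration: fail CI when eval quality regresses.
# run_eval is the sketch above; the path and threshold are illustrative.
from my_eval import run_eval  # hypothetical module holding the run_eval sketch


def test_llm_app_quality():
    metrics = run_eval("tests/data/eval_cases.jsonl")
    assert metrics["exact_match"] >= 0.8, f"quality regressed: {metrics}"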

Not ideal if you are looking for a pre-built, ready-to-deploy LLM application or if you lack basic Python and ML testing knowledge.

LLM development · AI testing · MLOps · application monitoring · production AI
Stale (6m) · No Package · No Dependents
Maintenance: 2 / 25
Adoption: 4 / 25
Maturity: 16 / 25
Community: 16 / 25

Stars: 8
Forks: 6
Language: Jupyter Notebook
License: MIT
Last pushed: May 06, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/pyladiesams/eval-llm-based-apps-jan2025"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
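
The same endpoint can be queried from Python using only the standard library; since the response schema isn't documented here, this sketch just pretty-prints whatever JSON the API returns.

# Query the quality API from Python (stdlib only). The response schema
# is an assumption here, so we only pretty-print the returned JSON.
import json
import urllib.request

URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "transformers/pyladiesams/eval-llm-based-apps-jan2025")

with urllib.request.urlopen(URL) as resp:
    data = json.load(resp)

print(json.dumps(data, indent=2))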