ollama-benchmark and LLMeBench
About ollama-benchmark
aidatatools/ollama-benchmark
LLM Benchmark for Throughput via Ollama (Local LLMs)
This tool helps you quickly understand the real-world performance of local Large Language Models (LLMs) served through Ollama. It runs prompts against your existing local setup and reports a clear tokens-per-second throughput figure for each model. AI/ML practitioners, researchers, and anyone experimenting with local LLMs can use it to compare models and hardware configurations.
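The tokens-per-second figure the tool reports can be reproduced directly from Ollama's HTTP API, which returns the generated token count and generation time in its response. Below is a minimal sketch of that measurement, assuming an Ollama server on the default `localhost:11434` with a model already pulled; the model name and prompt are illustrative, and the benchmark tool itself automates this across models and runs:

```python
import json
import urllib.request

# Assumption: Ollama is running on its default port and the model below
# has already been pulled (e.g. `ollama pull llama3`).
OLLAMA_URL = "http://localhost:11434/api/generate"
payload = {
    "model": "llama3",  # illustrative model name
    "prompt": "Explain what a token is in one sentence.",
    "stream": False,    # return a single JSON object instead of a stream
}

req = urllib.request.Request(
    OLLAMA_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

# Ollama reports the number of generated tokens (eval_count) and the
# generation time in nanoseconds (eval_duration).
tokens = result["eval_count"]
seconds = result["eval_duration"] / 1e9
print(f"{tokens} tokens in {seconds:.2f}s -> {tokens / seconds:.1f} tokens/s")
```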
About LLMeBench
qcri/LLMeBench
Benchmarking Large Language Models
This framework lets you objectively compare how well different large language models (LLMs) perform on specific language tasks, regardless of provider (e.g., OpenAI or Hugging Face). You supply a dataset and a task (such as sentiment analysis or question answering), and it produces a detailed report on each model's accuracy and behavior. It's designed for AI researchers, data scientists, and model evaluators who need to rigorously test and select the best LLM for their application.
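Conceptually, an evaluation like this wires a labeled dataset, a task-specific prompt, and several model backends behind a shared interface, then scores each backend. The sketch below illustrates that compare-models-on-a-task loop in plain Python; it is not LLMeBench's actual API, and the dataset, backends, and scoring here are hypothetical placeholders:

```python
from typing import Callable

# Hypothetical stand-ins for what a framework like LLMeBench assembles:
# a labeled dataset, a task prompt, and multiple model backends.
dataset = [
    ("The service was excellent!", "positive"),
    ("I want a refund immediately.", "negative"),
]

def make_prompt(text: str) -> str:
    # Task definition: zero-shot sentiment classification.
    return f"Classify the sentiment as positive or negative: {text}"

def evaluate(model_name: str, generate: Callable[[str], str]) -> float:
    """Score one model's accuracy over the labeled dataset."""
    correct = 0
    for text, label in dataset:
        prediction = generate(make_prompt(text)).strip().lower()
        correct += prediction == label
    accuracy = correct / len(dataset)
    print(f"{model_name}: {accuracy:.0%} accuracy")
    return accuracy

# Each backend is just a function prompt -> completion; in practice these
# would call OpenAI, Hugging Face, or a local server behind one interface.
backends = {
    "model-a": lambda prompt: "positive",  # dummy backend for illustration
    "model-b": lambda prompt: "negative",
}
for name, fn in backends.items():
    evaluate(name, fn)
```

In the real framework, the dataset loader, prompt builder, and output post-processor are bundled into configuration assets and runs are launched from the command line; consult the repository for the exact interface.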