rungalileo/hallucination-index
Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate.
Evaluates 22 models across three RAG scenarios (short/medium/long context) using ChainPoll—a multi-polling chain-of-thought technique—to quantify hallucinations and contextual adherence. Tests both open and closed-source models against variable context lengths (5k-100k tokens) and prompting strategies like Chain-of-Note. Includes custom LLM-based evaluation for factual accuracy and position-bias analysis across 10,000 domain-specific documents.
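The ChainPoll idea described above — polling a chain-of-thought LLM judge several times and aggregating the votes — can be sketched as follows. This is a minimal illustration, not Galileo's implementation: the `judge` callable stands in for an LLM API call, and the toy substring-based judge below is purely hypothetical.

```python
def chainpoll_score(question, context, answer, judge, n_polls=5):
    """ChainPoll-style scoring (sketch): poll a judge n times on whether
    the answer is grounded in the retrieved context, then return the
    fraction of polls that flagged a hallucination (0.0 = clean, 1.0 =
    always flagged). In the real technique, `judge` is an LLM prompted
    to reason step by step before voting; here it is any callable
    returning True when it detects a hallucination."""
    votes = [judge(question, context, answer) for _ in range(n_polls)]
    return sum(votes) / n_polls


def naive_judge(question, context, answer):
    # Toy stand-in for an LLM judge: flag the answer as hallucinated
    # if it does not appear verbatim in the context. Deterministic,
    # so repeated polls agree; a real LLM judge is stochastic, which
    # is why ChainPoll averages multiple polls.
    return answer not in context


context = "Paris is the capital of France."
grounded = chainpoll_score("Capital of France?", context, "Paris", naive_judge)
ungrounded = chainpoll_score("Capital of France?", context, "Lyon", naive_judge)
print(grounded, ungrounded)  # 0.0 1.0
```

With a stochastic LLM judge the score becomes a fractional hallucination probability per sample, which the index then averages over a dataset to rank models.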
116 stars. No commits in the last 6 months.
Stars: 116
Forks: 9
Language: —
License: —
Category: —
Last pushed: Jul 28, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/rungalileo/hallucination-index"
Open to everyone: 100 requests/day with no key required; a free key raises the limit to 1,000/day.
Higher-rated alternatives
onestardao/WFGY
WFGY: open-source reasoning and debugging infrastructure for RAG and AI agents. Includes the...
KRLabsOrg/verbatim-rag
Hallucination-prevention RAG system with verbatim span extraction. Ensures all generated content...
iMoonLab/Hyper-RAG
"Hyper-RAG: Combating LLM Hallucinations using Hypergraph-Driven Retrieval-Augmented Generation"...
frmoretto/clarity-gate
Stop LLMs from hallucinating your guesses as facts. Clarity Gate is a verification protocol for...
chensyCN/LogicRAG
Source code of LogicRAG at AAAI'26.