IAAR-Shanghai/UHGEval-dataset
The full pipeline for creating the UHGEval hallucination dataset
This project provides a complete pipeline for building a dataset designed to evaluate factual hallucinations in AI-generated news continuations. It takes raw news articles, preprocesses them, generates candidate AI-written continuations, and labels which continuations contain factual errors, producing a curated dataset for model evaluation. It is intended for researchers and developers working on large language models (LLMs) and natural language generation (NLG) in news contexts.
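The stages above (ingest raw articles, generate continuations, label factual errors, collect the curated records) can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: every function name and the naive token-overlap labeling heuristic are assumptions; the real pipeline relies on model- and human-based fact checking.

```python
# Hypothetical sketch of the dataset-creation pipeline described above.
# All names and the labeling heuristic are illustrative, not the repo's code.

def process_article(raw: str) -> dict:
    """Normalize a raw article into a headline/body record."""
    headline, _, body = raw.partition("\n")
    return {"headline": headline.strip(), "body": body.strip()}

def generate_continuation(article: dict) -> str:
    """Stand-in for an LLM call that continues the news body."""
    return article["body"].split(".")[0] + " (continued)."

def label_hallucination(article: dict, continuation: str) -> bool:
    """Naive proxy: flag continuations that introduce tokens absent from
    the source. The real pipeline uses proper fact checking instead."""
    source_tokens = set(article["body"].lower().split())
    return any(t not in source_tokens for t in continuation.lower().split())

def build_dataset(raw_articles: list[str]) -> list[dict]:
    """Run every article through the full process/generate/label chain."""
    records = []
    for raw in raw_articles:
        art = process_article(raw)
        cont = generate_continuation(art)
        records.append({**art,
                        "continuation": cont,
                        "hallucinated": label_hallucination(art, cont)})
    return records
```

The resulting records carry the source text, the generated continuation, and a hallucination label, which matches the curated-dataset output the project describes.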
No commits in the last 6 months.
Use this if you need a structured, pre-processed dataset to rigorously test and improve the factual accuracy of AI models that summarize or extend news content.
Not ideal if you are looking for a tool to generate news articles or summaries directly, or if your primary interest is in evaluating general text generation quality rather than factual accuracy.
Stars: 9
Forks: —
Language: Python
License: —
Category: —
Last pushed: Feb 15, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/IAAR-Shanghai/UHGEval-dataset"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
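The same endpoint can be queried from Python. The sketch below only builds the URL from the curl example and parses a JSON payload; the response field names (`stars`, `language`) are assumptions, since the API's schema is not documented here.

```python
# Hypothetical helper around the directory's API endpoint shown above.
# Response fields are assumed; only the URL shape comes from the curl example.
import json
from urllib.parse import quote

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"

def tool_url(owner: str, repo: str) -> str:
    """Build the API URL for a given GitHub owner/repo pair."""
    return f"{API_BASE}/{quote(owner)}/{quote(repo)}"

def parse_tool(payload: str) -> dict:
    """Pull a few (assumed) fields out of a JSON response body."""
    data = json.loads(payload)
    return {"stars": data.get("stars"), "language": data.get("language")}
```

Fetching `tool_url(...)` with any HTTP client and passing the body to `parse_tool` would yield the summary fields, subject to the daily rate limits noted above.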
Higher-rated alternatives
vectara/hallucination-leaderboard
Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents
PKU-YuanGroup/Hallucination-Attack
An attack method for inducing hallucinations in LLMs
amir-hameed-mir/Sirraya_LSD_Code
Layer-wise Semantic Dynamics (LSD) is a model-agnostic framework for hallucination detection in...
NishilBalar/Awesome-LVLM-Hallucination
An up-to-date curated list of state-of-the-art work on hallucinations in large vision-language models...
intuit/sac3
Official repo for SAC3: Reliable Hallucination Detection in Black-Box Language Models via...