UKPLab/PeerQA
Code and Data for PeerQA: A Scientific Question Answering Dataset from Peer Reviews, NAACL 2025
Constructed from peer review questions answered by paper authors, the dataset contains 579 QA pairs spanning document-level scientific texts that average 12k tokens. It is designed to evaluate evidence retrieval, answerability classification, and long-context answer generation. The project uses GROBID 0.8 for PDF-to-text extraction and establishes baselines with BM25 (Pyserini), dense retrievers, and cross-encoder rerankers; decontextualization techniques are shown to improve retrieval across architectures. The dataset is available on HuggingFace Datasets, and the evaluation scripts support transformer-based models including DeepSeek-R1 and Qwen variants.
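To make the evidence-retrieval task concrete, here is a minimal sketch of BM25 ranking in plain Python. It is not the project's Pyserini pipeline: the function name bm25_rank, the whitespace tokenizer, the toy passages, and the k1 and b values are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_rank(query, passages, k1=0.9, b=0.4):
    """Return passage indices ordered from best to worst BM25 score.

    Hypothetical helper for illustration; k1 and b are assumed values,
    and tokenization is a plain lowercase whitespace split.
    """
    docs = [p.lower().split() for p in passages]
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of passages that contain each term.
    df = Counter(term for d in docs for term in set(d))

    scored = []
    for idx, doc in enumerate(docs):
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed inverse document frequency (always positive).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturated by k1 and normalized by passage length.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scored.append((score, idx))
    # Highest score first; ties keep the original passage order.
    return [idx for _, idx in sorted(scored, key=lambda s: (-s[0], s[1]))]


# Toy paper passages and a reviewer-style question (made up for this sketch).
passages = [
    "we train the model with adam optimizer",
    "the dataset contains peer review questions",
    "results are reported on the test set",
]
ranking = bm25_rank("which optimizer is used to train", passages)
```

In a full pipeline along the lines the description mentions, the top-ranked passages would then go to a cross-encoder reranker or serve as evidence for answer generation.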
Stars
12
Forks
4
Language
Python
License
Apache-2.0
Category
Last pushed
Feb 02, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/UKPLab/PeerQA"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
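The same request can be sketched in Python using only the standard library. The helper names quality_url and fetch_quality are hypothetical, and two things are assumed rather than stated on this page: that the owner/repo path pattern works for other repositories, and that the endpoint returns JSON.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/rag"


def quality_url(owner, repo):
    """Build the endpoint URL for a repository (path pattern is assumed)."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner, repo, timeout=10):
    """Fetch the record for a repository, assuming a JSON response body."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling fetch_quality("UKPLab", "PeerQA") issues the same GET request as the curl command and counts against the daily request limit.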
Higher-rated alternatives
GrapeCity-AI/gc-qa-rag
A RAG (Retrieval-Augmented Generation) solution based on advanced pre-generated QA pairs. Based on advanced...
Vbj1808/Dokis
Lightweight RAG provenance middleware. Verifies every claim in an LLM response is grounded in a...
Arfazrll/RAG-DocsInsight-Engine
Retrieval Augmented Generation (RAG) engine for intelligent document analysis, integrating LLM,...
pcastiglione99/RAGify-Search
RAGify is designed to enhance search capabilities using Retrieval-Augmented Generation (RAG). By...
Adii2202/RAG-AI-Voice-assistant-
Performing a RAG (Retrieval Augmented Generation) assessment using voice-to-voice query...