Structured Data Inference NLP Tools
Datasets and benchmarks for NLI, table understanding, text-to-SQL, and instruction-following tasks involving structured or semi-structured data. Does NOT include general sentiment analysis, classification tasks without structured reasoning components, or commonsense knowledge resources without explicit inference evaluation.
There are 74 structured data inference tools tracked. The highest-rated is ymcui/cmrc2018 at 42/100 with 451 stars.
Get all 74 projects as JSON (the example request below sets limit=20)
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=structured-data-inference&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
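The same request can be made from a script. The following is a minimal Python sketch, not an official client. It reuses only the endpoint and the three query parameters (`domain`, `subcategory`, `limit`) shown in the curl command above. The function names `build_url` and `fetch_projects` are hypothetical, the response schema is not documented here so the decoded JSON is returned as-is, and whether `limit` accepts values above 20 is an assumption that would need checking against the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example; everything else here is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit=20):
    """Build the request URL with the query parameters used in the curl example."""
    params = {
        "domain": "nlp",
        "subcategory": "structured-data-inference",
        "limit": limit,  # assumption: larger values may or may not be accepted
    }
    return f"{BASE_URL}?{urlencode(params)}"


def fetch_projects(limit=20, timeout=30):
    """Send the GET request and decode the JSON body without assuming its schema."""
    with urlopen(build_url(limit), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_projects()` issues one request, which counts against the 100 requests/day keyless quota mentioned above. How a key would be passed is not stated on this page, so the sketch does not attempt it.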
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | ymcui/cmrc2018 - A Span-Extraction Dataset for Chinese Machine Reading Comprehension (CMRC 2018) | | Emerging |
| 2 | princeton-nlp/DensePhrases - [ACL 2021] Learning Dense Representations of Phrases at Scale; EMNLP'2021:... | | Emerging |
| 3 | thunlp/MultiRD - Code and data of the AAAI-20 paper "Multi-channel Reverse Dictionary Model" | | Emerging |
| 4 | IndexFziQ/KMRC-Papers - A list of recent papers regarding knowledge-based machine reading comprehension. | | Emerging |
| 5 | danqi/rc-cnn-dailymail - CNN/Daily Mail Reading Comprehension Task | | Emerging |
| 6 | declare-lab/CIDER - This repository contains the dataset and the pytorch implementations of the... | | Emerging |
| 7 | maastrichtlawtech/gdsr - 🕸️ A graph-augmented dense statute retriever. (EACL 2023) | | Emerging |
| 8 | intfloat/SimKGC - ACL 2022, SimKGC: Simple Contrastive Knowledge Graph Completion with... | | Emerging |
| 9 | zjunlp/MKG_Analogy - [ICLR 2023] Multimodal Analogical Reasoning over Knowledge Graphs | | Emerging |
| 10 | ShiZhengyan/StepGame - [AAAI 2022] Dataset and pytorch codes for the paper titled "StepGame: A New... | | Emerging |
| 11 | shmsw25/AmbigQA - An original implementation of EMNLP 2020, "AmbigQA: Answering Ambiguous... | | Emerging |
| 12 | GeekDream-x/IDOL - Repo for paper "IDOL: Indicator-oriented Logic Pre-training for Logical... | | Emerging |
| 13 | IndexFziQ/MSMARCO-MRC-Analysis - Analysis on the MS-MARCO leaderboard regarding the machine reading... | | Emerging |
| 14 | utahnlp/knowledge_infotabs - Repository containing code for the NAACL 2021 paper (Incorporating External... | | Emerging |
| 15 | yuweihao/reclor - Code for "ReClor: A Reading Comprehension Dataset Requiring Logical... | | Experimental |
| 16 | XingLuxi/KMRC-Research-Archive - 🗂 Research about Knowledge-based Machine Reading Comprehension | | Experimental |
| 17 | phanxuanphucnd/Active-learning-in-NLP - Active learning in NLP | | Experimental |
| 18 | FeiWang96/GTR - [SIGIR 2021] Retrieving Complex Tables with Multi-Granular Graph... | | Experimental |
| 19 | amazon-science/pizza-semantic-parsing-dataset - The PIZZA dataset continues the exploration of task-oriented parsing by... | | Experimental |
| 20 | anshitag/memit_csk - Source repository for Editing Common Sense in Transformers (EMNLP 2023) | | Experimental |
| 21 | webis-de/acl22-revisiting-uncertainty-based-query-strategies-for-active-learning-with-transformers - Revisiting Uncertainty-based Query Strategies for Active Learning with Transformers | | Experimental |
| 22 | marceljahnke/negative-cache - PyTorch Implementation of the Paper "Efficient Training of Retrieval Models... | | Experimental |
| 23 | amazon-science/wqa-multi-sentence-inference - This repository contains code used for our Multi Sentence Inference NAACL'22 paper. | | Experimental |
| 24 | ymcui/expmrc - ExpMRC: Explainability Evaluation for Machine Reading Comprehension | | Experimental |
| 25 | sherlcok314159/ChineseMRC-Data - A collection of the extractive MRC datasets available so far for Chinese | | Experimental |
| 26 | thunlp/CokeBERT - CokeBERT: Contextual Knowledge Selection and Embedding towards Enhanced... | | Experimental |
| 27 | acidAnn/semeval2022_task7_starter_kit - :bulb: Starter kit for SemEval 2022 Task 7: Identifying Plausible... | | Experimental |
| 28 | USSiamaboat/polytuplet-loss - A Reverse Approach to Training Reading Comprehension and Logical Reasoning Models | | Experimental |
| 29 | humanlab/rare-class-AL - AL for rare class strategies compared in the paper "Transfer and Active... | | Experimental |
| 30 | ict-bigdatalab/CorpusBrain - CIKM 2022: CorpusBrain: Pre-train a Generative Retrieval Model for... | | Experimental |
| 31 | ai-systems/tg2022task_premise_retrieval - TextGraphs Shared Task on Natural Language Premise Selection | | Experimental |
| 32 | Jordy-VL/uncertainty-bench - Code repository for **Benchmarking Scalable Predictive Uncertainty in Text... | | Experimental |
| 33 | semeval-2026-kclarity/clarity - Code release for KCLarity at SemEval-2026 Task 6: Encoder and Zero-Shot... | | Experimental |
| 34 | Dibyakanti/AutoTNLI-code - This repository contains the official code for the paper : Realistic Data... | | Experimental |
| 35 | testzer0/AmbiQT - Code and Assets for "Benchmarking and Improving Text-to-SQL Generation Under... | | Experimental |
| 36 | psunlpgroup/XSemPLR - Data and code for ACL 2023 paper XSemPLR: Cross-Lingual Semantic Parsing in... | | Experimental |
| 37 | ZeinabAghahadi/Syllogistic-Commonsense-Reasoning - Deductive Commonsense Reasoning | | Experimental |
| 38 | pietrolesci/anchoral - This is the official PyTorch implementation for our NAACL 2024 paper:... | | Experimental |
| 39 | krystalan/Multi-hopRC - :notebook_with_decorative_cover: notes for Multi-hop Reading Comprehension... | | Experimental |
| 40 | minnesotanlp/infoVerse - Jaehyung Kim et al's ACL 2023 paper on "infoVerse: A Universal Framework for... | | Experimental |
| 41 | Pzoom522/xANLG - Data and code for "Understanding Linearity of Cross-Lingual Word Embedding... | | Experimental |
| 42 | cognitiveailab/tg2021task - Participant Kit for the TextGraphs-15 Shared Task on Explanation Regeneration | | Experimental |
| 43 | INK-USC/RiddleSense - RiddleSense: Reasoning about Riddle Questions Featuring Linguistic... | | Experimental |
| 44 | phosseini/GisPy - GisPy: A Tool for Measuring Gist Inference Score in Text... | | Experimental |
| 45 | THU-KEG/COPEN - The official code and dataset for EMNLP 2022 paper "COPEN: Probing... | | Experimental |
| 46 | MultimodalGeo/GeoText-1652 - An official repo for ECCV 2024 Towards Natural Language-Guided Drones:... | | Experimental |
| 47 | ZhengZixiang/MRCPapers - Worth-reading paper list and other awesome resources on Machine Reading... | | Experimental |
| 48 | mariomeissner/AmbiNLI - This is the code for the paper "Embracing Ambiguity: Shifting the Training... | | Experimental |
| 49 | yul091/UnBED - Codebase for the ACL 2023 paper: "Uncertainty-Aware Bootstrap Learning for... | | Experimental |
| 50 | MSR-LIT/Splash - Release of SPLASH: Dataset for semantic parse correction with natural... | | Experimental |
| 51 | rycolab/evidence-probing - Code and data for the ACL 2022 paper "Probing as Quantifying Inductive Bias". | | Experimental |
| 52 | royxlead/self-diagnosing-neural-models-python - Self-Diagnosing Neural Networks: models that quantify their own uncertainty... | | Experimental |
| 53 | Advancing-Machine-Human-Reasoning-Lab/transformer-psychometrics - Code to reproduce experiments in our *SEM 2021 Paper | | Experimental |
| 54 | maastrichtlawtech/fusion - 🔗 Hybrid retrieval in the legal domain | | Experimental |
| 55 | salesforce/FewXC - Official code and data release for Efficiently Aligned Cross-Lingual... | | Experimental |
| 56 | Raising-hrx/MetGen - An implementation for MetGen: A Module-Based Entailment Tree Generation... | | Experimental |
| 57 | naver/ms-marco-shift - A Fine-Grained Analysis of Distribution Shifts in MSMARCO (MS-Shift).... | | Experimental |
| 58 | LaVi-Lab/C2LEVA - [Findings of ACL 2025] "C2LEVA: Toward Comprehensive and Contamination-Free... | | Experimental |
| 59 | Nativeatom/FRoG - Fuzzy reasoning of Generalized Quantifiers (EMNLP 2024) | | Experimental |
| 60 | megagonlabs/ambignlg - :dog: Data for AmbigNLG: Addressing Task Ambiguity in Instruction for NLG... | | Experimental |
| 61 | fajri91/discourse_probing - Discourse Probing of Pretrained Language Models. In Proceedings of NAACL 2021. | | Experimental |
| 62 | nlp-waseda/dcsg-ja - Dialogue Commonsense Graph in Japanese | | Experimental |
| 63 | megagonlabs/xatu - 🕊️ Code and Data for XATU: A Fine-grained Instruction-based Benchmark for... | | Experimental |
| 64 | collapseindex/ci-curation - CI-Guided Data Curation: Using prediction instability to detect label noise.... | | Experimental |
| 65 | gianluigilopardo/anchors_text_theory - Code for the paper "A Sea of Words: An In-Depth Analysis of Anchors for Text... | | Experimental |
| 66 | amazon-science/resource-constrained-naturalized-semantic-parsing - This repository is made public for reproducibility of our recent work on... | | Experimental |
| 67 | zhengyima/Anchors - Source code of CIKM2021 Paper 'Pre-training for Ad-hoc Retrieval: Hyperlink... | | Experimental |
| 68 | XInfoTabS/dataset - The Official dataset for "XINFOTABS: Evaluating Multilingual Tabular Natural... | | Experimental |
| 69 | INK-USC/ER-Test - Code for ER-Test, accepted to the Findings of EMNLP 2022 | | Experimental |
| 70 | putmanmodel/putman-model-paper - Preprint + pseudocode for the PUTMAN Model (relational meaning graphs,... | | Experimental |
| 71 | HKUST-KnowComp/atomic-conceptualization - Code and data for the paper Acquiring and Modelling Abstract Commonsense... | | Experimental |
| 72 | IndexFziQ/IIE-NLP-Eyas-SemEval2021 - Code of IIE-NLP-Eyas Team for ReCAM (Task 4) @SemEval2021... | | Experimental |
| 73 | Nativeatom/PRESQUE - The repository for "Pragmatic Reasoning Unlocks Quantifier Semantics for... | | Experimental |
| 74 | dyan-dy/Baidu-LIC2021-MRC - models and codes for baiduAI LIC 2021 MRC tasks, based on paddlenlp | | Experimental |