Task-Oriented Dialogue Systems NLP Tools

Datasets, frameworks, and evaluation tools for building goal-oriented conversational agents (e.g., task completion, dialogue state tracking, multi-domain dialogue). Does NOT include open-domain chitchat, emotion/sentiment analysis in dialogue, or general conversational AI without task-specific goals.

43 task-oriented dialogue system tools are tracked. One scores above 70 (the Verified tier). The highest-rated is gunthercox/chatterbot-corpus at 76/100, with 1,411 stars and 9,123 monthly downloads. Two of the top 10 are actively maintained.

Get all 43 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=task-oriented-dialogue-systems&limit=20"
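
The same request can be made from a script. The following is a minimal Python sketch, assuming only what the curl command shows: the endpoint URL and the `domain`, `subcategory`, and `limit` query parameters. The helper names `build_url` and `fetch_projects` are hypothetical, and the response schema is not documented here, so the parsed JSON is returned as-is. Note that the documented URL uses `limit=20`; whether a larger value returns all 43 projects in one call is an assumption to verify against the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="task-oriented-dialogue-systems", limit=20):
    """Build the query URL with the same parameters as the curl command."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and decode the body as JSON.

    The response structure is an unknown here, so the parsed value
    is returned untouched for the caller to inspect.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the URL needs no network access.
url = build_url()
print(url)
```

Calling `fetch_projects(url)` performs the actual request and counts against the daily request limit.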

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.

#	Tool (repository and description)	Score (/100)	Tier	Stars	Language
1	gunthercox/chatterbot-corpus A multilingual dialog corpus	76	Verified	1,411	Python
2	EdinburghNLP/awesome-hallucination-detection List of papers on hallucination detection in LLMs.	53	Established	1,060	—
3	jfainberg/self_dialogue_corpus The Self-dialogue Corpus - a collection of self-dialogues across music,...	38	Emerging	107	Python
4	jkkummerfeld/irc-disentanglement Dataset and model for disentangling chat on IRC	36	Emerging	58	Python
5	Tomiinek/MultiWOZ_Evaluation Unified MultiWOZ evaluation scripts for the context-to-response task.	35	Emerging	59	Python
6	tae898/multimodal-datasets Multimodal datasets.	33	Emerging	34	Python
7	IBM/permuted-bAbI-dialog-tasks Dataset for 'Learning End-to-End Goal-Oriented Dialog with Multiple Answers'...	32	Emerging	18	—
8	hsgodhia/hred Implements the paper "Building End-To-End Dialogue Systems Using Generative...	31	Emerging	116	Python
9	doheejin/ProTACT This repository is the implementation of the ProTACT architecture,...	31	Emerging	23	Python
10	IBM/modified-bAbI-dialog-tasks 'Learning End-to-End Goal-Oriented Dialog with maximal User task success and...	30	Emerging	10	—
11	anaistack/ai-teacher-test Source code and data for the EDM 2022 paper	29	Experimental	12	Python
12	vaskonov/negochat_corpus Negochat Corpus - a dialogue corpus in the negotiation domain	29	Experimental	9	—
13	Sensente/Security-Attacks-on-LCCTs Security Attacks on LLM-based Code Completion Tools (AAAI 2025)	28	Experimental	21	Python
14	LCS2-IIITD/SPARTA_WSDM2022 This repository contains the code and dataset for our paper titled Speaker...	27	Experimental	55	Python
15	GiovanniTRA/UDCG Code and Data of the paper: "Redefining Retrieval Evaluation in the Era of LLMs"	26	Experimental	12	Python
16	TianboJi/Dialogue-Eval Code and data for paper "Achieving Reliable Human Assessment of Open-Domain...	26	Experimental	8	Python
17	Reason-Wang/DialogueGLP Code for our EACL 2023 findings paper "Global-Local Modeling with...	25	Experimental	4	Python
18	yukyunglee/Awesome-Dialogue-State-Tracking Dialogue State Tracking (DST) Papers, Datasets, Resources 🤩	25	Experimental	196	—
19	chiachienhung/Multi2WOZ Multi2WOZ: A Robust Multilingual Dataset and Conversational Pretraining for...	25	Experimental	3	Python
20	umass-ml4ed/tiktoc Official repo for "Test Case-Informed Knowledge Tracing for Open-ended...	23	Experimental	3	Python
21	bryanwilie/pick PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded Dialogue...	23	Experimental	2	Python
22	passing2961/EmpGPT-3 Official code for our COLING 2022 paper: In-Context Learning for Empathetic...	23	Experimental	20	Python
23	zengyan-97/Transformer-DST A Generative Dialogue State Tracking Model	22	Experimental	22	Python
24	mrzjy/GenshinDialog Extracting character conversations in Genshin Project	22	Experimental	75	Python
25	convei-lab/BotsTalk 🤖 Code for our EMNLP 2022 paper: "BotsTalk: Machine-sourced Framework for...	20	Experimental	16	Python
26	mrzjy/hoyo_public_wiki_parser Parsing Hoyoverse game text corpus from public wikipedia	20	Experimental	12	Python
27	kaustubhdhole/natural-dont-know Code for the paper: Saying No is An Art: Contextualized Fallback Responses...	19	Experimental	19	JavaScript
28	abhi1nandy2/yesbut_dataset YesBut - Multimodal Satire Comprehension Dataset	19	Experimental	18	Jupyter Notebook
29	nu-dialogue/real-persona-chat RealPersonaChat: A Realistic Persona Chat Corpus with Interlocutors' Own...	17	Experimental	63	—
30	UEC-InabaLab/KokoroChat A Japanese counseling dialogue dataset collected through role-play	17	Experimental	19	—
31	skywalker023/prosocial-dialog 🐥 Code and Dataset for our EMNLP 2022 paper - "ProsocialDialog: A Prosocial...	17	Experimental	65	Python
32	mrzjy/StarrailDialog A project that extracts Honkai: Star Rail text corpus	14	Experimental	35	Python
33	M0gician/RaccoonBench [ACL 2024] Raccoon: Prompt Extraction Benchmark of LLM-Integrated Applications	14	Experimental	14	Python
34	TUM-NLPLab-2022/PARL-A-Dialog-System-Framework-with-Prompts-as-Actions-for-Reinforcement-Learning This is the official repo for the paper "PARL: A Dialog System Framework with...	13	Experimental	8	Python
35	mrzjy/WutheringDialog Extracting character conversations in Wuthering Waves	13	Experimental	5	Python
36	mrzjy/ZZZDialog A project that extracts ZenlessZoneZero text corpus	13	Experimental	6	Python
37	miguel-kjh/Improving-Dialogue-Management A detailed analysis of task-oriented dialogue systems, emphasizing the...	12	Experimental	4	Python
38	valeria-izvoreanu/LLM-Hallucination-Detection-SemEval2024 Semi-supervised pipeline to detect LLM hallucinations. Uses Mistral-7B for...	12	Experimental	1	Jupyter Notebook
39	mrzjy/HonkaiImpact3rdDialog A project that collects Honkai Impact 3rd text corpus	12	Experimental	3	Python
40	zenquiorra/M3LS M3LS : Multi-lingual Multi-modal summarization dataset	11	Experimental	2	Python
41	kimdanny/user-simulation-t5 Official Code for SIGIR 2022 "A Multi-task Based Neural Model to Simulate...	11	Experimental	37	Python
42	evelynkyl/xRAD_multilingual_dialog_systems Codes for master's thesis investigating approaches for building a...	10	Experimental	1	Python
43	minsik-ai/PK-ICR Persona-Knowledge Interactive Multi-Context Retrieval for Grounded Dialogue...	10	Experimental	3	Python
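
The table above can be loaded into records for filtering or counting by tier. The following is a rough Python sketch, assuming the rows are tab-separated with six columns (rank, tool, score, tier, stars, language) and that the repository name is the first space-delimited token of the tool column. The function name `parse_row` and the three sample rows copied from the table are for illustration only.

```python
from collections import Counter

MISSING = "\u2014"  # the dash placeholder the table uses when no language is listed


def parse_row(line):
    """Split one tab-separated table row into a record.

    Assumed column order: rank, tool (name + description),
    score, tier, stars, language.
    """
    rank, tool, score, tier, stars, language = line.rstrip("\n").split("\t")
    name, _, description = tool.partition(" ")
    return {
        "rank": int(rank),
        "name": name,
        "description": description,
        "score": int(score),
        "tier": tier,
        "stars": int(stars.replace(",", "")),  # "1,411" -> 1411
        "language": None if language == MISSING else language,
    }


# Three rows copied from the table, with columns joined by tabs.
SAMPLE_ROWS = [
    "1\tgunthercox/chatterbot-corpus A multilingual dialog corpus\t76\tVerified\t1,411\tPython",
    "2\tEdinburghNLP/awesome-hallucination-detection List of papers on hallucination detection in LLMs.\t53\tEstablished\t1,060\t\u2014",
    "5\tTomiinek/MultiWOZ_Evaluation Unified MultiWOZ evaluation scripts for the context-to-response task.\t35\tEmerging\t59\tPython",
]

records = [parse_row(row) for row in SAMPLE_ROWS]
tier_counts = Counter(record["tier"] for record in records)
print(tier_counts)
```

Keeping stars and scores as integers makes it straightforward to sort or threshold the records, for example selecting every tool with a score above 70.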

Comparisons in this category

chatterbot-corpus and self_dialogue_corpus (76 vs 38)
chatterbot-corpus and negochat_corpus (76 vs 29)