princeton-nlp/DensePhrases
[ACL 2021] Learning Dense Representations of Phrases at Scale; [EMNLP 2021] Phrase Retrieval Learns Passage Retrieval, Too. https://arxiv.org/abs/2012.12624
DensePhrases supports multi-granularity retrieval of phrases, sentences, passages, and documents, using dense vectors indexed over billions of Wikipedia phrases. It also supports downstream tasks such as entity linking and knowledge-grounded dialogue. The encoders are transformer-based, with pre-trained models available via Hugging Face. Because the index stores phrase-level representations, results can be retrieved in real time and then aggregated at coarser levels such as passages or documents. The system integrates with open-domain QA pipelines and is reported to be effective for slot filling and document retrieval without task-specific fine-tuning.
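The idea of aggregating phrase-level results at a coarser level can be illustrated with a minimal sketch. This is not the DensePhrases API: the function name `retrieve_passages`, the toy vectors, and the passage ids are all hypothetical, and the max-over-phrases scoring rule is an assumption chosen for illustration. It scores every phrase vector against a query by inner product, keeps the top phrases, and gives each passage the score of its best phrase.

```python
from typing import Dict, List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def retrieve_passages(
    query_vec: Sequence[float],
    phrase_vecs: List[Sequence[float]],
    phrase_to_passage: List[str],
    top_k_phrases: int = 4,
    top_k_passages: int = 2,
) -> List[Tuple[str, float]]:
    """Hypothetical helper: rank passages by their best-matching phrase."""
    # Score every phrase against the query (brute force; a real system
    # would use an approximate nearest-neighbor index instead).
    scored = [(dot(query_vec, v), i) for i, v in enumerate(phrase_vecs)]
    # Keep only the highest-scoring phrases.
    scored.sort(key=lambda pair: -pair[0])
    top_phrases = scored[:top_k_phrases]

    # Aggregate: each passage takes the maximum score among its phrases.
    passage_scores: Dict[str, float] = {}
    for score, idx in top_phrases:
        pid = phrase_to_passage[idx]
        if pid not in passage_scores or score > passage_scores[pid]:
            passage_scores[pid] = score

    ranked = sorted(passage_scores.items(), key=lambda kv: -kv[1])
    return ranked[:top_k_passages]


# Toy data (made up): five phrase vectors belonging to three passages.
phrase_vecs = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9], [0.5, 0.5]]
phrase_to_passage = ["p0", "p0", "p1", "p1", "p2"]
query_vec = [1.0, 0.0]

ranked = retrieve_passages(query_vec, phrase_vecs, phrase_to_passage)
print(ranked)
```

The same pattern extends to documents by mapping each phrase to a document id instead of a passage id, which is why a single phrase-level index can serve several retrieval granularities.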
606 stars. No commits in the last 6 months.
Stars: 606
Forks: 75
Language: Python
License: Apache-2.0
Category:
Last pushed: Jun 15, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/princeton-nlp/DensePhrases"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
ymcui/cmrc2018
A Span-Extraction Dataset for Chinese Machine Reading Comprehension (CMRC 2018)
thunlp/MultiRD
Code and data of the AAAI-20 paper "Multi-channel Reverse Dictionary Model"
IndexFziQ/KMRC-Papers
A list of recent papers regarding knowledge-based machine reading comprehension.
danqi/rc-cnn-dailymail
CNN/Daily Mail Reading Comprehension Task
ShiZhengyan/StepGame
[AAAI 2022] Dataset and pytorch codes for the paper titled "StepGame: A New Benchmark for Robust...