liushunyu/awesome-direct-preference-optimization
A Survey of Direct Preference Optimization (DPO)
Curates 250+ peer-reviewed papers organized by a novel taxonomy that decomposes DPO methodologies across four dimensions: data strategy, learning framework, constraint mechanisms, and model properties. Provides systematic categorization of DPO variations spanning data quality and preference feedback approaches, learning paradigms and objectives, reference model constraints and safety mechanisms, and generation/optimization properties. Bridges foundational DPO work with recent extensions including heterogeneous preference handling, dynamic weighting schemes, and robustness improvements.
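For context on the method the survey covers: the core DPO objective trains a policy directly on preference pairs by comparing its log-ratios against a frozen reference model. Below is a minimal plain-Python sketch of the per-example loss, assuming the per-sequence log-probabilities are already computed; the function name and inputs are illustrative, not taken from the repository.

```python
import math

def dpo_loss(policy_logp_chosen: float, policy_logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """Per-example DPO loss from sequence log-probabilities."""
    # Implicit rewards: beta-scaled log-ratios of policy vs. reference.
    chosen_reward = beta * (policy_logp_chosen - ref_logp_chosen)
    rejected_reward = beta * (policy_logp_rejected - ref_logp_rejected)
    margin = chosen_reward - rejected_reward
    # Negative log-sigmoid of the reward margin (Bradley-Terry likelihood),
    # written as log1p(exp(-m)) for numerical stability.
    return math.log1p(math.exp(-margin))

# Zero margin (policy identical to reference): loss = log 2.
print(dpo_loss(-4.0, -6.0, -4.0, -6.0))  # -> 0.6931...
```

A positive margin (the policy favors the chosen response more than the reference does) drives the loss below log 2, which is what gradient descent on this objective encourages.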
No commits in the last 6 months.
Stars: 91
Forks: —
Language: —
License: —
Category: —
Last pushed: Jul 04, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/liushunyu/awesome-direct-preference-optimization"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000/day.
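The curl call above can also be made from Python. A minimal sketch with only the standard library; the URL path is taken from the example, but the JSON response shape and the mechanism for passing an API key are not documented here, so this only shows the anonymous request.

```python
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"

def quality_url(owner: str, repo: str) -> str:
    # Mirrors the curl example: /{owner}/{repo} appended to the base path.
    return f"{BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    # Live network call; anonymous access is limited to 100 requests/day.
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

print(quality_url("liushunyu", "awesome-direct-preference-optimization"))
```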
Higher-rated alternatives
codelion/pts
Pivotal Token Search
RLHFlow/Directional-Preference-Alignment
Directional Preference Alignment
dannylee1020/openpo
Building synthetic data for preference tuning
DtYXs/Pre-DPO
Pre-DPO: Improving Data Utilization in Direct Preference Optimization Using a Guiding Reference Model
Rahulkumar010/microDPO
microDPO: A minimalist, pure PyTorch implementation of Direct Preference Optimization. Inspired...