Rahulkumar010/microDPO
microDPO: A minimalist, pure PyTorch implementation of Direct Preference Optimization. Inspired by nanoGPT, it strips away massive RLHF libraries to reveal the elegant math behind AI alignment. Demystify how LLMs learn human preferences with a single, highly readable file. Train a tiny aligned model on your laptop in minutes.
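The DPO objective the description refers to can be written down in a few lines. The sketch below is an illustrative, minimal implementation of the standard DPO loss for a single preference pair, not code taken from this repository; the function name and parameters are hypothetical, and plain `math` is used instead of PyTorch to keep it self-contained.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) preference pair.

    Inputs are log-probabilities of the chosen and rejected responses
    under the policy being trained and under a frozen reference model.
    """
    # Implicit rewards: beta-scaled log-ratios against the reference model.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # Loss is -log sigmoid(margin): small when the chosen response
    # outscores the rejected one by a wide margin.
    margin = chosen_reward - rejected_reward
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy matches the reference exactly, the margin is zero and the loss is ln 2; moving probability mass toward the chosen response drives it lower.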
Stars: 1
Forks: —
Language: Python
License: MIT
Category:
Last pushed: Mar 16, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Rahulkumar010/microDPO"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
codelion/pts: Pivotal Token Search
RLHFlow/Directional-Preference-Alignment: Directional Preference Alignment
dannylee1020/openpo: Building synthetic data for preference tuning
DtYXs/Pre-DPO: Pre-DPO: Improving Data Utilization in Direct Preference Optimization Using a Guiding Reference Model
pspdada/Uni-DPO: [ICLR 2026] Official repository of "Uni-DPO: A Unified Paradigm for Dynamic Preference...