jackaduma/ChatGLM-LoRA-RLHF-PyTorch

A full pipeline to fine-tune the ChatGLM LLM with LoRA and RLHF on consumer hardware. An implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the ChatGLM architecture. Basically ChatGPT, but with ChatGLM.

/ 100

Experimental

140 stars. No commits in the last 6 months.

Stale 6m · No Package · No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 9 / 25

Community 10 / 25


Stars

140

Forks

Language

Python

License

MIT

Category

rlhf-alignment-training

Last pushed

Apr 28, 2023

Commits (30d)


RLHF Alignment Training · 106 models

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/jackaduma/ChatGLM-LoRA-RLHF-PyTorch"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
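The same request can be made from a script. The following is a minimal sketch in Python, not an official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, and treating `transformers` as a swappable path segment is an assumption based only on the URL shown above. The response format is not documented here, so the sketch returns the raw body as text rather than parsing it.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner: str, repo: str, domain: str = "transformers") -> str:
    """Build the endpoint URL for one repository.

    `domain` defaults to the segment seen in the curl example; whether other
    values are accepted is an assumption, not something the page states.
    """
    # Percent-encode each path segment so unusual characters stay valid.
    parts = [quote(part, safe="") for part in (domain, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(owner: str, repo: str, domain: str = "transformers",
                  timeout: float = 10.0) -> str:
    """Fetch the quality data and return the raw response body as text."""
    url = build_quality_url(owner, repo, domain)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_quality("jackaduma", "ChatGLM-LoRA-RLHF-PyTorch")` issues the same GET request as the curl command. Keyless access is limited to 100 requests/day, so a script that polls many repositories would need to pace its calls or cache results.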

Higher-rated alternatives

agentscope-ai/Trinity-RFT

Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement...

OpenRLHF/OpenRLHF

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO &...

huggingface/alignment-handbook

Robust recipes to align language models with human and AI preferences

zjunlp/EasyEdit

[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.

PKU-Alignment/align-anything

Align Anything: Training All-modality Model with Feedback
