MSWagner/qwen-lora-grpo-letter-counting
Fine-tuning the Qwen2.5-3B-Instruct model with LoRA (Low-Rank Adaptation) and Group Relative Policy Optimization (GRPO)
Stars: —
Forks: —
Language: Jupyter Notebook
License: —
Category:
Last pushed: Feb 08, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/MSWagner/qwen-lora-grpo-letter-counting"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
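If you query the endpoint for several repositories, it can help to build the URL programmatically. This is a minimal sketch assuming only the URL pattern visible in the curl example above; the `quality_url` helper and the idea that the category segment (here `generative-ai`) varies per repository are assumptions, not part of the documented API.

```python
# Hypothetical helper for constructing the quality-data API URL.
# Only the URL pattern from the curl example above is assumed.
from urllib.parse import quote

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Return the API URL for a repository's quality data."""
    # quote() percent-encodes any characters unsafe in a path segment
    return f"{BASE}/{quote(category)}/{quote(owner)}/{quote(repo)}"

print(quality_url("generative-ai", "MSWagner", "qwen-lora-grpo-letter-counting"))
# → https://pt-edge.onrender.com/api/v1/quality/generative-ai/MSWagner/qwen-lora-grpo-letter-counting
```

You could then pass the result to `curl` or `urllib.request.urlopen` as in the example above.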
Higher-rated alternatives
daekeun-ml/genai-ko-LLM
This hands-on lab walks you through a step-by-step approach to efficiently serving and...
GURPREETKAURJETHRA/Llama-3-ORPO-Fine-Tuning
Llama 3 ORPO Fine Tuning on A100 in Colab Pro.
ramalamadingdong/onnx-rubikpi
ONNX LLM runtime on RUBIK-Pi with Gemma 1B and Llama 3.2 1B
keanteng/sesame-csm-elise
Fine-Tuning Sesame CSM with Elise. Enjoy the voice ( ̄︶ ̄)↗
sukanyabag/Finetuning-Qwen2-7B-VQA-on-Radiology-Scans
This repository fine-tunes the Qwen2-7B VLM to perform VQA (Visual Question...