Abhi0323/Fine-Tuning-LLaMA-2-with-QLORA-and-PEFT
This project fine-tunes the LLaMA-2 model with Quantized Low-Rank Adaptation (QLoRA) and other parameter-efficient fine-tuning (PEFT) techniques to improve its performance on specific NLP tasks. A Streamlit application demonstrates the fine-tuned model in real-time interactive use.
It applies 4-bit quantization with LoRA adapters and fine-tunes on the "mlabonne/guanaco-llama2-1k" dataset, reducing the memory footprint while maintaining task-specific performance by updating only a small set of adapter parameters. The Hugging Face Hub handles model versioning and deployment, and the Streamlit frontend consumes the fine-tuned checkpoint directly for interactive inference without requiring the full 7B-parameter model in memory.
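To make the memory and "selective parameter updates" claims concrete, here is a rough back-of-the-envelope sketch. It is not taken from the repository: the LoRA rank (64) and the choice of adapted projections (two per layer) are assumptions for illustration, and the memory figures count weights only, ignoring quantization constants, activations, and optimizer state. The hidden size and layer count are those of the published LLaMA-2-7B architecture.

```python
# Back-of-the-envelope numbers for QLoRA on a nominal 7B-parameter model.
# RANK and ADAPTED_PER_LAYER are illustrative assumptions, not values
# read from the repository.


def weight_memory_gb(n_params: int, bits_per_param: float) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    The update is factored as B @ A with B of shape (d_out, rank) and
    A of shape (rank, d_in), so the count is rank * (d_in + d_out).
    """
    return rank * (d_in + d_out)


N_PARAMS = 7_000_000_000  # nominal "7B"
HIDDEN = 4096             # LLaMA-2-7B hidden size
N_LAYERS = 32             # LLaMA-2-7B decoder layers
RANK = 64                 # assumed LoRA rank
ADAPTED_PER_LAYER = 2     # assumed: two square attention projections per layer

fp16_gb = weight_memory_gb(N_PARAMS, 16)
int4_gb = weight_memory_gb(N_PARAMS, 4)
trainable = N_LAYERS * ADAPTED_PER_LAYER * lora_params(HIDDEN, HIDDEN, RANK)

if __name__ == "__main__":
    print(f"16-bit weights: {fp16_gb:.1f} GB")
    print(f"4-bit weights:  {int4_gb:.1f} GB")
    print(f"LoRA trainable parameters: {trainable:,}")
    print(f"Share of base model: {trainable / N_PARAMS:.2%}")
```

Under these assumptions the 4-bit base weights take about a quarter of the 16-bit footprint, and the adapters amount to well under 1% of the base parameter count, which is why only the adapters need gradients and optimizer state.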
No commits in the last 6 months.
Stars: 13
Forks: 5
Language: Jupyter Notebook
License: none listed
Category:
Last pushed: Apr 18, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/Abhi0323/Fine-Tuning-LLaMA-2-with-QLORA-and-PEFT"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
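The same request can be made from Python with only the standard library. This is a minimal sketch, not an official client: the `category/owner/repo` path layout is inferred from the single URL above, the function names are hypothetical, and it assumes the endpoint returns a JSON body, which the page does not state.

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is assumed
# to follow the pattern <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and parse the body as JSON.

    Assumption: the response is JSON; its schema is not documented here.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to perform the request.
    print(quality_url(
        "transformers",
        "Abhi0323",
        "Fine-Tuning-LLaMA-2-with-QLORA-and-PEFT",
    ))
```

Calling `fetch_quality("transformers", "Abhi0323", "Fine-Tuning-LLaMA-2-with-QLORA-and-PEFT")` performs the network request and counts against the daily quota mentioned above.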
Higher-rated alternatives
OptimalScale/LMFlow
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
adithya-s-k/AI-Engineering.academy
Mastering Applied AI, One Concept at a Time
jax-ml/jax-llm-examples
Minimal yet performant LLM examples in pure JAX
young-geng/scalax
A simple library for scaling up JAX programs
MaximeRobeyns/bayesian_lora
Bayesian Low-Rank Adaptation for Large Language Models