Abhi0323/Fine-Tuning-LLaMA-2-with-QLORA-and-PEFT

This project fine-tunes the LLaMA-2 model with Quantized Low-Rank Adaptation (QLoRA) and related parameter-efficient fine-tuning (PEFT) techniques to adapt it to specific NLP tasks. The fine-tuned model is demonstrated through a Streamlit application for real-time, interactive use.

Experimental

Applies 4-bit quantization to the base model and trains LoRA adapters on the "mlabonne/guanaco-llama2-1k" dataset, reducing the memory footprint while preserving task-specific performance by updating only the adapter parameters. Uses the Hugging Face Hub for model versioning and deployment; the Streamlit frontend loads the fine-tuned checkpoint directly for interactive inference, without requiring the full 7B-parameter model in memory.
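The memory and parameter savings described above can be made concrete with some rough arithmetic. The sketch below is an illustration only, not the project's code: the nominal 7B parameter count, the LoRA rank of 64, and the choice of two adapted projection matrices per layer are assumptions, while the hidden size of 4096 and the 32 layers are the published LLaMA-2 7B dimensions. It ignores quantization metadata, activations, and optimizer state.

```python
def weight_memory_gb(n_params: int, bits_per_param: int) -> float:
    """Approximate weight storage in GB (10**9 bytes), ignoring quantization metadata."""
    return n_params * bits_per_param / 8 / 1e9


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """A LoRA adapter adds two low-rank matrices: (d_in x rank) and (rank x d_out)."""
    return rank * (d_in + d_out)


N_PARAMS = 7_000_000_000   # nominal "7B" figure (assumption; the exact count differs slightly)
HIDDEN = 4096              # LLaMA-2 7B hidden size
LAYERS = 32                # LLaMA-2 7B transformer layers
RANK = 64                  # assumed LoRA rank, not taken from the project
TARGETS_PER_LAYER = 2      # assumed: two square attention projections adapted per layer

fp16_gb = weight_memory_gb(N_PARAMS, 16)
int4_gb = weight_memory_gb(N_PARAMS, 4)
trainable = LAYERS * TARGETS_PER_LAYER * lora_param_count(HIDDEN, HIDDEN, RANK)
fraction = trainable / N_PARAMS

print(f"16-bit weights: {fp16_gb:.1f} GB")
print(f"4-bit weights:  {int4_gb:.1f} GB")
print(f"LoRA trainable parameters: {trainable:,} ({fraction:.2%} of the base model)")
```

Under these assumptions the frozen base weights shrink by a factor of four, and the adapters that actually receive gradient updates amount to well under one percent of the model.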

No commits in the last 6 months.

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 5 / 25

Maturity 1 / 25

Community 15 / 25

Language: Jupyter Notebook

License: none

Higher-rated alternatives

OptimalScale/LMFlow

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

adithya-s-k/AI-Engineering.academy

Mastering Applied AI, One Concept at a Time

jax-ml/jax-llm-examples

Minimal yet performant LLM examples in pure JAX

young-geng/scalax

A simple library for scaling up JAX programs

MaximeRobeyns/bayesian_lora

Bayesian Low-Rank Adaptation for Large Language Models
