microsoft/LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
Freezes the original weights and trains small rank-decomposition matrices as task-specific adapters, cutting trainable parameters by orders of magnitude (e.g., from 1.5B to 4.7M on DeBERTa) with no added inference latency. Integrates directly with PyTorch models and Hugging Face transformers such as RoBERTa, DeBERTa, and GPT-2, with example implementations for both NLU and NLG tasks. Enables efficient multi-task deployment by storing small per-task checkpoints rather than full model copies.
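The core mechanism can be sketched in a few lines of NumPy. This is an illustration of the low-rank update idea only, not loralib's actual API; the dimensions, scaling factor `alpha`, and function names here are assumptions chosen for the example.

```python
import numpy as np

# Illustrative sketch of LoRA (not loralib's API): the pretrained weight W
# is frozen; only the low-rank factors A and B (rank r << d) are trained,
# so a per-task checkpoint needs to store just A and B.
rng = np.random.default_rng(0)
d, r = 1024, 8

W = rng.standard_normal((d, d))          # frozen pretrained weight
A = rng.standard_normal((d, r)) * 0.01   # trainable down-projection
B = np.zeros((r, d))                     # trainable up-projection (init to 0)
alpha = 16                               # scaling hyperparameter (assumed value)

def forward(x):
    # Adapted layer: original output plus the scaled low-rank update.
    return x @ W + (alpha / r) * (x @ A @ B)

# Trainable parameters shrink from d*d to 2*d*r per adapted layer.
full_params = d * d        # 1,048,576
lora_params = 2 * d * r    # 16,384
```

Because `B` is initialized to zero, the adapted layer starts out identical to the frozen pretrained layer, and at inference time `A @ B` can be merged into `W` so no extra latency is incurred.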
13,320 stars and 207,985 monthly downloads. Used by 4 other packages. No commits in the last 6 months. Available on PyPI.
Stars
13,320
Forks
888
Language
Python
License
MIT
Last pushed
Dec 17, 2024
Monthly downloads
207,985
Commits (30d)
0
Reverse dependents
4
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/microsoft/LoRA"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related models
jadore801120/attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
bhavnicksm/vanilla-transformer-jax
JAX/Flax implementation of 'Attention Is All You Need' by Vaswani et al....
AbdelStark/attnres
Rust implementation of Attention Residuals from MoonshotAI/Kimi
kyegomez/SparseAttention
Pytorch Implementation of the sparse attention from the paper: "Generating Long Sequences with...
sunnynguyen-ai/llm-attention-visualizer
Interactive tool for analyzing attention patterns in transformer models with layer-wise...