microsoft/LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
Freezes the original weights and trains small rank-decomposition matrices as task-specific adapters, cutting trainable parameters by orders of magnitude (e.g., from 1.5B to 4.7M on DeBERTa) with no added inference latency. Integrates directly with PyTorch models and Hugging Face transformers such as RoBERTa, DeBERTa, and GPT-2, with example implementations for both NLU and NLG tasks. Enables efficient multi-task deployment by storing small per-task checkpoints rather than full model copies.
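The core mechanism can be sketched in a few lines of NumPy. This is an illustration of the low-rank update idea only, not loralib's actual API; the dimensions, scaling factor `alpha`, and function names here are assumptions chosen for the example.

```python
import numpy as np

# Illustrative sketch of LoRA (not loralib's API): the pretrained weight W
# is frozen; only the low-rank factors A and B (rank r << d) are trained,
# so a per-task checkpoint needs to store just A and B.
rng = np.random.default_rng(0)
d, r = 1024, 8

W = rng.standard_normal((d, d))          # frozen pretrained weight
A = rng.standard_normal((d, r)) * 0.01   # trainable down-projection
B = np.zeros((r, d))                     # trainable up-projection (init to 0)
alpha = 16                               # scaling hyperparameter (assumed value)

def forward(x):
    # Adapted layer: original output plus the scaled low-rank update.
    return x @ W + (alpha / r) * (x @ A @ B)

# Trainable parameters shrink from d*d to 2*d*r per adapted layer.
full_params = d * d        # 1,048,576
lora_params = 2 * d * r    # 16,384
```

Because `B` is initialized to zero, the adapted layer starts out identical to the frozen pretrained layer, and at inference time `A @ B` can be merged into `W` so no extra latency is incurred.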
13,320 stars and 207,985 monthly downloads. Used by 4 other packages. No commits in the last 6 months. Available on PyPI.
Stars
13,320
Forks
888
Language
Python
License
MIT
Last pushed
Dec 17, 2024
Monthly downloads
207,985
Commits (30d)
0
Reverse dependents
4
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/microsoft/LoRA"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related models
jadore801120/attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
bhavnicksm/vanilla-transformer-jax
JAX/Flax implementation of 'Attention Is All You Need' by Vaswani et al....
AbdelStark/attnres
Rust implementation of Attention Residuals from MoonshotAI/Kimi
kyegomez/SparseAttention
Pytorch Implementation of the sparse attention from the paper: "Generating Long Sequences with...
sunnynguyen-ai/llm-attention-visualizer
Interactive tool for analyzing attention patterns in transformer models with layer-wise...