LlamaFactory and lorax
The two tools are complementary: LlamaFactory, a unified fine-tuning framework, enables efficient creation of LoRA fine-tuned models, while lorax, a multi-LoRA inference server, serves those fine-tuned models at scale.
About LlamaFactory
hiyouga/LlamaFactory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Supports modular fine-tuning approaches including supervised fine-tuning, reward modeling, and reinforcement learning methods (PPO, DPO, KTO, ORPO), with optimizations like Flash Attention, quantized LoRA, and advanced optimizers (GaLore, BAdam, Muon). Provides both CLI and Gradio web interface for model training and inference, integrating with vLLM/SGLang for OpenAI-compatible API deployment.
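As a sketch of how a LoRA supervised fine-tuning run is typically configured, the framework accepts a YAML file passed to its CLI (`llamafactory-cli train config.yaml`). The model name, dataset name, and hyperparameter values below are illustrative placeholders, not recommendations:

```yaml
# Hypothetical LoRA SFT config for llamafactory-cli train.
# Model and dataset names are placeholders; swap in your own.
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
stage: sft                  # supervised fine-tuning
do_train: true
finetuning_type: lora       # train LoRA adapters, not full weights
lora_target: all            # apply LoRA to all linear layers
dataset: alpaca_en_demo     # a dataset registered with the framework
template: llama3
cutoff_len: 1024
output_dir: saves/llama3-8b-lora-sft
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
```

The resulting adapter weights in `output_dir` are exactly the kind of artifact a multi-LoRA server like lorax can then load alongside many others.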
About lorax
predibase/lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
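The key idea is that one running server hosts a single base model and applies a per-request LoRA adapter chosen by the client. A minimal sketch of building such a request payload, assuming a locally running server and a placeholder adapter ID (the URL and adapter name are illustrative):

```python
import json

# Placeholder endpoint for a locally running multi-LoRA inference server.
LORAX_URL = "http://localhost:8080/generate"

def build_generate_payload(prompt, adapter_id=None, max_new_tokens=64):
    """Build a JSON request body; adapter_id selects which fine-tuned
    LoRA adapter to apply on top of the shared base model."""
    parameters = {"max_new_tokens": max_new_tokens}
    if adapter_id is not None:
        parameters["adapter_id"] = adapter_id
    return {"inputs": prompt, "parameters": parameters}

payload = build_generate_payload(
    "What is LoRA?", adapter_id="my-org/my-lora-adapter"
)
print(json.dumps(payload))

# Sending it requires a running server, e.g.:
# import requests
# resp = requests.post(LORAX_URL, json=payload, timeout=60)
```

Because the adapter is named per request, thousands of fine-tuned variants can share one GPU deployment instead of each needing a dedicated replica.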