airllm and Chinese-LLaMA-Alpaca
These projects are complements: AirLLM provides memory-efficient inference techniques (layer-by-layer weight loading, quantization, offloading) that can deploy large models on resource-constrained hardware, while Chinese-LLaMA-Alpaca provides Chinese-adapted model weights and training procedures. Combined, they could enable efficient inference of Chinese-adapted models on modest GPUs.
About airllm
lyogavin/airllm
AirLLM 70B inference with single 4GB GPU
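The core idea behind running a 70B model on a 4GB GPU is layered inference: only one transformer layer's weights are held in memory at a time, loaded on demand and released after use. Below is a minimal toy sketch of that idea; the names (`load_layer`, `TOY_LAYERS`, `run_layered`) are illustrative placeholders, not AirLLM's actual API, and the "layers" are plain matrices rather than transformer blocks.

```python
# Toy sketch of AirLLM-style layered inference: only one layer's
# weights are resident in memory at a time. Names here are
# illustrative, not AirLLM's real API.
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for per-layer weight shards on disk: a 70B model would
# store each layer separately and read it on demand.
TOY_LAYERS = [rng.standard_normal((8, 8)) * 0.1 for _ in range(4)]

def load_layer(i):
    """Pretend to read layer i's weights from storage."""
    return TOY_LAYERS[i]

def run_layered(x):
    """Apply each layer sequentially, freeing weights after use."""
    h = x
    for i in range(len(TOY_LAYERS)):
        w = load_layer(i)   # bring one layer into memory
        h = np.tanh(h @ w)  # forward pass through that layer
        del w               # release before loading the next
    return h

hidden = run_layered(rng.standard_normal(8))
print(hidden.shape)  # (8,)
```

Peak memory is thus bounded by the largest single layer plus activations, not the full model, at the cost of extra I/O per forward pass.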
About Chinese-LLaMA-Alpaca
ymcui/Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
Extends LLaMA's tokenizer with a dedicated Chinese vocabulary and continues pretraining on a Chinese corpus to improve semantic understanding, while the Alpaca variants are instruction-tuned for dialogue tasks. Supports seamless integration with major frameworks (transformers, llama.cpp, LangChain, text-generation-webui) and includes quantization pipelines enabling efficient inference on consumer-grade CPUs and GPUs. Provides open-source training scripts and model variants (7B/13B/33B), with specialized Plus and Pro editions optimized for response quality and length.
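The vocabulary-extension step described above can be sketched with a toy example: new Chinese tokens are appended after the base vocabulary so that every original token keeps its ID (which is what lets the extended tokenizer stay compatible with the pretrained embeddings). This is illustrative only; the real project merges SentencePiece models, not plain dicts, and the tokens and helper name below are made up.

```python
# Toy sketch of extending a base vocabulary with Chinese tokens,
# in the spirit of Chinese-LLaMA-Alpaca's tokenizer extension.
# Illustrative only: the real project merges SentencePiece models.

base_vocab = {"<s>": 0, "</s>": 1, "the": 2, "model": 3}
chinese_tokens = ["中文", "模型", "训练"]

def extend_vocab(vocab, new_tokens):
    """Append new tokens after existing IDs so base IDs stay stable."""
    merged = dict(vocab)
    next_id = max(vocab.values()) + 1
    for tok in new_tokens:
        if tok not in merged:  # skip tokens already present
            merged[tok] = next_id
            next_id += 1
    return merged

merged = extend_vocab(base_vocab, chinese_tokens)
print(merged["中文"])  # 4
```

After extension, the model's embedding matrix is grown by the same number of rows and the new rows are trained during continued pretraining, while the original rows start from the base LLaMA weights.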