laelhalawani/glai
glai - GGUF LLAMA AI - A package that simplifies model handling and text generation with Llama models quantized to GGUF format. It provides APIs for downloading and loading models automatically, and includes a database of models at various scales and quantization levels. With this high-level API, one line loads the model and one generates a text completion.
No commits in the last 6 months. Available on PyPI.
Stars: 6
Forks: —
Language: Python
License: —
Category: —
Last pushed: Jan 14, 2024
Monthly downloads: 50
Commits (30d): 0
Dependencies: 4
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/laelhalawani/glai"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
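The same endpoint can be called from Python. The sketch below builds the URL from the repo path shown in the curl example and fetches it with the standard library; the response schema and any API-key mechanism are not documented here, so the fetch helper is a minimal assumption-laden sketch rather than an official client.

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(repo_path: str) -> str:
    # Build the endpoint URL for a repo path such as
    # "transformers/laelhalawani/glai" (format taken from the curl example).
    return f"{API_BASE}/{repo_path}"


def fetch_quality(repo_path: str, timeout: float = 10.0) -> dict:
    # Anonymous access is rate-limited to 100 requests/day; how a key is
    # supplied (header vs. query parameter) is not specified here, so this
    # sketch makes an unauthenticated request and parses the JSON body.
    with urllib.request.urlopen(quality_url(repo_path), timeout=timeout) as resp:
        return json.load(resp)
```

For example, `fetch_quality("transformers/laelhalawani/glai")` requests the same data as the curl command above.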
Higher-rated alternatives
intel/auto-round
🎯 An accuracy-first, highly efficient quantization toolkit for LLMs, designed to minimize quality...
ModelCloud/GPTQModel
LLM model quantization (compression) toolkit with hw acceleration support for Nvidia CUDA, AMD...
pytorch/ao
PyTorch native quantization and sparsity for training and inference
Picovoice/picollm
On-device LLM Inference Powered by X-Bit Quantization
NVIDIA/kvpress
LLM KV cache compression made easy