mudler/LocalAI
:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more. Features: Generate Text, MCP, Audio, Video, Images, Voice Cloning, Distributed, P2P and decentralized inference
Built on a modular, backend-agnostic architecture that loads inference engines (llama.cpp, vLLM, MLX, Whisper, Diffusers) as separate processes rather than in-process, enabling lightweight resource management and automatic GPU detection across NVIDIA, AMD, Intel, and Apple Silicon. Exposes OpenAI- and Anthropic-compatible REST APIs while supporting advanced features such as the Model Context Protocol (MCP) for autonomous agents, constrained grammar generation, distributed inference via P2P and RDMA, and real-time speech-to-speech with tool calling via WebRTC.
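Because the REST API is OpenAI-compatible, existing OpenAI client code works against a local instance by pointing it at LocalAI's base URL. A minimal sketch, assuming LocalAI is running on its default port (8080) and that a model alias named "gpt-4" has been installed (both are assumptions about your setup):

```python
# Build an OpenAI-style chat completion request for a local LocalAI server.
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1/chat/completions"  # LocalAI's default port

payload = {
    "model": "gpt-4",  # hypothetical alias; substitute whatever model you installed
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.7,
}

req = urllib.request.Request(
    BASE_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment to send the request against a running LocalAI instance:
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)
#     print(reply["choices"][0]["message"]["content"])
```

The same request shape works for any OpenAI-compatible backend, which is what makes LocalAI a drop-in replacement: only the base URL changes.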
43,530 stars. Actively maintained with 253 commits in the last 30 days.
Stars: 43,530
Forks: 3,679
Language: Go
License: MIT
Last pushed: Mar 13, 2026
Commits (30d): 253
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/mudler/LocalAI"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related models
withcatai/node-llama-cpp
Run AI models locally on your machine with node.js bindings for llama.cpp. Enforce a JSON schema...
ludwig-ai/ludwig
Low-code framework for building custom LLMs, neural networks, and other AI models
bentoml/OpenLLM
Run any open-source LLMs, such as DeepSeek and Llama, as an OpenAI-compatible API endpoint in the cloud.
SciSharp/LLamaSharp
A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.
zhudotexe/kani
kani (カニ) is a highly hackable microframework for tool-calling language models. (NLP-OSS @ EMNLP 2023)