mudler/LocalAI
:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more. Features: Generate Text, MCP, Audio, Video, Images, Voice Cloning, Distributed, P2P and decentralized inference
Built on a modular, backend-agnostic architecture that loads inference engines (llama.cpp, vLLM, MLX, Whisper, Diffusers) as separate processes rather than in-process, enabling lightweight resource management and automatic GPU detection across NVIDIA, AMD, Intel, and Apple Silicon. Exposes OpenAI- and Anthropic-compatible REST APIs while supporting advanced features such as the Model Context Protocol (MCP) for autonomous agents, constrained grammar generation, distributed inference via P2P and RDMA, and real-time speech-to-speech with tool calling via WebRTC.
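Because the REST API is OpenAI-compatible, existing OpenAI client code works against a local instance by pointing it at LocalAI's base URL. A minimal sketch, assuming LocalAI is running on its default port (8080) and that a model alias named "gpt-4" has been installed (both are assumptions about your setup):

```python
# Build an OpenAI-style chat completion request for a local LocalAI server.
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1/chat/completions"  # LocalAI's default port

payload = {
    "model": "gpt-4",  # hypothetical alias; substitute whatever model you installed
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.7,
}

req = urllib.request.Request(
    BASE_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment to send the request against a running LocalAI instance:
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)
#     print(reply["choices"][0]["message"]["content"])
```

The same request shape works for any OpenAI-compatible backend, which is what makes LocalAI a drop-in replacement: only the base URL changes.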
43,530 stars. Actively maintained with 253 commits in the last 30 days.
Stars: 43,530
Forks: 3,679
Language: Go
License: MIT
Last pushed: Mar 13, 2026
Commits (30d): 253
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/mudler/LocalAI"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related models
withcatai/node-llama-cpp
Run AI models locally on your machine with node.js bindings for llama.cpp. Enforce a JSON schema...
ludwig-ai/ludwig
Low-code framework for building custom LLMs, neural networks, and other AI models
bentoml/OpenLLM
Run any open-source LLMs, such as DeepSeek and Llama, as an OpenAI-compatible API endpoint in the cloud.
SciSharp/LLamaSharp
A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.
zhudotexe/kani
kani (カニ) is a highly hackable microframework for tool-calling language models. (NLP-OSS @ EMNLP 2023)