LLM Docker Deployments (LLM Tools)
Docker containerization and deployment solutions for running LLMs, inference servers, and related AI services locally or on networks. Does NOT include general containerization tools, Kubernetes orchestration, or non-LLM Docker projects.
There are 150 LLM Docker deployment tools tracked. One scores above 70 (Verified tier). The highest-rated is containers/ramalama at 82/100 with 2,640 stars. Four of the top 10 are actively maintained.
Get all 150 projects as JSON (the example below requests the first 20; raise the `limit` parameter to fetch more):
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=llm-docker-deployments&limit=20"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
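To sketch how a client might consume the endpoint above, here is a minimal Python snippet that filters a response by score. The response shape and the field names (`results`, `score`, `tier`), along with every score except ramalama's 82, are placeholder assumptions, since the actual API schema is not documented on this page.

```python
import json

# Hypothetical response shape for the quality-dataset endpoint.
# The real schema may differ; all scores except ramalama's 82 are made up.
payload = json.loads("""
{
  "domain": "llm-tools",
  "subcategory": "llm-docker-deployments",
  "results": [
    {"name": "containers/ramalama", "score": 82, "tier": "Verified"},
    {"name": "av/harbor",           "score": 64, "tier": "Established"},
    {"name": "mitja/llamatunnel",   "score": 41, "tier": "Emerging"}
  ]
}
""")

# Keep only projects above the 70-point threshold for the Verified tier.
verified = [p["name"] for p in payload["results"] if p["score"] > 70]
print(verified)  # ['containers/ramalama']
```

In a real client you would replace the inline JSON with the body returned by the curl command above.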
| # | Tool | Description | Tier |
|---|---|---|---|
| 1 | containers/ramalama | RamaLama is an open-source developer tool that simplifies the local serving... | Verified |
| 2 | av/harbor | One command brings a complete pre-wired LLM stack with hundreds of services... | Established |
| 3 | RunanywhereAI/runanywhere-sdks | Production-ready toolkit to run AI locally | Established |
| 4 | runpod-workers/worker-vllm | The RunPod worker template for serving our large language model endpoints... | Established |
| 5 | vtuber-plan/olah | Self-hosted huggingface mirror service. | Established |
| 6 | foldl/chatllm.cpp | Pure C++ implementation of several models for real-time chatting on your... | Established |
| 7 | quantalogic/qllm | QLLM: A powerful CLI for seamless interaction with multiple Large Language... | Established |
| 8 | eastriverlee/LLM.swift | LLM.swift is a simple and readable library that allows you to interact with... | Established |
| 9 | varunvasudeva1/llm-server-docs | End-to-end documentation to set up your own local & fully private LLM server... | Established |
| 10 | dingodb/dingospeed | dingospeed is a self-hosted huggingface mirror service | Established |
| 11 | Scottcjn/llama-cpp-power8 | AltiVec/VSX-optimized llama.cpp for IBM POWER8 | Established |
| 12 | lordmathis/llamactl | Unified management and routing for llama.cpp, MLX and vLLM models with web dashboard. | Established |
| 13 | sangyuxiaowu/LLamaWorker | LLamaWorker is an HTTP API server built on the LLamaSharp project... | Emerging |
| 14 | liltom-eth/llama2-webui | Run any Llama 2 locally with Gradio UI on GPU or CPU from anywhere... | Emerging |
| 15 | jlonge4/local_llama | Showcases how you can run a model locally and offline, free... | Emerging |
| 16 | France-Travail/happy_vllm | A production-ready REST API for vLLM | Emerging |
| 17 | FarisZahrani/llama-cpp-py-sync | Auto-synced CFFI ABI Python bindings for llama.cpp with prebuilt wheels... | Emerging |
| 18 | ADT109119/llamacpp-distributed-inference | A distributed LLM inference application based on llama.cpp that lets multiple computers on a local network collaborate on large-model inference, built with Electron... | Emerging |
| 19 | timhagel/MeloTTS-Docker-API-Server | A Docker image to access MeloTTS through API calls | Emerging |
| 20 | hitomi-team/sukima | A ready-to-deploy container for implementing an easy-to-use REST API to... | Emerging |
| 21 | Mcourtyard/m-courtyard | M-Courtyard: Local AI Model Fine-tuning Assistant for Apple Silicon... | Emerging |
| 22 | feiyun0112/Local-LLM-Server | Quick way to build a private large language model server and provide... | Emerging |
| 23 | cdrage/containerfiles | Containerfiles including AI, game servers, bootc and even a rickroll. | Emerging |
| 24 | icppWorld/icpp_llm | On-chain LLMs | Emerging |
| 25 | wsmlby/homl | The easiest & fastest way to run LLMs in your home lab | Emerging |
| 26 | gsuuon/ad-llama | Structured inference with Llama 2 in your browser | Emerging |
| 27 | ashleykleynhans/runpod-worker-oobabooga | RunPod Serverless Worker for the Oobabooga Text Generation API for LLMs | Emerging |
| 28 | b-data/jupyterlab-mojo-docker-stack | (GPU-accelerated) Multi-arch (linux/amd64, linux/arm64/v8) JupyterLab... | Emerging |
| 29 | john-rocky/EdgeLLM | Simple LLM package for iOS devices. | Emerging |
| 30 | nicksavarese/allora-ios | An iOS Keyboard Extension that allows for interacting with LLMs directly... | Emerging |
| 31 | DanielZhangyc/RLLM | LLM-powered RSS reader | Emerging |
| 32 | ai-action/ollama-action | 🦙 Run Ollama large language models (LLMs) with GitHub Actions. | Emerging |
| 33 | abundant-ai/oddish | Run Harbor tasks in the cloud | Emerging |
| 34 | ruska-ai/llm-server | 🤖 Open-source LLM server (OpenAI, Ollama, Groq, Anthropic) with support for... | Emerging |
| 35 | BlackTechX011/Ollama-in-GitHub-Codespaces | Learn how to run Ollama in GitHub Codespaces for free | Emerging |
| 36 | Flowm/llm-stack | Docker Compose config for local and hosted LLMs with multiple chat interfaces | Emerging |
| 37 | persys-ai/persys | Welcome! | Emerging |
| 38 | sinfallas/opendevin-docker | Run OpenDevin inside Docker | Emerging |
| 39 | Jewelzufo/granitepi-4-nano | Run IBM Granite 4.0 locally on Raspberry Pi 5 with Ollama. This is a... | Emerging |
| 40 | Scottcjn/llama-cpp-tigerleopard | WORLD FIRST: llama.cpp for Mac OS X Tiger & Leopard on PowerPC G4/G5 | Emerging |
| 41 | aws-samples/sample-ollama-server | Ollama on a GPU EC2 instance with the Open WebUI web interface and Bedrock access | Emerging |
| 42 | AbhinaavRamesh/ollama-local-serve | Local LLM infrastructure for distributed AI applications. Serve... | Emerging |
| 43 | ai-action/setup-ollama | 🦙 Set up GitHub Actions with the Ollama CLI | Emerging |
| 44 | teremterem/litellm-server-boilerplate | A lightweight LiteLLM server boilerplate pre-configured with uv and Docker... | Emerging |
| 45 | EvilFreelancer/docker-llama.cpp-rpc | This project is based on llama.cpp and compiles only the RPC server, as well as... | Emerging |
| 46 | heyvaldemar/ollama-traefik-letsencrypt-docker-compose | Ollama with Let's Encrypt using Docker Compose | Emerging |
| 47 | rgryta/LLM-WSL2-Docker | One-click install for WizardLM-13B-Uncensored with the Oobabooga web UI | Emerging |
| 48 | sasha0552/ToriLinux | Linux LiveCD for offline AI training and inference. | Emerging |
| 49 | mitja/llamatunnel | Publish local LLMs and LLM apps on the internet. | Emerging |
| 50 | raketenkater/llm-server | Smart launcher for llama.cpp / ik_llama.cpp — auto-detects GPUs, optimizes... | Experimental |
| 51 | azer/llmcat | Prepare files and directories for LLM consumption | Experimental |
| 52 | crowdllama/crowdllama | CrowdLlama is a distributed system that leverages the open-source Ollama... | Experimental |
| 53 | asreview/asreview-server-stack | Docker Compose for setting up an ASReview server with authentication | Experimental |
| 54 | linonetwo/MOSS-DockerFile | Runs Fudan's MOSS language model in Docker, with a Gradio-based WebUI. | Experimental |
| 55 | toku345/dgx-llm-serve | Docker Compose configs for running LLM inference on DGX Spark (TensorRT-LLM... | Experimental |
| 56 | Scottcjn/power8-projects | POWER8 Projects - Ubuntu 22.04 build, PSE LLM, Darwin cross-compile | Experimental |
| 57 | m1ns09/Llama | 🌐 Run GGUF models directly in your web browser using JavaScript and... | Experimental |
| 58 | GURPREETKAURJETHRA/Ollama-UseCases | This repo collects numerous use cases for the open-source Ollama | Experimental |
| 59 | soulteary/docker-yi-runtime | Local runtime environment for 01.AI's Yi-34B model. | Experimental |
| 60 | alex0dd/llm-app-microservices-template | Template for building microservice-based apps with a frontend, backend, LLM... | Experimental |
| 61 | ivangabriele-archives/docker-llm | Pre-loaded LLMs served as an OpenAI-compatible API via Docker images. | Experimental |
| 62 | codygreen/llm_api_server | Lab to demonstrate how to apply an API to an AI model and secure it. | Experimental |
| 63 | g1ibby/homellm | A simple Docker Compose boilerplate for deploying Open WebUI and LiteLLM... | Experimental |
| 64 | LianHe-BI/Blackwell-optimized-llama.cpp-Docker-image | Blackwell-optimized llama.cpp Docker image – works on all NVIDIA GPUs, but... | Experimental |
| 65 | muhac/llm-actions | Run LLMs for inference in GitHub Actions - add to your workflow! | Experimental |
| 66 | AnLaVN/AL-Library | Java utility library with many features, including Large Language Model support... | Experimental |
| 67 | DataJourneyHQ/list-github-models | GitHub Action to track GitHub Models | Experimental |
| 68 | ivangabriele-archives/docker-functionary | Ready-to-deploy Docker image for the Functionary LLM served as an OpenAI-compatible API. | Experimental |
| 69 | Malax/buildpack-ollama | Cloud Native Buildpack that builds an OCI image with Ollama and a large... | Experimental |
| 70 | wizzard0/llama2.ts | Llama2 inference in one TypeScript file | Experimental |
| 71 | OutofAi/ChitChat | Modal llama.cpp-based LLM deployment as part of a series of Model-as-a... | Experimental |
| 72 | JimKw1kX/LLM-C2-Server | An AI C2 server | Experimental |
| 73 | mordang7/LlamaForge | The Ultimate Command Center for Local LLMs. A professional-grade GUI for... | Experimental |
| 74 | rookiemann/vllm-windows-build | Native Windows build patches for vLLM v0.14.1 — MSVC 2022 + CUDA 12.6, 26... | Experimental |
| 75 | arseniy0924/rpc_manager | Web UI for orchestrating distributed llama.cpp RPC GPU clusters with auto... | Experimental |
| 76 | nyo16/llama_cpp_ex | Elixir bindings for llama.cpp — run LLMs locally with Metal, CUDA, Vulkan,... | Experimental |
| 77 | futursolo/pai | Collection of AI containers, prebuilt and ready to use | Experimental |
| 78 | ggalancs/hfl | CLI + API server to download, manage, and run 500K+ HuggingFace models... | Experimental |
| 79 | mysticrenji/ollama-on-kubernetes | An attempt to run Ollama on Kubernetes | Experimental |
| 80 | micbi-dt/lmstudio-docker | Run LM Studio within a Docker container | Experimental |
| 81 | alasgarovs/openserv | OpenServ is a simple Bash-based CLI tool for managing LLMs in the llama.cpp server. | Experimental |
| 82 | SergiuDeveloper/distributed-llama.cpp | Distributed LLM inference across multiple machines. A central server routes... | Experimental |
| 83 | SuppieRK/local-ai-lab | Offline-capable, open-source AI home lab notes: practical setups, configs,... | Experimental |
| 84 | onidahabitual85/llm-server | Launch and optimize llama.cpp servers automatically across Linux, macOS, and... | Experimental |
| 85 | Skyluker4/llama-runpod | Docker image to run llama.cpp on runpod.io automatically | Experimental |
| 86 | Pavloffm/remote-llm-server | Run Ollama in Docker. Share local LLMs across your network. GPU-accelerated. | Experimental |
| 87 | qnianjinri-del/local-llm-recommender | One-click hardware detection that recommends the latest compatible open-source models and supports one-click deployment. | Experimental |
| 88 | rjxby/llama-runtime | `llama-runtime` is a high-performance inference server designed for local... | Experimental |
| 89 | gsavla6-hue/java-llm-integration | Comprehensive Java LLM integration library supporting OpenAI, Anthropic and... | Experimental |
| 90 | Daaboulex/lmstudio-nix | LM Studio packaged for NixOS — local LLM inference desktop app and server | Experimental |
| 91 | llmjava/hf_text_generation | Hugging Face Text Generation API client for Java | Experimental |
| 92 | EricApgar/llm-server | Host an LLM and make it accessible on a network via API. | Experimental |
| 93 | openradx/llm_api_server_mock | A simple FastAPI-based server mock that implements the OpenAI API. | Experimental |
| 94 | clixgvvv/AndroidLLMServerScript | 📲 Create a local LLM server on Android using Python and llama.cpp for easy... | Experimental |
| 95 | tdiprima/ollama-orchestrator | Self-hosted AI automation: manage Ollama models, deploy Open WebUI in... | Experimental |
| 96 | MooNyeu/granitepi-4-nano | 🔒 Run a large language model locally on your Raspberry Pi 5 with IBM Granite... | Experimental |
| 97 | sithukyaw007/local-ai-workload | Docker-first, local-first AI workload toolkit for macOS Apple Silicon using... | Experimental |
| 98 | Logicish/p-lanes | A modular wrapper for llama.cpp focused on home-lab-scale hardware,... | Experimental |
| 99 | abdulazizalmalki-gh/local-ai | A simple, self-hosted stack for running AI models locally using llama.cpp... | Experimental |
| 100 | byang37/llama-runner | A lightweight desktop GUI for llama-server — multi-model routing, per-model... | Experimental |
| 101 | dmeldrum6/Llama-Forge | Open-source llama.cpp wrapper with server and client | Experimental |
| 102 | gperdrizet/llms-devcontainer | Containerized development environment for LLM-based projects | Experimental |
| 103 | AiratTop/ollama-self-hosted | A simple Docker Compose setup to self-host Ollama and Open WebUI. Run your... | Experimental |
| 104 | b-data/mojo-docker-stack | (GPU-accelerated) Multi-arch (linux/amd64, linux/arm64/v8) MAX/Mojo Docker... | Experimental |
| 105 | mdaconta/xlm-eco-api | Cross Language Model (LLM/SLM/etc.) Ecosystem API (xlm-eco-api) | Experimental |
| 106 | hoonywise/minerva | A private, GPU-accelerated AI stack with Ollama, LangChain, Stable... | Experimental |
| 107 | zyoung11/lmgo | Windows system tray for llama.cpp + ROCm. Optimized for AMD RYZEN AI MAX+... | Experimental |
| 108 | rookiemann/llama-cpp-python-py314-cuda131-wheel | GPU-accelerated llama-cpp-python 0.3.16 wheel for Python 3.14 (CUDA 13.1, Windows) | Experimental |
| 109 | qianniuspace/movie-detectives-server | Llama Movie Detectives (server side) | Experimental |
| 110 | mo-arvan/local-llm | Docker Compose configuration file for running Llama 2 or any other language... | Experimental |
| 111 | yokingma/deepseek-vllm | Deploys DeepSeek models using Docker and the official vLLM image, providing an OpenAI-compatible API in production. | Experimental |
| 112 | kryoz/llama-strix-halo | llama.cpp setup on a dedicated AMD Strix Halo machine | Experimental |
| 113 | ai-action/ollama-github-action-demo | 🦙 Demos of large language models (LLMs) with Ollama in GitHub Actions. | Experimental |
| 114 | FlorinAndrei/local-inference-docs | Run generative AI locally, on your hardware, for coding and other purposes | Experimental |
| 115 | ThomasVitale/llm-images | Catalog of OCI images for popular open-source or open Large Language Models. | Experimental |
| 116 | stlin256/llama-remote | A web-based remote control panel for managing llama.cpp instances. Monitor... | Experimental |
| 117 | abhiFSD/llama.cpp-Monitor-Dashboard | ⚡ Real-time monitoring dashboard for the llama.cpp server — single HTML file,... | Experimental |
| 118 | somya-droid/Pirate-LLM-Server | Run local LLM servers on iPhone with an OpenAI-compatible API, Metal GPU... | Experimental |
| 119 | ebowwa-archive/LLM_telecenter | A FastAPI wrapper of babca/python-gsmmodem for a Waveshare SIM7600X. Not... | Experimental |
| 120 | sebicom/llamacpp4j | Java wrapper for llama.cpp | Experimental |
| 121 | nishantapatil3/litellm-compose | Docker Compose setup for the LiteLLM proxy server with PostgreSQL and Prometheus... | Experimental |
| 122 | beeracs/Llama | Run Llama models in your web browser using JavaScript and WebAssembly... | Experimental |
| 123 | llmjava/llm4j | One API to access Large Language Models in Java | Experimental |
| 124 | andrewginns/LocalLLM | Configurations for a locally hosted LLM and applications leveraging it | Experimental |
| 125 | VityazevEgor/LLMapi4free | LLMapi4free provides a unified API for free access to various large language... | Experimental |
| 126 | buckyinsfo/homelab-ai-stack | Self-hosted AI + GPU server homelab — local LLM inference, vector search,... | Experimental |
| 127 | Weebaay/local-ai-homelab | Deployment of a local AI server on an Ubuntu Server 24.04 VM with Ollama and... | Experimental |
| 128 | mendhak/local-llm-workspace | Private, secure, containerized LLM environment for chat and coding. Using... | Experimental |
| 129 | Riju007/dev-knowledge-vault | 🧠 My second brain — hands-on engineering notes on Docker, AI, Python and beyond | Experimental |
| 130 | chaserbot/chaseworkslab-llm | Self-hosted LLM stack (Ollama, Open WebUI, etc.) for the homelab | Experimental |
| 131 | cyberguard-ai/local-llm-server | A containerized, offline-capable LLM API powered by Ollama. Automatically... | Experimental |
| 132 | nishant-sethi/python-ai-extension-server | Python server for using local LLMs | Experimental |
| 133 | thkox/home-ai-server | Home AI Server provides the backend infrastructure for the Home AI system... | Experimental |
| 134 | sinfallas/llm-local-loader-docker | Docker Compose to load Ollama, Flowise, Langfuse, and Open WebUI | Experimental |
| 135 | 57Ajay/model-runner | A simple model runner using llama.cpp and Hugging Face | Experimental |
| 136 | gustavostz/Local-AI-Open-Orca-For-Dummies | Local AI Open Orca For Dummies is a user-friendly guide to running Large... | Experimental |
| 137 | FarzamMohammadi/self-hosted-ai-stack | Blog resources for building a self-hosted AI infrastructure. Contains all... | Experimental |
| 138 | merlijn/scala-llm-api | Basic OpenAI client for Scala | Experimental |
| 139 | wronai/docker-platform | Enterprise-grade secure media storage with AI analysis, role-based access,... | Experimental |
| 140 | AntonSHBK/llm_service | A FastAPI-based microservice for interacting with LLMs (OpenAI API) with... | Experimental |
| 141 | ai-action/ai-inference-demo | AI inference in GitHub Actions demo | Experimental |
| 142 | yeeking/llamacpp-minimal-example | Minimal example of using llama.cpp as a library from C++ | Experimental |
| 143 | aayes89/JavaRNN-LLM | An RNN written in pure Java to compete with Transformers | Experimental |
| 144 | theomart/llm-based-api-template | 🐣 A template to deploy an LLM-based API to Cloud Run, using FastAPI, Docker... | Experimental |
| 145 | desdeux/llama2odin | Llama2.c port in Odin | Experimental |
| 146 | Doculoom/doculoom-server | LLM-backed API server | Experimental |
| 147 | abhishekrana/llm-service | RESTful service with LLMs (Large Language Models) running locally | Experimental |
| 148 | turtleio/turtle | 🐰 shoulda been an app - 🐢 | Experimental |
| 149 | aryansingla45/flask-llm-ci-cd | The app allows users to upload files, which are stored in a dedicated... | Experimental |
| 150 | MrTechyWorker/SmartLLM-Server | Implementing a robust client-server architecture from scratch, designed to... | Experimental |