LLM Docker Deployments Transformer Models

There are 11 LLM Docker deployment models tracked. The highest-rated is beehive-lab/GPULlama3.java, which scores 48/100 and has 238 stars.

Get all 11 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=llm-docker-deployments&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
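The same request can be made from a script. The sketch below is a minimal Python example that uses only the standard library. It rebuilds the query string from the curl command above and decodes the response as JSON. The helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the returned JSON is not documented here, so the code returns whatever the endpoint sends without assuming any field names.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="transformers", subcategory="llm-docker-deployments", limit=20):
    """Build the request URL (hypothetical helper, not part of any official client)."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url=None, timeout=30):
    """Fetch the listing and return the decoded JSON.

    The response schema is an unknown here, so the payload is returned as-is.
    """
    request_url = url or build_url()
    with urllib.request.urlopen(request_url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_projects()` performs one request, which counts against the daily limit of 100 keyless requests. Inspect the returned object before relying on any particular key.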

#	Model (repository and description)	Score (out of 100)	Tier	Stars	Language
1	beehive-lab/GPULlama3.java GPU-accelerated Llama3.java inference in pure Java using TornadoVM.	48	Emerging	238	Java
2	gitkaz/mlx_gguf_server This is a FastAPI based LLM server. Load multiple LLM models (MLX or...	43	Emerging	17	Python
3	srgtuszy/llama-cpp-swift Swift bindings for llama-cpp library	37	Emerging	67	Swift
4	RhinoDevel/mt_llm Pure C wrapper library to use llama.cpp with Linux and Windows as simple as...	33	Emerging	14	C++
5	JackZeng0208/llama.cpp-android-tutorial llama.cpp tutorial on Android phone	33	Emerging	155	—
6	dougeeai/llama-cpp-python-wheels Pre-built wheels for llama-cpp-python across platforms and CUDA versions	30	Emerging	40	—
7	awinml/llama-cpp-python-bindings Run fast LLM Inference using Llama.cpp in Python	30	Emerging	19	Jupyter Notebook
8	lennartpollvogt/ollama-instructor Python library for the instruction and reliable validation of structured...	26	Experimental	77	Python
9	nicholasyager/llama-cpp-guidance A guidance compatibility layer for llama-cpp-python	19	Experimental	36	Python
10	thansen0/fastllm.cpp A low latency, fault tolerant API for accessing LLM's written in C++ using llama.cpp.	16	Experimental	11	C++
11	frost-beta/llama2-high-level-cpp Inference Llama2 with High-Level C++.	14	Experimental	11	C