Local LLM Deployment Transformer Models

Tools and resources for running, hosting, and serving open-source LLMs locally or on private infrastructure without cloud dependencies. Includes deployment platforms, free API gateways, optimization guides, and access control for self-hosted models. Does NOT include model training, fine-tuning frameworks, or cloud-based LLM services.

245 local LLM deployment projects are tracked. 4 score 70 or above (Verified tier). The highest-rated is withcatai/node-llama-cpp at 79/100 with 1,942 stars and 4,219,393 monthly downloads. 9 of the top 10 are actively maintained.

Get all 245 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=local-llm-deployment&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
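For scripted access, the same request can be made from Python using only the standard library. This is a minimal sketch, not an official client: the endpoint and the three query parameters are copied from the curl command above, while the function names `build_url` and `fetch_projects` are hypothetical. The response schema is not described on this page, so the decoded JSON is returned without assuming any field names. Whether the endpoint accepts a `limit` larger than 20 is also an assumption.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="transformers", subcategory="local-llm-deployment", limit=20):
    """Build the dataset URL with the same query parameters as the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=30):
    """Fetch and decode the JSON body (hypothetical helper).

    The response schema is not documented here, so the decoded value is
    returned as-is. Keyless access is limited to 100 requests/day.
    """
    with urllib.request.urlopen(build_url(limit=limit), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Print the URL only; call fetch_projects() to make the actual request.
    print(build_url())
```

Calling `fetch_projects()` performs one request against the daily quota, so caching the result locally is a reasonable choice when iterating on a script.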

#	Model	Score	Tier	Stars	Language
1	withcatai/node-llama-cpp Run AI models locally on your machine with node.js bindings for llama.cpp....	79	Verified	1,942	TypeScript
2	ludwig-ai/ludwig Low-code framework for building custom LLMs, neural networks, and other AI models	75	Verified	11,657	Python
3	bentoml/OpenLLM Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible...	74	Verified	12,161	Python
4	mudler/LocalAI 🤖 The free, Open Source alternative to OpenAI, Claude and others....	70	Verified	43,530	Go
5	SciSharp/LLamaSharp A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.	68	Established	3,572	C#
6	zhudotexe/kani kani (カニ) is a highly hackable microframework for tool-calling language...	67	Established	599	Python
7	mostlygeek/llama-swap Reliable model swapping for any local OpenAI/Anthropic compatible server -...	65	Established	2,775	Go
8	Michael-A-Kuykendall/shimmy ⚡ Python-free Rust inference server — OpenAI-API compatible. GGUF +...	60	Established	3,793	Rust
9	UbiquitousLearning/mllm Fast Multimodal LLM on Mobile Devices	60	Established	1,429	C++
10	kaito-project/aikit 🏗️ Fine-tune, build, and deploy open-source LLMs easily!	60	Established	512	Go
11	mybigday/llama.rn React Native binding of llama.cpp	60	Established	851	C++
12	cheahjs/free-llm-api-resources A list of free LLM inference resources accessible via API.	54	Established	15,475	Python
13	sgl-project/ome Open Model Engine (OME) — Kubernetes operator for LLM serving, GPU...	53	Established	393	Go
14	floneum/floneum Instant, controllable, local pre-trained AI models in Rust	53	Established	2,153	Rust
15	Mobile-Artificial-Intelligence/llama_sdk lcpp is a dart implementation of llama.cpp used by the mobile artificial...	52	Established	115	C++
16	tattn/LocalLLMClient Swift package to run local LLMs on iOS, macOS, Linux	50	Established	168	Swift
17	Strvm/meta-ai-api Llama 3 API 70B & 405B (MetaAI Reverse Engineered)	50	Established	396	Python
18	mukel/llama3.java Practical Llama 3 inference in Java	49	Emerging	800	Java
19	guinmoon/LLMFarm llama and other large language models on iOS and MacOS offline using GGML library.	48	Emerging	1,994	C
20	mirpo/fastapi-gen Build LLM-enabled FastAPI applications without build configuration.	48	Emerging	11	Python
21	belladoreai/llama3-tokenizer-js JS tokenizer for LLaMA 3 and LLaMA 3.1	48	Emerging	117	JavaScript
22	nekomeowww/ollama-operator 🚢 Yet another operator for running large language models on Kubernetes with...	48	Emerging	234	Go
23	guinmoon/llmfarm_core.swift Swift library to work with llama and other large language models.	47	Emerging	278	C++
24	tairov/llama2.mojo Inference Llama 2 in one file of pure 🔥	47	Emerging	2,119	Mojo
25	mfoud444/ollamafreeapi OllamaFreeAPI: Free Distributed API for Ollama LLMs Public gateway to our...	47	Emerging	101	Python
26	tjake/Jlama Jlama is a modern LLM inference engine for Java	46	Emerging	1,259	Java
27	BeRo1985/pasllm PasLLM - LLM inference engine in Object Pascal (synced from my private work...	46	Emerging	76	Pascal
28	local-ai-zone/local-ai-zone.github.io Discover the Best AI Models for Your PC	45	Emerging	20	HTML
29	yoshoku/llama_cpp.rb llama_cpp.rb provides Ruby bindings for llama.cpp	45	Emerging	232	C
30	camenduru/text-generation-webui-colab A colab gradio web UI for running Large Language Models	43	Emerging	2,093	Jupyter Notebook
31	sammcj/ingest Parse files (e.g. code repos) and websites to clipboard or a file for...	43	Emerging	367	Go
32	LM-Kit/lm-kit-net-samples .NET samples for LM-Kit.NET	43	Emerging	38	C#
33	nova-land/gbnf-compiler Plug n Play GBNF Compiler for llama.cpp	42	Emerging	28	Python
34	ngxson/wllama WebAssembly binding for llama.cpp - Enabling on-browser LLM inference	42	Emerging	1,013	TypeScript
35	fboulnois/llama-cpp-docker Run llama.cpp in a GPU accelerated Docker container	41	Emerging	63	Dockerfile
36	jmont-dev/ollama-hpp Modern, Header-only C++ bindings for the Ollama API.	41	Emerging	213	C++
37	hybridgroup/yzma Go with your own intelligence - Go applications that directly integrate...	41	Emerging	350	Go
38	soulteary/docker-llama2-chat Play LLaMA2 (official / 中文版 / INT4 / llama2.cpp) Together! ONLY 3 STEPS! (...	40	Emerging	538	Python
39	withcaer/curtana Simplified zero-cost wrapper over llama.cpp powered by the llama-cpp-2 crate.	39	Emerging	2	Rust
40	Archimedes1618/Madlab Madlab is an advanced AI development studio designed to streamline the...	39	Emerging	11	TypeScript
41	cocktailpeanut/dalai The simplest way to run LLaMA on your local machine	38	Emerging	12,980	CSS
42	sobelio/llm-chain `llm-chain` is a powerful rust crate for building chains in large language...	38	Emerging	1,593	Rust
43	donderom/llm4s Scala 3 bindings for llama.cpp 🦙	38	Emerging	65	Scala
44	absadiki/pyllamacpp Python bindings for llama.cpp	37	Emerging	68	C++
45	iaalm/llama-api-server An OpenAI API compatible REST server for llama.	37	Emerging	209	Python
46	mdrokz/rust-llama.cpp LLama.cpp rust bindings	37	Emerging	416	Rust
47	loong64/llama.cpp LLM inference in C/C++	37	Emerging	3	—
48	openjlc/riscv64-library Some of the libraries (docs) on the RISCV64 architecture are easy for users...	36	Emerging	69	—
49	gitctrlx/llama.go Llama from scratch in Go.	36	Emerging	104	Go
50	nerve-sparks/iris_android IRIS is an android app for interfacing with GGUF / llama.cpp models locally.	36	Emerging	267	Kotlin
51	nuhmanpk/quick-llama Run Ollama models on Google Colab	35	Emerging	4	Python
52	LLukas22/llm-rs-python Unofficial python bindings for the rust llm library. 🐍❤️🦀	35	Emerging	76	Python
53	gotzmann/llama.go llama.go is like llama.cpp in pure Golang!	35	Emerging	1,398	Go
54	diogok/llama.cpp.zig A build.zig for llama.cpp	35	Emerging	1	Zig
55	loong64/ollama Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other...	35	Emerging	9	Dockerfile
56	KolosalAI/kolosal-server Kolosal AI is an OpenSource and Lightweight alternative to Ollama to run...	35	Emerging	13	C++
57	mybigday/llama.node Node.js binding of llama.cpp	35	Emerging	19	C++
58	phronmophobic/llama.clj Run LLMs locally. A clojure wrapper for llama.cpp.	35	Emerging	173	Clojure
59	developer239/llama.cpp-ts llama.cpp 🦙 LLM inference in TypeScript	35	Emerging	3	C++
60	KolosalAI/kolosal-cli Super lightweight Ollama + Qwen Code alternative to run Llama 3.3,...	34	Emerging	466	TypeScript
61	fardjad/node-llmatic Use self-hosted LLMs with an OpenAI compatible API	34	Emerging	64	TypeScript
62	eugenehp/bitnet-cpp-rs Rust bindings for bitnet.cpp based on llama-cpp-4	34	Emerging	15	Rust
63	KolosalAI/Kolosal Kolosal AI is an OpenSource and Lightweight alternative to LM Studio to run...	34	Emerging	440	C++
64	BodhiSearch/BodhiApp Run Open Source/Open Weight LLMs locally with OpenAI compatible APIs	34	Emerging	132	Rust
65	trrahul/llama2.cs Inference Llama 2 in one file of pure C#	33	Emerging	102	C#
66	iverly/llamafile-docker Distribute and run llamafile/LLMs with a single docker image.	33	Emerging	74	Dockerfile
67	cgbur/llama2.zig Inference Llama 2 in one file of pure Zig	33	Emerging	211	Zig
68	dirmacs/lancor A Rust client library for llama.cpp's OpenAI-compatible API server	32	Emerging	2	Rust
69	hpretila/llama.net .NET wrapper for LLaMA.cpp for LLaMA language model inference on CPU. 🦙	32	Emerging	58	C#
70	belladoreai/llama-tokenizer-js JS tokenizer for LLaMA 1 and 2	32	Emerging	363	JavaScript
71	anthonyfoust/ai-stack-homelab Complete AI automation stack optimized for Mac Mini M4, but can work in...	32	Emerging	7	Shell
72	mdegans/drama_llama Yet another `llama.cpp` Rust wrapper	32	Emerging	12	Rust
73	amin-tehrani/ollama-colab Serve Ollama LLMs on Google Colab (free plan) using Ngrok	31	Emerging	26	Jupyter Notebook
74	jaco-bro/MLX.zig MLX.zig: Phi-4, Llama 3.2, and Whisper in Zig	31	Emerging	33	Zig
75	Kagamma/llama-pas Free Pascal bindings for llama.cpp	31	Emerging	23	Pascal
76	Thrasher-Software/sigil A local-first LLM development studio. Build, test, and customize inference...	30	Emerging	17	CSS
77	Agora-Lab-AI/Atom a suite of finetuned LLMs for atomically precise function calling 🧪	30	Emerging	17	Python
78	SeungyounShin/Llama2-Code-Interpreter Make Llama2 use Code Execution, Debug, Save Code, Reuse it, Access to Internet	30	Emerging	685	Python
79	FlatlinerDOA/PerceptivePyro Run and train Transformer based Large Language Models (LLMS) natively in...	30	Emerging	24	C#
80	adalkiran/llama-nuts-and-bolts A holistic way of understanding how Llama and its components run in...	30	Emerging	317	Go
81	openshieldai/openshield OpenShield is a new generation security layer for AI models	30	Emerging	84	Go
82	K024/llm-sharp Language models in C#	30	Emerging	50	C#
83	abhisheknair10/llama3.cu Lightweight Llama 3 8B Inference Engine in CUDA C	29	Experimental	54	Cuda
84	dravenk/ollama-zig Ollama Zig library	29	Experimental	35	Zig
85	trzy/llava-cpp-server LLaVA server (llama.cpp).	29	Experimental	183	C++
86	dev-sufyaan/Nexlify Unified API platform for free access to enterprise-grade AI models from...	29	Experimental	13	Python
87	Aloereed/llama.cpp-server-ohos Llama.cpp server for OpenHarmony	29	Experimental	9	C++
88	c0sogi/llama-api An OpenAI-like LLaMA inference API	29	Experimental	113	Python
89	sashazykov/json-repair-rb A simple Ruby gem designed to repair broken JSON strings	29	Experimental	10	Ruby
90	lrusso/llama3pure Three inference engines for Llama 3: pure C for desktop systems, pure...	29	Experimental	21	HTML
91	nikolaydubina/llama2.go LLaMA-2 in native Go	28	Experimental	194	Go
92	hoof-ai/hoof "Just hoof it!" - A spotlight like interface to Ollama	28	Experimental	63	Rust
93	saddam213/LLamaStack ASP.NET Core Web, WebApi & WPF implementations for LLama.cpp & LLamaSharp	28	Experimental	60	C#
94	leftmove/cria Run LLMs locally with as little friction as possible.	28	Experimental	121	Python
95	andreiramani/jadi4llamacpp Just another drop in for llama.cpp	28	Experimental	1	—
96	OneInterface/realtime-bakllava llama.cpp with BakLLaVA model describes what it sees	27	Experimental	379	Python
97	fermyon/ai-examples A collection of serverless apps that show how Fermyon's Serverless AI...	27	Experimental	50	Rust
98	chelsea0x3b/llama-dfdx LLaMa 7b with CUDA acceleration implemented in rust. Minimal GPU memory needed!	27	Experimental	111	Rust
99	yfedoseev/llmkit Production-grade LLM client - Rust, Python, TypeScript. 100+ providers,...	27	Experimental	12	Rust
100	moritztng/fltr Like grep but for natural language questions. Based on Mistral 7B or Mixtral 8x7B.	27	Experimental	387	Rust
101	maifeeulasad/LocalLLaMA 📚 LocalLLaMA Archive — Community-powered static archive for r/LocalLLaMA	27	Experimental	8	TypeScript
102	5aharsh/collama Run Ollama LLM models in Google Colab for free	27	Experimental	38	Jupyter Notebook
103	daskol/llama.py Python bindings to llama.cpp	26	Experimental	27	C
104	zTgx/llmweb-rs Webpage to structured data in Rust & LLM	26	Experimental	16	Rust
105	zerob13/modelinfo-cli A CLI to query AI model capabilities, context limits, and pricing from...	26	Experimental	8	TypeScript
106	AlenVelocity/langchain-llama Run LLAMA LLMs in Node with Langchain	26	Experimental	39	TypeScript
107	zatevakhin/obsidian-local-llm Obsidian Local LLM is a plugin for Obsidian that provides access to a...	26	Experimental	135	TypeScript
108	Uralstech/vid-orca Deploy LLaMA-2 Chat on Google Cloud.	26	Experimental	4	Python
109	cgjosephlee/ollama-save-load Save and load ollama models just like operating docker images.	26	Experimental	26	Python
110	benct/kotlin-cheat-sheet ⭐ Kotlin <3 Cheat Sheet, Collection Extension Functions and General Examples	26	Experimental	7	—
111	johnsutor/llama-jarvis Turn any LLM into Jarvis	25	Experimental	6	Python
112	didier-durand/llms-in-clouds Experiments with LLMs in clouds (powered by SGLang)	25	Experimental	6	Python
113	kassane/ollama-d D bindings for the Ollama API	25	Experimental	3	D
114	codewithdark-git/llama-3-Hackathon LLaMA Genius is an AI-powered research assistant designed to help users...	25	Experimental	1	Python
115	khiwniti/kaggle-llm-api 🤖 Comprehensive solution for running Ollama/vLLM API servers in Kaggle...	25	Experimental	2	Python
116	AI-Robotic-Labs/Self-Sovereign-AI-SDK SDK for Self Sovereign AI	25	Experimental	3	Rust
117	BerkeliumLabs/Berkelium-labs Your personal AI Lab, accessible everywhere! Explore, experiment, and...	25	Experimental	2	TypeScript
118	alvion427/PerroPastor Run Llama based LLMs in Unity entirely in compute shaders with no dependencies	25	Experimental	106	C#
119	rabilrbl/llamafile-builder A simple github actions script to build a llamafile and uploads to huggingface	25	Experimental	17	Python
120	cvedix/omnisdk On-device AI developer platform	24	Experimental	2	C++
121	excorsistvoid/Neuro-Bridge 🔗 Enable seamless hardware access on Android with Neuro-Bridge, a...	24	Experimental	2	Rust
122	RahulSChand/llama2.c-for-dummies Step by step explanation/tutorial of llama2.c	24	Experimental	225	C
123	avatsaev/av-local-llm-api Allows to easily run local REST API with a custom LLM, running locally or...	24	Experimental	4	Python
124	makllama/makllama MaK(Mac+Kubernetes)llama - Running and orchestrating large language models...	24	Experimental	45	Go
125	Brazilian-willametteriver232/llama.swift 🚀 Access llama.cpp easily in your Swift projects, leveraging precompiled...	24	Experimental	2	Swift
126	ksylvest/omniai-llama An implementation of the OmniAI interface for Llama.	23	Experimental	1	Ruby
127	frinknet/gelli Containerized LLM for any use-case big or small	23	Experimental	1	Shell
128	RichardHam-co-uk/ProjectLodestar AI development environment with 90% cost savings. Routes between 8 LLM...	23	Experimental	1	Python
129	PCfVW/plip-rs Mechanistic interpretability toolkit for code LLMs, in Rust. Analysis of...	23	Experimental	8	Rust
130	seanpm2001/DALL-E_LLaMA_Docs 🤖️🦙️🧠️📖️ The official documentation source repository for DALL-E LLaMA, a...	23	Experimental	2	Markdown
131	kurnevsky/llama-cpp.el A client for llama-cpp server	23	Experimental	28	Emacs Lisp
132	seanpm2001/DALL-E_LLaMA 🤖️🦙️🧠️ DALL-E LLaMA is a combination of DALL-E and LLaMA (Large Language...	23	Experimental	2	Python
133	tunib-ai/joker AI model designed to test the effectiveness in handling external ethical attacks.	23	Experimental	11	Python
134	luiscavallcante859/collectiv-ai-sdk 🌐 Build and integrate with the CollectiVAI Router using official SDKs for...	22	Experimental	—	Python
135	veerapatel/llm.nexus 🌐 Streamline integration with various LLM providers using LLM.Nexus, a .NET...	22	Experimental	—	—
136	UgurkanTech/ArchNetAI ArchNetAI is a Python library that leverages the Ollama API for generating...	22	Experimental	3	Python
137	Adriankhl/godot-llm-template Godot LLM Template/Demo	22	Experimental	32	GDScript
138	estrify/ProjectLodestar 🌟 Optimize AI development with Lodestar by smartly routing between free...	22	Experimental	—	Python
139	nininau/awesome-llm-services 🔍 Discover 106+ open-source LLM services and tools for AI, ideal for local...	22	Experimental	—	TypeScript
140	ahmedmagood/cpu-slm 🖥️ Explore CPU-SLM, a Rust-based SLM/LLM project that runs on CPU, offering...	22	Experimental	—	Rust
141	whyisitworking/llama-bro On-device LLM inference SDK for Android, powered by llama.cpp. Run GGUF...	22	Experimental	—	Kotlin
142	ferranpons/Llamatik-Server Remote inference backend implementing the same API as the Llamatik library...	22	Experimental	—	Kotlin
143	NeoZel/huatuo 🔍 Enhance your cloud-native observability with HUATUO, using eBPF for deep...	22	Experimental	—	C
144	ns408/local-ai-setup Run modern AI models on older laptops - optimized for 2nd-gen Intel hardware	22	Experimental	—	Shell
145	qxoticai/qxotic AI engine for the JVM	22	Experimental	—	Java
146	wk-y/rama-swap ramalama-based model swapping server	22	Experimental	—	Go
147	blackboxprogramming/ai-chain AI Chain — Distributed multi-node LLM inference with automatic failover....	22	Experimental	—	Python
148	fuglede/llama.ttf A font for writing tiny stories	21	Experimental	319	Rust
149	ariannamethod/yent.yo diffusion AI with a bad character	21	Experimental	2	Go
150	hurui200320/llama-cpp-kt The Kotlin wrapper of llama.cpp, powered by JNA	20	Experimental	13	Kotlin
151	Root1V/axonium-sdk A production-grade Python SDK for llama-server that streamlines...	20	Experimental	1	Python
152	anglerfishlyy/llm-watch-grafana AI observability Grafana plugin tracking real-time LLM metrics — latency,...	20	Experimental	1	JavaScript
153	fbaldassarri/llama-cpp-container Docker image to deploy a llama-cpp container with conda-ready environments	20	Experimental	17	Dockerfile
154	bkataru/chatllm.zig Zig wrapper for chatllm.cpp - LLM inference with 70+ model architectures	20	Experimental	1	Zig
155	Stoksweet/modlable A platform for building, training and running inference on TensorflowJS...	20	Experimental	1	TypeScript
156	haormj/llama2.go Inference Llama 2 in one file of pure go	20	Experimental	16	Go
157	LastBotInc/llama2j Pure Java Llama2 inference with optional multi-GPU CUDA implementation	20	Experimental	13	Java
158	chromejaw/free-llm-api A list of free LLM inference resources accessible via API.	19	Experimental	—	Python
159	lwch/llama2.go Port of Facebook's LLaMA 2 model in pure go and use little memory	19	Experimental	36	Go
160	leaxer-ai/leaxer-llama Pre-built llama.cpp binaries for Leaxer	19	Experimental	—	—
161	Komdosh/kLLaMa-jvm Simple example of using llama.cpp with kotlin (JVM)	19	Experimental	—	C++
162	revengerrr/LedgerCOBOL A COBOL banking system with AI integration. Built to learn how legacy code...	19	Experimental	—	COBOL
163	EZForever/llama.cpp-static Static builds of llama.cpp (Currently only amd64 server builds are available)	19	Experimental	—	Dockerfile
164	tokenrouter/tokenrouter-python Official Python SDK for TokenRouter - an intelligent LLM routing service...	19	Experimental	—	Python
165	invergent-ai/surogate-website Website for surogate.ai	19	Experimental	—	JavaScript
166	instavm/llm-token-visualizer See How Big Exactly A 128k Token Text Is	18	Experimental	4	TypeScript
167	mhajder/llama.cpp-updater A shell script to automatically update or build llama.cpp with optimal GPU...	18	Experimental	3	Shell
168	Andrew2077/Alpaca Simple Q/A app, where I created a UI for alpaca (fine tuned LLAMA) model...	17	Experimental	4	Jupyter Notebook
169	LlamaGenAI/llamagenai-openapi LlamaGen.Ai REST API, LlamaGen is AI Comic Factory - Generate Comics with...	17	Experimental	5	—
170	sc0v0ne/udemy_course_mastering_ollama_build_private_local_llm_apps_with_python Udemy Course Mastering Ollama Build Private Local LLM Apps with Python	17	Experimental	3	Python
171	lenticularis39/llama2.inferno Inference Llama 2 in one file of pure Limbo	17	Experimental	2	Limbo
172	waqasm86/Ubuntu-Cuda-Llama.cpp-Executable Pre-built llama.cpp CUDA binary for Ubuntu 22.04. No compilation required -...	16	Experimental	1	Python
173	Gaolingx/llama.cpp-Launcher run llama.cpp quickly and conveniently.	16	Experimental	1	Batchfile
174	lennor-tan/openrouter-free-model 🌐 Explore and manage free models on OpenRouter effortlessly with our web...	16	Experimental	—	TypeScript
175	entelecheia/llama-factory-container Container for LLaMA-Factory	16	Experimental	—	Shell
176	GP-Silah/silah-ai Powering Silah's smart features!	15	Experimental	—	Python
177	KolosalAI/kolosal-desktop Kolosal AI is an OpenSource and Lightweight alternative to LM Studio to run...	15	Experimental	5	Svelte
178	mrtrizer/UnityLlamaCpp Llama.cpp in Unity, straightforward and clean	15	Experimental	19	C#
179	austinweis/alpaca.cpp-gui GUI for GGML Alpaca models	15	Experimental	2	HTML
180	ChristianHohlfeld/ollama-local-docker Ollama Local Docker - A simple Docker-based setup for running Ollama's API...	15	Experimental	2	HTML
181	harpertoken/memoraxx LLaMA-style models with memory persistence.	15	Experimental	—	—
182	sak96/rust_llama_app Chat bot (llama) written in rust using Yew and Tauri.	14	Experimental	1	Rust
183	jihadkhawaja/Llama.Grammar GBNF converter for llama.cpp Grammar directly from C# types	14	Experimental	3	C#
184	juansalnac/API-mega-list 🌐 Discover a comprehensive collection of APIs to enhance your projects and...	14	Experimental	—	JavaScript
185	nathanborror/swift-llama A Swift client library for interacting with Meta's Llama API.	14	Experimental	4	Swift
186	secret-ai-labs/awesome-local-llm Your complete guide to running powerful AI models locally in 2025. Covers...	14	Experimental	4	—
187	nerdsupremacist/LlamaLang Repository for the Llama Programming Language. Work In Progress	14	Experimental	11	Python
188	seehiong/micronaut-llama3 A high-performance Llama3 implementation using Micronaut and GraalVM Native Image	14	Experimental	31	Java
189	aratan/ApiCloudLLaMA The idea is to make an api that everyone can consume in their GPT4-like...	14	Experimental	13	Go
190	SanMog/Uroboros Automated red-teaming framework for LLMs. Tests GPT-4o, Claude, Llama...	14	Experimental	—	Python
191	botosadam/matryoshka 🚀 Build Ruby gems that utilize Rust for enhanced performance through two...	14	Experimental	—	—
192	Atsusheeesh/vllm-daily 📊 Summarize merged PRs daily with vLLM, ensuring you stay updated on key...	14	Experimental	—	—
193	miga1999/AirClaw Run OpenClaw locally on any GPU or CPU without API costs, supporting large...	14	Experimental	—	Shell
194	codewithosama03/openrouter-free-model 🌐 Explore and manage free models on OpenRouter with this web app, featuring...	14	Experimental	—	TypeScript
195	nherx/free-llm-api-resources 🤖 Discover free API access and credits for various legitimate large language...	14	Experimental	—	Python
196	xxxbf0222/LlamaDeck A command-line tool for quickly managing and experimenting with multiple...	13	Experimental	5	Python
197	llamajs/llama A dynamic logger for the dynamic developer	13	Experimental	5	TypeScript
198	tbogdala/woolyrust A high-level Rust wrapper around llama.cpp for text generation AI with LLMs.	13	Experimental	7	Rust
199	unaidedelf8777/faster-outlines A Lazy, high throughput and blazing fast structured text generation backend.	13	Experimental	5	Rust
200	CameLLM/CameLLM Run your favourite LLMs locally on macOS from Swift	13	Experimental	82	Swift
201	tbogdala/woolycore The core wrapper around llama.cpp in C to provide an easy surface to build...	13	Experimental	5	C++
202	MaoJianwei/llama.cpp-arm-armv7l-Raspberry-Pi-Release-Prebuild On the Releases page, you can download pre-built binaries for arm, armv7l...	13	Experimental	2	—
203	yasir13001/MoonAI_API This MoonAI API service built with FastAPI that calculates and provides...	13	Experimental	—	Python
204	TimeSurgeLabs/promptproxy Call many AIs from a single API.	12	Experimental	3	Go
205	themaximalist/ModelDeployer API Proxy for AI models, rate limiting, management and more!	12	Experimental	4	CSS
206	JinHanLei/LLM-Stream-Service Streaming API and Web page for Large Language Models (Llama3) based on...	12	Experimental	3	Python
207	iakashpaul/Ghudsavar Ghudsavar (Horse rider) - Is a quick llama.cpp server for CPU only runtimes	12	Experimental	3	Dockerfile
208	mkashirin/splinter Splinter (Sequence Processing Language Interpreter) is a tree-walking...	12	Experimental	1	Zig
209	Root1V/llm-security JWT-based authentication and authorization gateway for locally deployed LLM...	12	Experimental	1	Python
210	gyanaranjans/llma-rust A simple webapp to showcase the ability to write a simple chatbot webapp...	12	Experimental	3	Rust
211	kashan-alam/ai-backend-fastapi AI-powered backend API built with FastAPI, JWT authentication, rate...	12	Experimental	1	Python
212	Jshulgach/NeuroBridge NeuroBridge: Where AI perception meets real-time robotics control	12	Experimental	1	Python
213	lufixSch/auto_llama Supercharge your local LLM	12	Experimental	4	Python
214	antononcube/Raku-WWW-LLaMA Raku package that provides access to the algorithms/models of (the...	12	Experimental	1	Raku
215	m9m9ra/llama.swiftui It's my playground to test mokpell llama swift lib	11	Experimental	—	C
216	numq/text-generation JVM library for text generation, written in Kotlin and based on the C++...	11	Experimental	—	Kotlin
217	lenML/llama2-tokenizer.js llama2 tokenizer for javascript	11	Experimental	2	TypeScript
218	tbogdala/ai_notepad A lightweight Rust application to test interaction with large language...	11	Experimental	2	Rust
219	JavaLLM/llama4j An easy-to-use Java SDK for running LLaMA models on edge devices, powered by...	11	Experimental	23	Java
220	yachty66/aicomputer Open source DIY AI computing platform: Build a powerful RTX 3090 GPU rig...	11	Experimental	6	Jupyter Notebook
221	coderonion/awesome-mojo-max-mlir A collection of some awesome public MAX platform, Mojo programming language...	11	Experimental	41	—
222	pantaleone-ai/private-ai-stack Deploy a complete, self-hosted AI stack for private LLMs, agentic workflows,...	11	Experimental	5	—
223	zTgx/llama.rust LLM inference in Rust	11	Experimental	—	Rust
224	niansa/libjustlm Super easy to use library for doing LLaMA/GPT-J stuff! - Mirror of:...	11	Experimental	2	C++
225	eccenca/llama-index-cmem llama-index tools eccenca Corporate Memory Integration	11	Experimental	—	Python
226	georon/llama_test_proj Skeleton project to run and test Llama and Chromadb locally on a gaming...	11	Experimental	—	Python
227	diogok/llamautils Some python utilities for running llama.cpp on linux	11	Experimental	—	Python
228	Abdullahali77/AI_Testing_CLI A specialized command-line tool that generates Python unit tests for your...	11	Experimental	—	Python
229	tripolskypetr/agent-tune A React-based tool for constructing fine-tuning datasets with list and grid...	11	Experimental	—	TypeScript
230	NavodPeiris/node_llama run llama models using llamafile and communicate with llama models through...	11	Experimental	—	JavaScript
231	scttfrdmn/genkit-aws AWS plugins for Google's GenKit framework - add AWS Bedrock models and...	11	Experimental	—	Go
232	OnlyF0uR/interactive-ai Rust CLI application for interacting with LLMs for Llama & OpenRouter.	11	Experimental	—	Rust
233	jazibjohar/ai-text-structor A powerful asynchronous framework for orchestrating Large Language Model...	11	Experimental	—	Python
234	pAI-OS/fetch_llama_cpp llama.cpp downloader that selects the latest and best available binaries for...	11	Experimental	—	Python
235	fasuizu-br/brainiall-llm-gateway Brainiall LLM Gateway — 113+ AI models via OpenAI-compatible API. Claude,...	11	Experimental	—	—
236	aruntemme/llamacpp-swap-boilerplate A cross-platform template for running and managing llama-swap with...	11	Experimental	—	Shell
237	NeuralWeights/Llama-Server-AuthKeys Authorization tokens to access llama.cpp server (LM Studio, Ollama, Msty,...	10	Experimental	1	Python
238	updcon/libmisc-clj DKD miscellaneous for Clojure development	10	Experimental	1	Clojure
239	d1pankarmedhi/Phi3-rust Serve Phi3 with Candle and Actix 🦀	10	Experimental	1	Rust
240	Inferra/Inferra-Python-SDK Official Python SDK for Inferra API access	10	Experimental	1	Python
241	asaddi/lv-serve Llama 3.2 Vision OpenAI-like API server	10	Experimental	1	Python
242	Inferra/Inferra-JS-SDK Official JavaScript/TypeScript SDK for Inferra API access	10	Experimental	1	TypeScript
243	0xricksanchez/AIonic AIonic: A unified, user-friendly Rust library for seamless integration with...	10	Experimental	1	Rust
244	3axislabs/llm4j Build Context Aware LLM Apps using Java	10	Experimental	1	—
245	shakfu/llamalib Thin cython, pybind11, and nanobind wrappers around llama.cpp	10	Experimental	1	Cython
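The Tier column in the table above appears to track the score. The following sketch encodes cut-offs inferred from the rows themselves (70 is Verified, 68 and 50 are Established, 49 and 30 are Emerging, 29 is Experimental). These thresholds are an assumption read off the data, not a published scoring rule, and the function name `tier_for_score` is hypothetical.

```python
def tier_for_score(score: int) -> str:
    """Map a 0-100 quality score to a tier label.

    Thresholds are inferred from the table rows, not from an official spec.
    """
    if score >= 70:
        return "Verified"
    if score >= 50:
        return "Established"
    if score >= 30:
        return "Emerging"
    return "Experimental"


if __name__ == "__main__":
    # Spot-check against boundary rows from the table.
    for score in (79, 70, 68, 50, 49, 30, 29, 10):
        print(score, tier_for_score(score))
```

A mapping like this is mainly useful for filtering a downloaded copy of the dataset by tier when only the numeric score is at hand.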

Comparisons in this category

OpenLLM and free-llm-api-resources (74 vs 54)
ludwig and OpenLLM (75 vs 74)
OpenLLM and llama-swap (74 vs 65)
OpenLLM and ome (74 vs 53)
node-llama-cpp and LLamaSharp (79 vs 68)
node-llama-cpp and llama-swap (79 vs 65)
node-llama-cpp and llama_sdk (79 vs 52)