whyisitworking/llama-bro
On-device LLM inference SDK for Android, powered by llama.cpp. Run GGUF models directly on Android devices with a clean Kotlin coroutine API — no server, no network, fully privacy-preserving. Built for modern, coroutine-based Android app architectures.
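The listing above describes the SDK but not its API surface. As a hedged sketch only, a coroutine-first wrapper around llama.cpp typically looks something like the following; every name here (`LlamaBro`, `loadModel`, `generate`) is hypothetical and not taken from the llama-bro source:

```kotlin
import kotlinx.coroutines.flow.Flow

// Hypothetical API sketch -- these interfaces are NOT the actual llama-bro
// API; they only illustrate the shape a Kotlin-coroutine inference SDK
// built on llama.cpp commonly takes.
interface LlamaModel : AutoCloseable {
    /** Stream generated tokens as they are decoded on-device. */
    fun generate(prompt: String, maxTokens: Int = 256): Flow<String>
}

interface LlamaBro {
    /**
     * Load a GGUF file from local storage. Suspends while the native
     * layer maps and initializes the model, keeping the main thread free.
     */
    suspend fun loadModel(path: String): LlamaModel
}

// Hypothetical call site inside a ViewModel coroutine:
// val model = llamaBro.loadModel("/data/local/tmp/model.gguf")
// model.generate("Hello").collect { token -> print(token) }
// model.close()
```

The `Flow<String>` return type is the idiomatic Kotlin way to surface token-by-token streaming, which is how llama.cpp-based runtimes usually expose generation.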
Stars: —
Forks: —
Language: Kotlin
License: Apache-2.0
Category: —
Last pushed: Mar 14, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/whyisitworking/llama-bro"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
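The same endpoint can be called from Kotlin with the JDK's built-in HTTP client. The response schema is not documented in this listing, so this sketch just prints the status code and raw JSON body:

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

fun main() {
    // Endpoint taken from the curl example above; no API key needed
    // within the free 100 requests/day tier.
    val url = "https://pt-edge.onrender.com/api/v1/quality/transformers/whyisitworking/llama-bro"
    val request = HttpRequest.newBuilder(URI.create(url)).GET().build()
    val response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())
    println("HTTP ${response.statusCode()}")
    println(response.body()) // raw JSON; schema not specified here
}
```

Parsing the body into typed fields is left out deliberately, since the response format is an unknown here.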
Higher-rated alternatives
withcatai/node-llama-cpp
Run AI models locally on your machine with node.js bindings for llama.cpp. Enforce a JSON schema...
ludwig-ai/ludwig
Low-code framework for building custom LLMs, neural networks, and other AI models
bentoml/OpenLLM
Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud.
zhudotexe/kani
kani (カニ) is a highly hackable microframework for tool-calling language models. (NLP-OSS @ EMNLP 2023)
mudler/LocalAI
The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and...