jasonacox/TinyLLM

Setup and run a local LLM and Chatbot using consumer grade hardware.

Score: 49 / 100 (Emerging)

Supports multiple inference backends (Ollama, vLLM, llama-cpp-python) with OpenAI API compatibility, enabling flexible deployment across different hardware constraints. The chatbot layer adds RAG capabilities, including URL summarization, news aggregation, stock and weather lookups, and vector-database integration for knowledge retrieval. The architecture uses containerized services with persistent model caching and multi-session support, with a FastAPI frontend querying the inference backend.
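Because all three backends expose an OpenAI-compatible API, a client only needs to build a standard chat-completions payload and point it at the local endpoint. The sketch below shows that request shape; the base URL and model name are placeholders, not values taken from the project, so substitute whatever your backend actually serves.

```python
import json

# Minimal sketch of a chat request against an OpenAI-compatible local
# backend (Ollama, vLLM, or llama-cpp-python, as described above).
# BASE_URL and the model id are assumptions -- adjust for your setup.
BASE_URL = "http://localhost:8000/v1"  # assumed local endpoint

payload = {
    "model": "local-model",  # placeholder model id
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the latest headlines."},
    ],
    "temperature": 0.7,
}

# POST this body to f"{BASE_URL}/chat/completions" with any OpenAI client
body = json.dumps(payload)
print(body[:60])
```

The same payload works unchanged across backends, which is what lets the chatbot layer swap inference engines without code changes.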


No package; no dependents.

Maintenance: 6 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 17 / 25


Stars: 319
Forks: 37
Language: JavaScript
License: MIT
Last pushed: Nov 23, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/rag/jasonacox/TinyLLM"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
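The same endpoint can be called from code. The helper below builds the request for any owner/repo pair; the `Authorization: Bearer <key>` header is an assumption about the auth scheme, not something stated above, so verify it against the API's documentation before relying on it.

```python
import urllib.request

# Base URL taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/rag"

def build_request(owner, repo, api_key=None):
    """Build a GET request for a repo's quality data.

    api_key is optional; the bearer-token header used here is an
    assumed auth scheme for the 1,000/day keyed tier.
    """
    url = f"{BASE}/{owner}/{repo}"
    headers = {"Accept": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(url, headers=headers)

# Construct (but do not send) the request from the curl example:
req = build_request("jasonacox", "TinyLLM")
print(req.full_url)
```

Pass the request to `urllib.request.urlopen(req)` to actually fetch the JSON payload.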