Maximilian-Winter/llama-cpp-agent
The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). It lets users chat with LLMs, execute structured function calls, and get structured output, and it also works with models not fine-tuned for JSON output or function calling.
Leverages guided sampling with JSON schema grammars to constrain model outputs, enabling function calling and structured output even on models not fine-tuned for these tasks. Integrates with multiple inference backends including llama.cpp, TGI, and vLLM servers, and supports agentic workflows through conversational, sequential, and mapping chain patterns with tool integration from Pydantic, llama-index, and OpenAI schemas.
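The core idea behind the schema-constrained output described above can be sketched in plain Python: a JSON schema defines the shape a function call must take, and the model's raw text is parsed and checked against it. This is a minimal illustrative sketch, not llama-cpp-agent's actual API; the schema and the `validate_call` helper are hypothetical, and real grammar-guided sampling constrains tokens during decoding rather than validating after the fact.

```python
import json

# Hypothetical JSON schema for a single function call -- the kind of
# structure a grammar-constrained decoder would force the model to emit.
CALL_SCHEMA = {
    "type": "object",
    "required": ["function", "arguments"],
    "properties": {
        "function": {"type": "string"},
        "arguments": {"type": "object"},
    },
}

def validate_call(raw_text: str, schema: dict = CALL_SCHEMA) -> dict:
    """Parse model output and apply minimal checks against the schema."""
    data = json.loads(raw_text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key in schema["required"]:
        if key not in data:
            raise ValueError(f"missing required key: {key}")
    if not isinstance(data["function"], str):
        raise ValueError("'function' must be a string")
    if not isinstance(data["arguments"], dict):
        raise ValueError("'arguments' must be an object")
    return data

# Example: the kind of output a constrained model might produce.
call = validate_call('{"function": "get_weather", "arguments": {"city": "Berlin"}}')
print(call["function"])  # get_weather
```

In the real framework, the schema is compiled into a grammar that the sampler follows token by token, so invalid JSON is never generated in the first place.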
620 stars and 8,620 monthly downloads. Actively maintained with 1 commit in the last 30 days. Available on PyPI.
Stars: 620
Forks: 69
Language: Python
License: —
Category:
Last pushed: Mar 09, 2026
Monthly downloads: 8,620
Commits (30d): 1
Dependencies: 5
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Maximilian-Winter/llama-cpp-agent"
Open to everyone — 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
Related tools
mozilla-ai/any-llm
Communicate with an LLM provider using a single interface
ShishirPatil/gorilla
Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
CliDyn/climsight
A next-generation climate information system that uses large language models (LLMs) alongside...
rizerphe/local-llm-function-calling
A tool for generating function arguments and choosing what function to call with local LLMs
day50-dev/llcat
cat for LLMs