eth-sri/ToolFuzz

ToolFuzz is a fuzzing framework designed to test your LLM Agent tools.

Emerging

Combines LLM-based prompt generation with two specialized testers: `RuntimeErrorTester` detects crashes, and `CorrectnessTester` validates output correctness. Stays framework-agnostic through the abstract `TestingAgentExecutor` and `ToolExtractor` interfaces, which can be implemented to support custom agents and tools. Includes built-in support for LangChain, AutoGen, LlamaIndex, and CrewAI, with results exported as interactive HTML and JSON reports. Uses OpenAI models for fuzzing and evaluation.
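The runtime-error side of this design can be illustrated with a minimal sketch. It is not ToolFuzz's actual API: `SketchExecutor`, `DivisionAgent`, and `find_runtime_failures` are hypothetical names invented for illustration, and the prompts are a fixed list rather than LLM-generated. It only shows the general idea of driving any agent through one abstract interface and recording which inputs make it crash.

```python
from abc import ABC, abstractmethod


class SketchExecutor(ABC):
    """Hypothetical stand-in for an agent-executor interface (not ToolFuzz's real API)."""

    @abstractmethod
    def run(self, prompt: str) -> str:
        """Send one prompt to the agent and return its final answer."""


class DivisionAgent(SketchExecutor):
    """Toy agent whose 'tool' divides 100 by the integer given in the prompt."""

    def run(self, prompt: str) -> str:
        return str(100 / int(prompt))


def find_runtime_failures(executor, prompts):
    """Run every prompt and collect (prompt, exception type name) for each crash."""
    failures = []
    for prompt in prompts:
        try:
            executor.run(prompt)
        except Exception as exc:  # any uncaught tool error counts as a finding
            failures.append((prompt, type(exc).__name__))
    return failures


# Fixed prompts stand in for the LLM-generated ones described above.
failures = find_runtime_failures(DivisionAgent(), ["4", "0", "ten"])
print(failures)
```

Because the loop depends only on the abstract `run` method, a wrapper around a LangChain, AutoGen, LlamaIndex, or CrewAI agent could in principle be swapped in without changing the testing logic.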

No commits in the last 6 months.

Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 7 / 25

Maturity 16 / 25

Community 8 / 25

Language

Python

License

MIT

Featured in

Agent Governance in 2026: Who's Building the Guardrails?

Higher-rated alternatives

petterjuan/agentic-reliability-framework

ARF is an agentic reliability intelligence platform that separates decision intelligence (OSS)...

sarkar-ai-taken/riva

Local-first observability and control plane for AI agents.

Nubaeon/empirica

Make AI agents and AI workflows measurably reliable. Epistemic measurement, Noetic RAG,...

relai-ai/relai-sdk

A platform for building reliable AI agents

soumendrak/ragwatch

An SDK for Python AI Agents. Under heavy development.
