roli-lpci/little-canary

Sacrificial LLM instances as behavioral probes for prompt injection detection

/ 100

Emerging

Combines structural pattern detection (regex + encoding-aware decode checks) with behavioral analysis via a small sacrificial LLM probe that runs at temperature=0 to detect compromise signals like persona adoption or instruction leakage. Supports both local Ollama models and OpenAI-compatible cloud APIs (MiniMax, Groq, Together, etc.), offering three detection modes (block, advisory, full) that integrate directly into existing LLM apps with ~250ms latency overhead and optional LLM-based verdict classification.

Available on PyPI.

Maintenance 13 / 25

Adoption 10 / 25

Maturity 18 / 25

Community 0 / 25

How are scores calculated?

Stars

Forks

—

Language

Python

License

Apache-2.0

Featured in

Agent Governance in 2026: Who's Building the Guardrails?

Higher-rated alternatives

Nebulock-Inc/agentic-threat-hunting-framework

ATHF is a framework for agentic threat hunting - building systems that can remember, learn, and...

AgentSeal/agentseal

Security toolkit for AI agents. Scan your machine for dangerous skills and MCP configs, monitor...

cosai-oasis/secure-ai-tooling

The CoSAI Risk Map is a framework for identifying, analyzing, and mitigating security risks in...

HeadyZhang/agent-audit

Static security scanner for LLM agents — prompt injection, MCP config auditing, taint analysis....

oasm-platform/open-asm

Open-source platform for cybersecurity Attack Surface Management (OASM).

Explore AI Agents

All categories Trending AI Agent directory Insights