agencyenterprise/PromptInject

PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to adversarial prompt attacks. 🏆 Best Paper Awards @ NeurIPS ML Safety Workshop 2022

/ 100

Established

Implements mask-based iterative adversarial composition to systematically evaluate two attack vectors: goal hijacking (redirecting model behavior to execute malicious instructions) and prompt leaking (extracting hidden system prompts). The framework enables researchers to quantitatively measure LLM vulnerability by testing handcrafted adversarial inputs against production models like GPT-3, revealing how simple user interactions can exploit stochastic model behavior.

465 stars and 88 monthly downloads. No commits in the last 6 months. Available on PyPI.

Stale 6m

Maintenance 0 / 25

Adoption 14 / 25

Maturity 25 / 25

Community 16 / 25

How are scores calculated?

Stars

465

Forks

Language

Python

License

MIT

Related tools

protectai/llm-guard

The Security Toolkit for LLM Interactions

MaxMLang/pytector

Easy to use LLM Prompt Injection Detection / Detector Python Package with support for local...

utkusen/promptmap

a security scanner for custom LLM applications

Dicklesworthstone/acip

The Advanced Cognitive Inoculation Prompt

Resk-Security/Resk-LLM

Resk is a robust Python library designed to enhance security and manage context when...

Explore Prompt Engineering Tools

All categories Trending Prompt Engineering directory Insights