agencyenterprise/PromptInject
PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to adversarial prompt attacks. 🏆 Best Paper Awards @ NeurIPS ML Safety Workshop 2022
Implements mask-based iterative adversarial composition to systematically evaluate two attack vectors: goal hijacking (redirecting model behavior to execute malicious instructions) and prompt leaking (extracting hidden system prompts). The framework enables researchers to quantitatively measure LLM vulnerability by testing handcrafted adversarial inputs against production models like GPT-3, revealing how simple user interactions can exploit stochastic model behavior.
465 stars and 88 monthly downloads. No commits in the last 6 months. Available on PyPI.
Stars
465
Forks
44
Language
Python
License
MIT
Category
Last pushed
Feb 26, 2024
Monthly downloads
88
Commits (30d)
0
Dependencies
4
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/agencyenterprise/PromptInject"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related tools
protectai/llm-guard
The Security Toolkit for LLM Interactions
MaxMLang/pytector
Easy to use LLM Prompt Injection Detection / Detector Python Package with support for local...
utkusen/promptmap
a security scanner for custom LLM applications
Dicklesworthstone/acip
The Advanced Cognitive Inoculation Prompt
Resk-Security/Resk-LLM
Resk is a robust Python library designed to enhance security and manage context when...