rb81/prompt-hacking-classifier
A flexible and portable solution that uses a single robust prompt and customized hyperparameters to classify user messages as either malicious or safe, helping to prevent jailbreaking and manipulation of chatbots and other LLM-based solutions.
No commits in the last 6 months.
Stars
16
Forks
1
Language
Jupyter Notebook
License
MIT
Category
Last pushed
Aug 08, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/rb81/prompt-hacking-classifier"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
protectai/llm-guard
The Security Toolkit for LLM Interactions
MaxMLang/pytector
Easy to use LLM Prompt Injection Detection / Detector Python Package with support for local...
agencyenterprise/PromptInject
PromptInject is a framework that assembles prompts in a modular fashion to provide a...
utkusen/promptmap
a security scanner for custom LLM applications
Dicklesworthstone/acip
The Advanced Cognitive Inoculation Prompt