yueliu1999/Awesome-Jailbreak-on-LLMs
Awesome-Jailbreak-on-LLMs is a curated collection of state-of-the-art jailbreak methods for LLMs, gathering papers, code, datasets, evaluations, and analyses.
The list organizes attack and defense methods across multiple threat vectors, targeting reasoning models, black-box/white-box scenarios, multi-turn conversations, RAG systems, and multi-modal inputs, alongside guardrail approaches such as learning-based defenses and guard models. The repository indexes implementation code and datasets next to paper citations, enabling reproducible comparison of attack and defense effectiveness. It covers emerging safety challenges in reasoning-heavy LLMs (o1-style models) and multimodal systems, alongside traditional text-based jailbreaks.
1,245 stars. Actively maintained with 15 commits in the last 30 days.
Stars: 1,245
Forks: 101
Language: —
License: MIT
Category:
Last pushed: Mar 07, 2026
Commits (30d): 15
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/yueliu1999/Awesome-Jailbreak-on-LLMs"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Related tools
wuyoscar/ISC-Bench
Internal Safety Collapse: Turning LLMs into a "Jailbroken State" Without "a Jailbreak Attack".
xirui-li/DrAttack
Official implementation of paper: DrAttack: Prompt Decomposition and Reconstruction Makes...
yiksiu-chan/SpeakEasy
[ICML 2025] Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions
tmlr-group/DeepInception
[arXiv:2311.03191] "DeepInception: Hypnotize Large Language Model to Be Jailbreaker"
Techiral/awesome-llm-jailbreaks
Latest AI Jailbreak Payloads & Exploit Techniques for GPT, QWEN, and all LLM Models