sleeepeer/PoisonedRAG

[USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models

/ 100

Established

Implements both black-box (LM-targeted) and white-box (HotFlip) poisoning attacks on retrieval corpora, targeting popular RAG retriever-LLM pairs including Contriever with GPT-3.5/4, PaLM 2, and LLaMA. Evaluates attacks across BEIR benchmark datasets (NQ, HotpotQA, MS-MARCO) with configurable hyperparameters for adversarial document generation and ranking manipulation. Integrates with Hugging Face model APIs and supports local model deployment via FastChat for reproducible adversarial evaluation.

242 stars.

No Package No Dependents

Maintenance 10 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 19 / 25

How are scores calculated?

Stars

242

Forks

Language

Python

License

MIT

Related tools

LLAMATOR-Core/llamator

Red Teaming python-framework for testing chatbots and GenAI systems.

JuliusHenke/autopentest

CLI enabling more autonomous black-box penetration tests using Large Language Models (LLMs)

kelkalot/simpleaudit

Allows to red-team your AI systems through adversarial probing. It is simple, effective, and...

SecurityClaw/SecurityClaw

A modular, skill-based autonomous Security Operations Center (SOC) agent that monitors...

AI-secure/AgentPoison

[NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or...

Explore RAG Tools

All categories Trending RAG directory Insights