davidemodolo/malicious_llm_finetuning
Proof of concept demonstrating backdoor injection into fine-tuned LLMs using LoRA. Shows how supposedly "open weights" models can be secretly compromised to harbor malicious behaviors triggered by specific patterns (email addresses).
Stars
—
Forks
—
Language
Python
License
—
Category
Last pushed
Mar 18, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/davidemodolo/malicious_llm_finetuning"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
google/scaaml
SCAAML: Side Channel Attacks Assisted with Machine Learning
Koukyosyumei/AIJack
Security and Privacy Risk Simulator for Machine Learning (arXiv:2312.17667)
pralab/secml
A Python library for Secure and Explainable Machine Learning
AI-SDC/SACRO-ML
Collection of tools and resources for managing the statistical disclosure control of trained...
liuyugeng/ML-Doctor
Code for ML Doctor