git-disl/Lisa
This is the official code for the paper "Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning" (NeurIPS2024)
No commits in the last 6 months.
Stars
26
Forks
—
Language
Python
License
Apache-2.0
Category
Last pushed
Sep 10, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/git-disl/Lisa"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
steering-vectors/steering-vectors
Steering vectors for transformer language models in Pytorch / Huggingface
jianghoucheng/AlphaEdit
AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models, ICLR 2025 (Outstanding Paper)
kmeng01/memit
Mass-editing thousands of facts into a transformer memory (ICLR 2023)
boyiwei/alignment-attribution-code
[ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
jianghoucheng/AnyEdit
AnyEdit: Edit Any Knowledge Encoded in Language Models, ICML 2025