thu-coai/Safety-Prompts

Chinese safety prompts for evaluating and improving the safety of LLMs.

Overall score: 43 / 100 (Emerging)

Contains 100k Chinese safety prompts across 7 typical scenarios (insult, discrimination, crimes, physical harm, mental health, privacy, ethics) and 6 instruction-attack types, each paired with a ChatGPT response for training safer models. The data is accessible via Hugging Face Datasets and organized in JSON format, and is designed primarily for fine-tuning rather than evaluation; the project recommends SafetyBench for benchmarking. It complements the broader Safety-Prompts ecosystem, including ShieldLM (a customizable safety detector), and integrates with Tsinghua's Chinese LLM safety-evaluation platform.
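Since the summary describes the data as JSON prompt/response pairs grouped by scenario, a minimal sketch of working with such a file is shown below. The record layout (a scenario name mapping to a list of `{"prompt", "response"}` objects) and the field names are assumptions for illustration, not the repo's documented schema:

```python
import json
from collections import Counter

# Hypothetical miniature of the dataset layout: scenario name -> list of
# {"prompt", "response"} records. Field names are assumed for illustration.
raw = '''
{
  "Insult": [
    {"prompt": "…", "response": "…"}
  ],
  "Privacy": [
    {"prompt": "…", "response": "…"},
    {"prompt": "…", "response": "…"}
  ]
}
'''

data = json.loads(raw)

# Count records per scenario, e.g. to balance a fine-tuning mix.
counts = Counter({scenario: len(records) for scenario, records in data.items()})
print(counts)

# Flatten into (prompt, response) pairs for supervised fine-tuning.
pairs = [(r["prompt"], r["response"]) for records in data.values() for r in records]
print(len(pairs))
```

The same loop works whether you download the JSON directly from the repo or export it from Hugging Face Datasets.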

1,135 stars. No commits in the last 6 months.

Flags: Stale (6 months) · No package · No dependents

Maintenance: 0 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 17 / 25


Stars: 1,135
Forks: 88
Language: (none listed)
License: Apache-2.0
Category: guardrails
Last pushed: Feb 27, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/thu-coai/Safety-Prompts"

Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
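For programmatic access, a small sketch that builds the endpoint URL (the path shape is taken from the curl command above) and fetches the payload. The response schema is not documented here, so `fetch_quality` simply decodes whatever JSON comes back; it is defined but not called in this example:

```python
import json
import urllib.request
from urllib.parse import quote

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering"

def quality_url(owner: str, repo: str) -> str:
    # Build the quality-endpoint URL for a GitHub repo,
    # URL-encoding each path segment defensively.
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(repo, safe='')}"

def fetch_quality(owner: str, repo: str) -> dict:
    # Fetch and decode the JSON payload; the schema is undocumented here.
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

print(quality_url("thu-coai", "Safety-Prompts"))
# e.g. data = fetch_quality("thu-coai", "Safety-Prompts")
```

Without an API key this stays within the 100 requests/day limit, so cache responses rather than refetching per page load.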