thu-coai/Safety-Prompts
Chinese safety prompts for evaluating and improving the safety of LLMs.
Contains 100k Chinese safety prompts covering 7 typical safety scenarios (insult, discrimination, crimes, physical harm, mental health, privacy, ethics) and 6 instruction-attack types, each paired with a ChatGPT response for training safer models. The data is organized in JSON format and accessible via Hugging Face Datasets; it is designed primarily for fine-tuning rather than evaluation, and the project recommends SafetyBench for benchmarking. The repository complements the broader Safety-Prompts ecosystem, including ShieldLM (a customizable safety detector), and integrates with Tsinghua's Chinese LLM safety evaluation platform.
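Since the data is distributed as JSON keyed by scenario, a fine-tuning pipeline typically flattens it into (scenario, prompt, response) triples. A minimal sketch of that step is below; the field names ("prompt", "response") and the scenario-keyed layout are assumptions inferred from the description above, not a confirmed schema, so check the actual data files before relying on them.

```python
import json

# Assumed layout: a dict mapping scenario names to lists of
# {"prompt": ..., "response": ...} records. Placeholder sample:
sample = """
{
  "Insult": [
    {"prompt": "...", "response": "..."}
  ],
  "Privacy": [
    {"prompt": "...", "response": "..."}
  ]
}
"""

data = json.loads(sample)

# Flatten into (scenario, prompt, response) triples for fine-tuning.
records = [
    (scenario, item["prompt"], item["response"])
    for scenario, items in data.items()
    for item in items
]
print(len(records))  # 2 records in this toy sample
```

The same flattening works for the instruction-attack files if they follow the same record shape.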
1,135 stars. No commits in the last 6 months.
Stars
1,135
Forks
88
Language
—
License
Apache-2.0
Category
Last pushed
Feb 27, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/thu-coai/Safety-Prompts"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related tools
x-hannibal/open-webui-easymage
Multi-engine image generation filter for Open WebUI. Features automated prompt enhancement,...
ahmadbuilds/multi-agent-hr-assistant
An autonomous, multi-agent HR service desk. It uses a supervisor architecture to route employee...
Shaw1011/prompt-lint
A linter for LLM system prompts - detects contradictions, injection risks, security...
Dewensong/email-marketing-skill
Reusable email marketing skill with local-first setup, knowledge-driven drafting, SMTP/IMAP...
wisterx-spec/agent-rails
Opinionated workflow framework for AI-assisted development — rules, skills &...