PII Detection and Redaction NLP Tools
Tools for detecting, masking, and redacting personally identifiable information (PII) in text, images, and documents. This category does NOT include privacy policy analysis, general data anonymization frameworks, or data leak detection platforms.
41 PII detection and redaction tools are tracked. Two score above 50 (established tier). The highest-rated is DataFog/datafog-python at 66/100, with 48 stars and 28,287 monthly downloads.
Get all 41 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=pii-detection-redaction&limit=20"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
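The same request can be made from a script. The following is a minimal sketch using only the Python standard library; the endpoint and query parameters are copied from the curl command above, while the helper names `build_url` and `fetch_projects` are hypothetical. The shape of the JSON response is not documented here, so the sketch returns the parsed payload as-is. Note that the request above uses `limit=20`; whether a larger value is accepted is not stated.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="pii-detection-redaction", limit=20):
    """Build the dataset URL (hypothetical helper; defaults mirror the curl example)."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and return the parsed JSON.

    The response structure is an unknown here, so no fields are assumed.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example usage (performs a network request and counts against the daily quota):
# data = fetch_projects(build_url())
# print(json.dumps(data, indent=2)[:500])
```

Keeping URL construction separate from the network call makes it easy to check the query string without spending any of the 100 keyless requests per day.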
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | DataFog/datafog-python<br>Python SDK for PII detection and redaction in text and images, combining... | | Established |
| 2 | vmenger/deduce<br>Deduce: de-identification method for Dutch medical text | | Established |
| 3 | martincjespersen/DaAnonymization<br>Simple customizable pipeline tool for anonymizing Danish text. | | Emerging |
| 4 | aphp/eds-pseudo<br>EDS-Pseudo is a hybrid model for detecting personally identifying entities... | | Emerging |
| 5 | seanpedrick-case/doc_redaction<br>Redact PDF/image-based documents, Word, or CSV/XLSX files using a graphical... | | Emerging |
| 6 | DilawarShafiq/phi-redactor<br>HIPAA-native PHI redaction proxy for AI/LLM interactions. Detects and masks... | | Emerging |
| 7 | thoughtbot/top_secret<br>Filter sensitive information from free text before sending it to external... | | Emerging |
| 8 | dimanjet/piicloak<br>Enterprise-grade PII detection and anonymization REST API built on Presidio | | Emerging |
| 9 | SMI/IsIdentifiable<br>A tool for detecting identifiable information in data sources (CSV, DICOM,... | | Emerging |
| 10 | icescentral/MASK_public<br>Masking identifiable information from health related documents. | | Emerging |
| 11 | edwardcooper/data-sentry<br>A project to build a machine learning pipeline to detect personal... | | Emerging |
| 12 | ahmedbesbes/anonymization-api<br>How to build and deploy an anonymization API with FastAPI and spaCy | | Emerging |
| 13 | databricks-industry-solutions/ocr-phi-masking<br>Our joint Solution Accelerator with John Snow Labs automates the detection... | | Experimental |
| 14 | jftuga/deidentification<br>Deidentify people's names and gender specific pronouns | | Experimental |
| 15 | vmenger/docdeid<br>Create your own document de-identifier using docdeid, a simple framework... | | Experimental |
| 16 | Welding-Torch/Excel-Anonymizer<br>A Python script that anonymizes an Excel file and synthesizes new data in its place. | | Experimental |
| 17 | HC200ok/manual-data-masking<br>A lightweight JavaScript library for manual data masking | | Experimental |
| 18 | awsaf49/pii-data-detection<br>The Learning Agency Lab - PII Data Detection \|\| Develop automated techniques... | | Experimental |
| 19 | zacharykzhao/CA4P-483<br>NLP dataset: Chinese Android Privacy Policy Dataset | | Experimental |
| 20 | worka-ai/pii<br>A library to identify and help redact Personally Identifiable Information... | | Experimental |
| 21 | ahmedbesbes/anonymizer<br>Text Anonymization app with Streamlit and spaCy | | Experimental |
| 22 | OmkarPathak/piiscrub<br>A blazing-fast, zero-dependency PII scrubbing engine for LLMs. Multi-core... | | Experimental |
| 23 | swissprismia/pii-shield<br>System-wide PII & secret detection for your clipboard. Tokenizes sensitive... | | Experimental |
| 24 | marichu-kt/PrivScore<br>Browser extension (Chrome/Brave/Edge) that evaluates the "health" of... | | Experimental |
| 25 | sagarcs818/Trustify-privacy-analyzer<br>An AI-powered privacy analyzer for apps and websites. Most users blindly... | | Experimental |
| 26 | iYassr/maskr<br>Detect & mask PII in documents - 100% offline. Names, emails, phones, SSNs,... | | Experimental |
| 27 | hsleonis/pii-detection-group<br>Research Group works on PII Detection | | Experimental |
| 28 | AbhilashaRavichander/PrivacyQA_EMNLP<br>PrivacyQA, a resource to support question-answering over privacy policies. | | Experimental |
| 29 | PriyeshDave/Document-Redaction<br>This project revolves around the ability to recognise sensitive words within... | | Experimental |
| 30 | sdsc-ordes/deid-module<br>Text deidentification module. | | Experimental |
| 31 | spak2005/AI_privacy_layer<br>A stateless LLM API proxy that tokenizes PII using parallel regex + NER... | | Experimental |
| 32 | 0xjgv/inconnu<br>Data privacy tool, for fast & thorough anonymization/pseudonymization, easy... | | Experimental |
| 33 | biagiocornacchia/microsoft-presidio-using-grpc<br>Implementation of a Distributed Personal Information Recognition System that... | | Experimental |
| 34 | SwissFederalArchives/tcc-metadata-anonymization<br>A named-entity-recognition (NER) based anonymizer for archival documents metadata. | | Experimental |
| 35 | nedap/mdpi2021-textgen<br>Source code for the paper "Generating Synthetic Training Data for Supervised... | | Experimental |
| 36 | F2u0a0d3/TrustRead<br>Analyze any privacy policy with AI: see risks, data use, rights, and a 0–100... | | Experimental |
| 37 | sonu-gupta/Doxing-on-Twitter<br>This repository contains my work on the prevention and anonymization of dox... | | Experimental |
| 38 | crisp-du/ppevo<br>Evolution of Privacy Policies | | Experimental |
| 39 | gattil/realtime-anonymisation-phiidata<br>Near real-time identification and redaction of PII and PHI in data stream... | | Experimental |
| 40 | Th3Tr00p3r/PrivacyPolicy<br>PPA breaks down privacy policies, aiming to simplify their understanding. By... | | Experimental |
| 41 | Biswas-N/Redactor<br>Redactor is a Python-based utility tool used to redact sensitive... | | Experimental |