PII Detection and Redaction NLP Tools

Tools for detecting, masking, and redacting personally identifiable information (PII) in text, images, and documents. This category does NOT include privacy policy analysis, general data anonymization frameworks, or data leak detection platforms.

41 PII detection and redaction tools are tracked. Two score above 50 (Established tier). The highest-rated is DataFog/datafog-python at 66/100, with 48 stars and 28,287 monthly downloads.

Get all 41 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=pii-detection-redaction&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
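The same request can be made from Python. The sketch below uses only the standard library and the URL and query parameters from the curl command above. The helper names `build_url` and `fetch_dataset` are illustrative, not part of any published client. The response schema is not documented here, so the code returns the decoded JSON without assuming any field names. Note that the curl command passes `limit=20`; whether the API accepts a larger value to return all 41 projects in one call is an assumption to verify.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="pii-detection-redaction", limit=20):
    """Compose the dataset URL from the query parameters seen in the curl command."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_dataset(url, timeout=30):
    """Fetch the URL and decode the body as JSON (performs a network request).

    The shape of the returned object is not assumed; inspect it before use.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_dataset(build_url())` issues one keyless request, which counts against the 100 requests/day limit. How a free key is passed to the API is not stated on this page, so it is left out of the sketch.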

| # | Tool | Description | Score | Tier |
| --- | --- | --- | --- | --- |
| 1 | DataFog/datafog-python | Python SDK for PII detection and redaction in text and images, combining... | 66 | Established |
| 2 | vmenger/deduce | Deduce: de-identification method for Dutch medical text | 64 | Established |
| 3 | martincjespersen/DaAnonymization | Simple customizable pipeline tool for anonymizing Danish text. | 41 | Emerging |
| 4 | aphp/eds-pseudo | EDS-Pseudo is a hybrid model for detecting personally identifying entities... | 41 | Emerging |
| 5 | seanpedrick-case/doc_redaction | Redact PDF/image-based documents, Word, or CSV/XLSX files using a graphical... | 39 | Emerging |
| 6 | DilawarShafiq/phi-redactor | HIPAA-native PHI redaction proxy for AI/LLM interactions. Detects and masks... | 37 | Emerging |
| 7 | thoughtbot/top_secret | Filter sensitive information from free text before sending it to external... | 37 | Emerging |
| 8 | dimanjet/piicloak | Enterprise-grade PII detection and anonymization REST API built on Presidio | 35 | Emerging |
| 9 | SMI/IsIdentifiable | A tool for detecting identifiable information in data sources (CSV, DICOM,... | 34 | Emerging |
| 10 | icescentral/MASK_public | Masking identifiable information from health-related documents. | 32 | Emerging |
| 11 | edwardcooper/data-sentry | A project to build a machine learning pipeline to detect personal... | 32 | Emerging |
| 12 | ahmedbesbes/anonymization-api | How to build and deploy an anonymization API with FastAPI and spaCy | 30 | Emerging |
| 13 | databricks-industry-solutions/ocr-phi-masking | Our joint Solution Accelerator with John Snow Labs automates the detection... | 29 | Experimental |
| 14 | jftuga/deidentification | Deidentify people's names and gender-specific pronouns | 29 | Experimental |
| 15 | vmenger/docdeid | Create your own document de-identifier using docdeid, a simple framework... | 28 | Experimental |
| 16 | Welding-Torch/Excel-Anonymizer | A Python script that anonymizes an Excel file and synthesizes new data in its place. | 28 | Experimental |
| 17 | HC200ok/manual-data-masking | A lightweight JavaScript library for manual data masking | 28 | Experimental |
| 18 | awsaf49/pii-data-detection | The Learning Agency Lab - PII Data Detection \|\| Develop automated techniques... | 26 | Experimental |
| 19 | zacharykzhao/CA4P-483 | NLP dataset: Chinese Android Privacy Policy Dataset | 26 | Experimental |
| 20 | worka-ai/pii | A library to identify and help redact Personally Identifiable Information... | 26 | Experimental |
| 21 | ahmedbesbes/anonymizer | Text anonymization app with Streamlit and spaCy | 25 | Experimental |
| 22 | OmkarPathak/piiscrub | A blazing-fast, zero-dependency PII scrubbing engine for LLMs. Multi-core... | 25 | Experimental |
| 23 | swissprismia/pii-shield | System-wide PII & secret detection for your clipboard. Tokenizes sensitive... | 24 | Experimental |
| 24 | marichu-kt/PrivScore | Browser extension (Chrome/Brave/Edge) that evaluates the “health” of... | 23 | Experimental |
| 25 | sagarcs818/Trustify-privacy-analyzer | An AI-powered privacy analyzer for apps and websites. Most users blindly... | 22 | Experimental |
| 26 | iYassr/maskr | Detect & mask PII in documents - 100% offline. Names, emails, phones, SSNs,... | 19 | Experimental |
| 27 | hsleonis/pii-detection-group | Research group working on PII detection | 17 | Experimental |
| 28 | AbhilashaRavichander/PrivacyQA_EMNLP | PrivacyQA, a resource to support question-answering over privacy policies. | 17 | Experimental |
| 29 | PriyeshDave/Document-Redaction | This project revolves around the ability to recognise sensitive words within... | 16 | Experimental |
| 30 | sdsc-ordes/deid-module | Text deidentification module. | 15 | Experimental |
| 31 | spak2005/AI_privacy_layer | A stateless LLM API proxy that tokenizes PII using parallel regex + NER... | 15 | Experimental |
| 32 | 0xjgv/inconnu | Data privacy tool, for fast & thorough anonymization/pseudonymization, easy... | 15 | Experimental |
| 33 | biagiocornacchia/microsoft-presidio-using-grpc | Implementation of a Distributed Personal Information Recognition System that... | 13 | Experimental |
| 34 | SwissFederalArchives/tcc-metadata-anonymization | A named-entity recognition (NER) based anonymizer for archival document metadata. | 13 | Experimental |
| 35 | nedap/mdpi2021-textgen | Source code for the paper "Generating Synthetic Training Data for Supervised... | 13 | Experimental |
| 36 | F2u0a0d3/TrustRead | Analyze any privacy policy with AI: see risks, data use, rights, and a 0–100... | 12 | Experimental |
| 37 | sonu-gupta/Doxing-on-Twitter | This repository contains my work on the prevention and anonymization of dox... | 12 | Experimental |
| 38 | crisp-du/ppevo | Evolution of Privacy Policies | 11 | Experimental |
| 39 | gattil/realtime-anonymisation-phiidata | Near real-time identification and redaction of PII and PHI in data stream... | 11 | Experimental |
| 40 | Th3Tr00p3r/PrivacyPolicy | PPA breaks down privacy policies, aiming to simplify their understanding. By... | 10 | Experimental |
| 41 | Biswas-N/Redactor | Redactor is a Python-based utility tool used to redact sensitive... | 10 | Experimental |