ntsation/personal-data-pseudonymizer
The Personal Data Pseudonymizer is a Python script that masks sensitive personal information in text. It detects named entities such as people's names, locations, phone numbers, and email addresses and replaces them with asterisks (*).
You supply a document or block of text containing personal data, and the script returns a copy in which those details are masked. This is useful when text has to be shared, stored, or analyzed without exposing private information.
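To make the masking idea concrete, here is a minimal sketch of asterisk replacement for two of the entity types mentioned above, email addresses and phone numbers. It is not the repository's code: the regular expressions, the function names (`mask`, `mask_contacts`), and the choice to emit one asterisk per masked character are all assumptions made for illustration. Names and locations are left out because detecting them generally needs a named-entity recognizer, which this listing does not describe.

```python
import re

# Illustrative patterns only (assumptions); the script's real detection
# rules are not shown on this page.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def mask(match):
    """Replace a matched span with asterisks of the same length."""
    return "*" * len(match.group(0))


def mask_contacts(text):
    """Mask email addresses first, then phone-like digit sequences."""
    text = EMAIL_RE.sub(mask, text)
    return PHONE_RE.sub(mask, text)


if __name__ == "__main__":
    print(mask_contacts("Mail jane@example.com or call 555-123-4567."))
```

Masking emails before phone numbers keeps digits inside an address from being picked up by the phone pattern. Simple patterns like these will miss many real-world formats, so they should be treated as a starting point rather than a complete detector.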
No commits in the last 6 months.
Use this if you need to quickly anonymize text documents to protect personal information before sharing or processing them.
Not ideal if you need a more sophisticated pseudonymization method beyond simple masking with asterisks, or if your text is not in English.
Stars: 4
Forks: —
Language: Python
License: —
Category:
Last pushed: Dec 30, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/ntsation/personal-data-pseudonymizer"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
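The same request can be sketched in Python using only the standard library. Two things here are assumptions, not documented behavior: that the path segments after `/quality/` are category, owner, and repository (inferred from the single URL above), and that the endpoint returns JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL; the segment order is inferred from the curl example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch and decode the response, assuming (unverified) that it is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network request and counts against the daily limit.
    print(fetch_quality("nlp", "ntsation", "personal-data-pseudonymizer"))
```

Because unauthenticated access is limited to 100 requests per day, it makes sense to cache results locally rather than call `fetch_quality` repeatedly for the same repository.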
Higher-rated alternatives
vmenger/deduce
Deduce: de-identification method for Dutch medical text
DataFog/datafog-python
Python SDK for PII detection and redaction in text and images, combining regex + NLP pipelines...
aphp/eds-pseudo
EDS-Pseudo is a hybrid model for detecting personally identifying entities in clinical reports
seanpedrick-case/doc_redaction
Redact PDF/image-based documents, Word, or CSV/XLSX files using a graphical user interface....
martincjespersen/DaAnonymization
Simple customizable pipeline tool for anonymizing Danish text.