redhuntlabs/Octopii

An AI-powered Personally Identifiable Information (PII) scanner.

Score: 35 / 100 (Emerging)

Combines OCR (Tesseract), regex pattern matching, and NLP (spaCy/NLTK) to extract and classify PII in images, PDFs, and documents, and uses Haar cascades for face detection. It scans local filesystems, S3 buckets, and Apache open directory listings, and outputs structured JSON listing the detected identifiers, contact information, and geolocation data.
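
To make the regex stage of the description concrete, here is a minimal sketch of pattern-based PII extraction over text that has already been extracted (for example by OCR). The pattern set, the category names, and the `scan_text` helper are illustrative assumptions, not Octopii's actual rules or output schema.

```python
import json
import re

# Illustrative patterns only (assumptions); a real scanner would use a much
# larger, country-specific rule set and validate matches further.
PATTERNS = {
    "emails": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone_numbers": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def scan_text(text):
    """Return a dict mapping each PII category to its unique, sorted matches."""
    findings = {}
    for name, pattern in PATTERNS.items():
        findings[name] = sorted(set(pattern.findall(text)))
    return findings


if __name__ == "__main__":
    sample = "Contact Jane at jane.doe@example.com or +1 415 555 0100."
    # Emit the findings as structured JSON, mirroring the style of output
    # the description mentions (the exact schema here is hypothetical).
    print(json.dumps(scan_text(sample), indent=2))
```

Regex matching of this kind is cheap but prone to false positives, which is presumably why the tool pairs it with NLP-based classification.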

725 stars. No commits in the last 6 months.

Stale (6m), No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 9 / 25
Community 16 / 25

Stars: 725
Forks: 63
Language: Python
License:
Last pushed: Jan 22, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/redhuntlabs/Octopii"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
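
The curl request above can be reproduced in Python with only the standard library. This is a sketch under two assumptions: that the endpoint returns JSON, and that the path follows the category/owner/repo layout seen in the example URL. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the response schema is not documented here, so the result is returned unparsed beyond JSON decoding.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, repo):
    """Build the endpoint URL; the slash in "owner/name" is kept as-is."""
    return f"{BASE_URL}/{quote(category)}/{quote(repo, safe='/')}"


def fetch_quality(category, repo, timeout=10):
    """Fetch the record for one repository (assumes a JSON response body)."""
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Example usage (performs a network request and counts against the daily limit):
#   data = fetch_quality("ml-frameworks", "redhuntlabs/Octopii")
#   print(json.dumps(data, indent=2))
```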