platisd/duplicate-code-detection-tool
A simple Python3 tool to detect similarities between files within a repository
Leverages gensim's document similarity models to compute semantic similarity between source files across C, C++, Java, Python, and C# codebases. Available as both a CLI tool and a GitHub Action that integrates directly into pull request workflows, with configurable thresholds for reporting and failure conditions. Includes pre-commit hook support. The token-based NLP analysis becomes more accurate as project size increases.
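To make the idea of token-based similarity concrete, here is a minimal sketch in plain Python. It is not the tool's implementation: it swaps gensim's models for a hand-rolled TF-IDF weighting with cosine similarity, and every name in it (`tokenize`, `similarity_matrix`, `pairs_above`, the sample file names) is hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(source):
    """Split source text into lowercase identifier and number tokens."""
    return re.findall(r"[A-Za-z_]\w*|\d+", source.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # Smoothed IDF: tokens present in every file keep a small weight.
        vectors.append({
            tok: count * (math.log((1 + n) / (1 + doc_freq[tok])) + 1.0)
            for tok, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(tok, 0.0) for tok, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_matrix(files):
    """Map each unordered file pair to a similarity percentage (0 to 100)."""
    names = sorted(files)
    vectors = dict(zip(names, tfidf_vectors([tokenize(files[n]) for n in names])))
    return {
        (a, b): 100.0 * cosine(vectors[a], vectors[b])
        for a in names for b in names if a < b
    }


def pairs_above(matrix, threshold):
    """Hypothetical threshold check: list the pairs scoring at or above threshold."""
    return sorted(pair for pair, score in matrix.items() if score >= threshold)


# Made-up sample input: two identical files and one unrelated file.
files = {
    "a.py": "def add(x, y):\n    return x + y\n",
    "b.py": "def add(x, y):\n    return x + y\n",
    "c.py": "class Logger:\n    pass\n",
}
matrix = similarity_matrix(files)
flagged = pairs_above(matrix, 80.0)
```

Because the IDF weights are computed over all files at once, a larger corpus gives better estimates of which tokens are common boilerplate and which are distinctive, which is one plausible reading of why accuracy improves with project size.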
203 stars. No commits in the last 6 months.
Stars: 203
Forks: 34
Language: Python
License: MIT
Category:
Last pushed: Jun 01, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/platisd/duplicate-code-detection-tool"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
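The same request can be made from Python with only the standard library. This is a sketch under two assumptions that the page does not confirm: that the path follows a category/owner/repository layout (inferred from the single URL shown above) and that the endpoint returns JSON. The function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL; the path layout is inferred from one example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch and decode the response, assuming (unverified) a JSON body."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = quality_url("nlp", "platisd", "duplicate-code-detection-tool")
```

Calling `fetch_quality("nlp", "platisd", "duplicate-code-detection-tool")` performs the network request; with no key, each call counts against the 100 requests/day limit.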
Higher-rated alternatives
fossology/safaa
Agent to complement FOSSology's copyright scanner and find false positive findings.
hhhhhhhhhn/HookeJs
An open source plagiarism detector built in node.
Crypt0knights/Plagiarism-Detector
A Web Platform to detect plagiarised documents or plain text.
izikeros/sentence-plagiarism
Compare sentences from input document with all sentences from reference documents - find very...
noorkhokhar99/Plagiarsim-Checker
Plagiarism checker using cosine similarity.