Legal Document Processing NLP Tools

Tools for extracting structured information from legal documents, parsing legal text, identifying legal concepts and citations, and organizing legal data. The category does NOT include general contract analysis, legal research databases, or law-specific knowledge bases without document processing components.

39 legal document processing tools are tracked. One scores above 50 (Established tier). The highest-rated is discopy/discopy at 68/100, with 406 stars and 4,054 monthly downloads.

Get all 39 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=legal-document-processing&limit=20"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
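For scripted access, the curl request above can be reproduced in Python. This is a minimal sketch using only the standard library; the helper names `build_url` and `fetch` are hypothetical, and the shape of the JSON response is not documented here, so the code returns whatever the endpoint sends without assuming any field names. Note that the example request sets `limit=20`; whether a larger value returns all 39 projects is an assumption that would need checking against the API.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example; everything else is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="legal-document-processing", limit=20):
    """Build the same query string as the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch(url, timeout=30):
    """Download and decode the JSON payload (response schema is not assumed)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch(build_url())` performs the request and counts against the daily quota of 100 keyless requests, so it is worth caching the result locally when experimenting.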

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | discopy/discopy | The Python toolkit for computing with string diagrams. | 68 | Established |
| 2 | jblake1965/eluciDoc | Screens legal text and extracts sentences containing user input party... | 49 | Emerging |
| 3 | LexPredict/lexpredict-lexnlp | LexNLP by LexPredict | 44 | Emerging |
| 4 | Liquid-Legal-Institute/Legal-Text-Analytics | A list of selected resources, methods, and tools dedicated to Legal Text Analytics. | 44 | Emerging |
| 5 | Neplex/ArchiTXT | ArchiTXT is an open source Python library that transforms unstructured text... | 41 | Emerging |
| 6 | openlegaldata/awesome-legal-data | A collection of datasets and other resources for legal text processing. | 41 | Emerging |
| 7 | Legilibre/legi.py | Tools for manipulating the LEGI archives (French laws) | 41 | Emerging |
| 8 | LukaVuli/Entity_Neutering | A methodology to pre-process text data for preventing lookahead bias in... | 38 | Emerging |
| 9 | maastrichtlawtech/bsard | 🔍 A statutory article retrieval dataset in French. (ACL 2022) | 33 | Emerging |
| 10 | yinhao0214/ParseLawDocuments | Performs a series of analyses on collected legal documents, including automatic segmentation according to standard formats, case similarity computation, case clustering, and recommendation of legal provisions (experiments are currently based on marriage cases and can be extended to other domains). | 33 | Emerging |
| 11 | nokia/codesearch | Models and datasets for annotated code search. | 32 | Emerging |
| 12 | ondata/normattiva_2_md | Transforms the texts of Italian laws into a readable format ready for... | 32 | Emerging |
| 13 | DerwenAI/arxiv-trends | Analyze trends in articles published on arXiv | 30 | Emerging |
| 14 | medelman17/eyecite-ts | TypeScript legal citation extraction library with zero dependencies.... | 30 | Emerging |
| 15 | zeeuws-archief/ArchiveTextMiner | Transform textual information to structured metadata in MDTO-format. | 29 | Experimental |
| 16 | Starscream-11813/MathBot | MathBot is a transformer-based Math Word Problem (MWP) solver made as the... | 28 | Experimental |
| 17 | MI2DataLab/HADES | A powerful tool for comparing similarly structured documents | 28 | Experimental |
| 18 | liamcripwell/disco_split | Code and data for discourse-based sentence splitting experiments. | 27 | Experimental |
| 19 | tvhahn/arxiv-code-search | Do authors on arXiv make their code and data available? We're building text... | 27 | Experimental |
| 20 | george-gca/ai_papers_cleaner | Extract text from papers PDFs and abstracts, and remove uninformative words. | 26 | Experimental |
| 21 | mastaal/uitspraken | A simple Python program to easily load in Dutch court decision XML-files as... | 24 | Experimental |
| 22 | mastaal/nllegalcit | A Python library to find citations to Dutch legal documents in natural... | 24 | Experimental |
| 23 | organvm-i-theoria/linguistic-atomization-framework | LingFrame, a computational rhetoric platform: hierarchical text atomization,... | 23 | Experimental |
| 24 | openeventdata/PLOVER | Next generation event data ontology | 22 | Experimental |
| 25 | thejeswi/BobGoesToJail | A semantic law interpreter for the English translations for the German... | 22 | Experimental |
| 26 | bflashcp3f/textlabs-xwlp-code | EACL 2021 "Process-Level Representation of Scientific Protocols with... | 22 | Experimental |
| 27 | MUSC-TBIC/etude-engine | ETUDE (Evaluation Tool for Unstructured Data and Extractions) is a... | 20 | Experimental |
| 28 | chigwell/legalysis | legalysis extracts parties, issues, outcomes, and lessons from case texts... | 20 | Experimental |
| 29 | DaBr01/AGB-DE | A corpus and models for the automated legal assessment of clauses in German... | 18 | Experimental |
| 30 | fanta-mnix/nlp-contract-analysis | NLP-based Contract Analysis | 14 | Experimental |
| 31 | phHartl/eu-judgement-analyse | Quantitative analysis of judgments of the European Court of Justice | 13 | Experimental |
| 32 | justmars/citation-utils | Docket citation regexes from Philippine Supreme Court decisions | 12 | Experimental |
| 33 | TLP-COI/tlp-coi-docs | Governance, contribution guidance, and project-planning documentation for the TLP-CoI | 11 | Experimental |
| 34 | ssciwr/argumentation-management | Annotator combining different NLP pipelines. | 11 | Experimental |
| 35 | justmars/citation-report | Parse legal citations having the publisher format - i.e. SCRA, PHIL, OFFG -... | 11 | Experimental |
| 36 | justmars/citation-date | This is a dependency: a regex date formula and decoder for dates referenced... | 11 | Experimental |
| 37 | berkearda/croissantminer | Automated metadata extraction from ML dataset papers using LLMs and the... | 11 | Experimental |
| 38 | innerNULL/monoml | Mono Implementations' Archive | 11 | Experimental |
| 39 | askmuhsin/legal_maxims | Legal maxims dataset | 10 | Experimental |
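
The tier labels in the listing above appear to follow score bands. The sketch below encodes the cutoffs that are consistent with the listed rows (68 is Established, 49 down to 30 are Emerging, 29 and below are Experimental). These thresholds are inferred from the data, not taken from any published scoring rule, and the function name `tier_for` is hypothetical. In particular, the page says "above 50", so how a score of exactly 50 is classified is an assumption here.

```python
def tier_for(score):
    """Map a 0-100 quality score to a tier label.

    Cutoffs are inferred from the listing, not from a documented rule.
    """
    if score >= 50:  # assumption: a score of exactly 50 counts as Established
        return "Established"
    if score >= 30:  # 30 is the lowest score labeled Emerging in the listing
        return "Emerging"
    return "Experimental"  # 29 is the highest score labeled Experimental
```

A mapping like this is mainly useful for filtering downloaded results client-side, for example keeping only projects at or above the Emerging band.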