lavis-nlp/german_legal_sentences
A dataset of semantically related sentence pairs in the German legal domain
This project provides a collection of German legal sentences intended to improve search over legal documents. It extracts pairs of semantically related sentences from raw German court decisions, often treating sentences that share a legal citation as related. The dataset can be used to train search systems that let legal professionals, researchers, and the general public find relevant legal information without knowing specific legal terminology.
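The idea of relating sentences through shared citations can be sketched in a few lines. This is only an illustration, not the project's actual extraction pipeline: the function name `pair_by_shared_citation`, the input layout (each sentence already paired with the set of citations found in it), and the three example sentences are all assumptions made up for this sketch.

```python
from collections import defaultdict
from itertools import combinations


def pair_by_shared_citation(sentences):
    """Return sorted index pairs of sentences that share at least one citation.

    `sentences` is a list of (text, citations) tuples, where `citations`
    is a set of citation strings. This input layout is hypothetical.
    """
    # Map each citation to the indices of the sentences that contain it.
    by_citation = defaultdict(list)
    for index, (_text, citations) in enumerate(sentences):
        for citation in citations:
            by_citation[citation].append(index)

    # Every two sentences listed under the same citation form a related pair.
    # A set removes duplicates when two sentences share several citations.
    pairs = set()
    for indices in by_citation.values():
        pairs.update(combinations(indices, 2))
    return sorted(pairs)


# Made-up example sentences, not taken from the dataset.
sentences = [
    ("Der Anspruch folgt aus § 823 Abs. 1 BGB.", {"§ 823 Abs. 1 BGB"}),
    ("Eine Haftung nach § 823 Abs. 1 BGB setzt Verschulden voraus.", {"§ 823 Abs. 1 BGB"}),
    ("Die Kostenentscheidung beruht auf § 91 ZPO.", {"§ 91 ZPO"}),
]
pairs = pair_by_shared_citation(sentences)
print(pairs)
```

Here the first two sentences cite the same provision and end up paired, while the third stays unpaired. A real pipeline would also need to detect and normalize the citations in the raw text, which this sketch leaves out.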
No commits in the last 6 months.
Use this if you are developing or evaluating a search system for German legal documents and need high-quality, semantically linked sentence pairs for training.
Not ideal if your focus is on legal texts in languages other than German, or if you need to analyze the full structure of legal documents rather than sentence-level relationships.
Stars: 10
Forks: —
Language: —
License: —
Category:
Last pushed: Feb 26, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/lavis-nlp/german_legal_sentences"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
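The same request can be made from Python with the standard library. The sketch below assumes that the endpoint path follows an owner/repository pattern, which is inferred from the single URL above, and it returns the response body as plain text because the response format is not documented here. The names `build_url` and `fetch` are hypothetical helpers, not part of any published client.

```python
import urllib.request

# Base path copied from the curl example; that other repositories follow the
# same owner/repo pattern is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/embeddings"


def build_url(owner, repo):
    """Build the endpoint URL for one repository."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch(owner, repo, timeout=10.0):
    """Fetch the raw response body as text (format not assumed)."""
    with urllib.request.urlopen(build_url(owner, repo), timeout=timeout) as response:
        return response.read().decode("utf-8")


url = build_url("lavis-nlp", "german_legal_sentences")
print(url)
```

Calling `fetch("lavis-nlp", "german_legal_sentences")` performs the actual network request and counts against the daily request limit.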
Higher-rated alternatives
Garrafao/LSCDetection
Data Sets and Models for Evaluation of Lexical Semantic Change Detection
RepoAnalysis/RepoSim
This repository contains experiments on comparing the similarity of Python repositories using ML models.
cod3licious/simec
Similarity Encoder (SimEc) Neural Network Framework for learning low dimensional similarity...
jorge-martinez-gil/uwsd
Context-Aware Semantic Similarity Measurement for Unsupervised Word Sense Disambiguation
cr1m5onk1ng/text_similarity
An NLP library for text similarity based on Transformer models