christos42/inductive_bias_IE
An Information Extraction Study: Take In Mind the Tokenization! (official repository of the paper)
This project helps natural language processing researchers analyze how different tokenization strategies affect the performance of information extraction models, particularly for identifying relationships between entities in text. It takes textual data and various language model configurations as input, and outputs trained models and an analysis of their effectiveness. It is aimed at computational linguists and AI researchers experimenting with advanced NLP techniques.
No commits in the last 6 months.
Use this if you are a researcher studying the nuances of tokenization and its inductive bias on information extraction tasks.
Not ideal if you are looking for a plug-and-play solution for general text analysis or a simple API for common information extraction tasks.
Stars: 7
Forks: 1
Language: Shell
License: MIT
Category:
Last pushed: Oct 30, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/christos42/inductive_bias_IE"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
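The request above can be wrapped in a small script for reuse. The following is a minimal sketch, not an official client: the helper names `build_url` and `fetch_quality` are hypothetical, and the path layout (category, then owner, then repository) is an assumption inferred from the single example URL. The response format is not documented here, so the script prints the raw body unchanged.

```shell
#!/bin/sh
# Base URL copied from the curl example above.
API_BASE="https://pt-edge.onrender.com/api/v1/quality"

# build_url CATEGORY OWNER REPO
# Hypothetical helper. Assumes the category/owner/repo path layout seen in
# the one example URL also holds for other repositories.
build_url() {
  printf '%s/%s/%s/%s\n' "$API_BASE" "$1" "$2" "$3"
}

# fetch_quality CATEGORY OWNER REPO
# Hypothetical helper. --fail makes curl exit non-zero on HTTP error
# statuses (400 and above), so callers can detect a failed request.
# --silent hides the progress meter; --show-error still prints errors.
fetch_quality() {
  curl --fail --silent --show-error "$(build_url "$1" "$2" "$3")"
}
```

After sourcing the script, `fetch_quality nlp christos42 inductive_bias_IE` would issue the same request as the curl command above. Keeping URL construction separate from the network call makes the path assumption easy to check and change.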
Higher-rated alternatives
nltk/nltk
NLTK Source
explosion/spaCy
💫 Industrial-strength Natural Language Processing (NLP) in Python
undertheseanlp/underthesea
Underthesea - Vietnamese NLP Toolkit
stanfordnlp/stanza
Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many...
flairNLP/flair
A very simple framework for state-of-the-art Natural Language Processing (NLP)