christos42/inductive_bias_IE

An Information Extraction Study: Take In Mind the Tokenization! (official repository of the paper)

Experimental

This project helps natural language processing researchers analyze how different tokenization strategies affect the performance of information extraction models, particularly those that identify relationships between entities in text. It takes textual data and language model configurations as input and produces trained models and analyses of their effectiveness. It is aimed at computational linguists and AI researchers experimenting with advanced NLP techniques.

No commits in the last 6 months.

Use this if you are a researcher studying the nuances of tokenization and its inductive bias on information extraction tasks.

Not ideal if you are looking for a plug-and-play solution for general text analysis or a simple API for common information extraction tasks.

natural-language-processing information-extraction computational-linguistics relation-extraction machine-learning-research

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 4 / 25

Maturity 16 / 25

Community 9 / 25

Language

Shell

License

MIT

Higher-rated alternatives

nltk/nltk

NLTK Source

explosion/spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python

undertheseanlp/underthesea

Underthesea - Vietnamese NLP Toolkit

stanfordnlp/stanza

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many...

flairNLP/flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)
