di37/ner-electrical-engineering-dataset
This repository provides scripts and notebooks to create a Named Entity Recognition (NER) dataset tailored for the electrical engineering domain.
The project helps electrical engineers, researchers, and educators build annotated datasets for specialized text analysis. It takes plain text about electrical engineering topics and generates annotations that identify key terms and concepts. The output is a structured dataset ready for training custom natural language processing models.
No commits in the last 6 months.
Use this if you need a specialized dataset to train AI models for tasks like extracting specific information from electrical engineering documents, reports, or research papers.
Not ideal if you need a dataset for critical applications without extensive validation, as the data is generated by an LLM and may contain inaccuracies.
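Because the annotations are LLM-generated, a basic sanity pass before training can catch some of these inaccuracies. The sketch below is only an illustration: it assumes a hypothetical record layout (a `text` string plus `entities` carrying `start`, `end`, `label`, and `surface` fields) and a made-up label set. The repository's actual schema and labels may differ, so field names would need adjusting.

```python
from typing import Dict, List

# Hypothetical label set for illustration only; not taken from the repository.
ALLOWED_LABELS = {"COMPONENT", "QUANTITY", "UNIT"}


def find_annotation_errors(record: Dict) -> List[str]:
    """Return human-readable problems found in one annotated record."""
    errors = []
    text = record["text"]
    for i, ent in enumerate(record["entities"]):
        start, end = ent["start"], ent["end"]
        # Offsets must describe a non-empty span inside the text.
        if not (0 <= start < end <= len(text)):
            errors.append(f"entity {i}: offsets {start}-{end} out of range")
            continue
        # The stored surface form must match what the offsets point at.
        if text[start:end] != ent["surface"]:
            errors.append(
                f"entity {i}: expected {ent['surface']!r}, found {text[start:end]!r}"
            )
        # Labels outside the agreed set usually signal a hallucinated category.
        if ent["label"] not in ALLOWED_LABELS:
            errors.append(f"entity {i}: unknown label {ent['label']!r}")
    return errors


# Assumed example record; the last entity deliberately uses an unknown label.
record = {
    "text": "The transformer steps 11 kV down to 415 V.",
    "entities": [
        {"start": 4, "end": 15, "label": "COMPONENT", "surface": "transformer"},
        {"start": 22, "end": 24, "label": "QUANTITY", "surface": "11"},
        {"start": 25, "end": 27, "label": "UNIT", "surface": "kV"},
        {"start": 36, "end": 39, "label": "VOLTAGE", "surface": "415"},
    ],
}

errors = find_annotation_errors(record)
for problem in errors:
    print(problem)
```

A check like this only catches structural problems (bad offsets, mismatched text, unexpected labels). Whether a label is semantically right still needs human review.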
Stars: 7
Forks: n/a
Language: Jupyter Notebook
License: MIT
Category: (not listed)
Last pushed: Dec 31, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/di37/ner-electrical-engineering-dataset"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
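The same request can be made from Python using only the standard library. This is a minimal sketch under two assumptions that the page does not confirm: that the path segments after `quality/` are a category, an owner, and a repository name (inferred from the single curl example), and that the response body is JSON. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL in the same shape as the curl example."""
    # Assumption: the three path segments are category / owner / repo.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch one record; assumes (unverified) that the body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the URL does not touch the network.
url = build_quality_url("nlp", "di37", "ner-electrical-engineering-dataset")
print(url)
```

Calling `fetch_quality("nlp", "di37", "ner-electrical-engineering-dataset")` would issue the actual request. With the keyless limit of 100 requests/day, caching the result locally is a sensible default.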
Higher-rated alternatives
nltk/nltk: NLTK Source
explosion/spaCy: 💫 Industrial-strength Natural Language Processing (NLP) in Python
undertheseanlp/underthesea: Underthesea - Vietnamese NLP Toolkit
stanfordnlp/stanza: Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many...
flairNLP/flair: A very simple framework for state-of-the-art Natural Language Processing (NLP)