jdkato/prose
A Golang library for text processing, including tokenization, part-of-speech tagging, and named-entity extraction.
Archived
Built entirely in pure Go with no external dependencies, prose implements a modular NLP pipeline (tokenization → POS tagging → NE extraction) with functional options to disable stages as needed. Its sentence segmenter achieves 75% accuracy on the Golden Rules benchmark while running 4× faster than Stanford CoreNLP, and its POS tagger outperforms NLTK's implementation (96.1% vs 89.3% accuracy) on the Treebank corpus. The tokenizer handles modern text artifacts like URLs, mentions, hashtags, and emoticons as distinct tokens.
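The "functional options to disable stages" design mentioned above can be illustrated with a minimal, standard-library-only sketch. This is not prose's actual code or API: the `Doc` type, the `NewDoc` constructor, the `WithTagging` and `WithExtraction` option names, and the toy capitalization-based tagger are all hypothetical, chosen only to show how a tokenization → tagging → extraction pipeline might let callers switch stages off.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// Doc holds the output of each stage. Hypothetical type for this sketch,
// not the library's own document type.
type Doc struct {
	Tokens   []string
	Tags     []string
	Entities []string
}

// config records which optional stages run; every stage is on by default.
type config struct {
	tag     bool
	extract bool
}

// Option is a functional option that adjusts the pipeline configuration.
type Option func(*config)

// WithTagging enables or disables the (toy) tagging stage.
func WithTagging(on bool) Option { return func(c *config) { c.tag = on } }

// WithExtraction enables or disables the (toy) entity-extraction stage.
func WithExtraction(on bool) Option { return func(c *config) { c.extract = on } }

// NewDoc runs tokenization, then tagging, then extraction, skipping any
// stage the caller disabled. Assumption of this sketch: extraction relies
// on tags, so turning tagging off also skips extraction.
func NewDoc(text string, opts ...Option) Doc {
	cfg := config{tag: true, extract: true}
	for _, opt := range opts {
		opt(&cfg)
	}

	// Stage 1: whitespace tokenization (a stand-in for a real tokenizer).
	d := Doc{Tokens: strings.Fields(text)}
	if !cfg.tag {
		return d
	}

	// Stage 2: toy tagger that marks capitalized tokens as proper nouns.
	for _, tok := range d.Tokens {
		if unicode.IsUpper([]rune(tok)[0]) {
			d.Tags = append(d.Tags, "PROPN")
		} else {
			d.Tags = append(d.Tags, "OTHER")
		}
	}
	if !cfg.extract {
		return d
	}

	// Stage 3: toy extraction that treats every PROPN token as an entity.
	for i, tag := range d.Tags {
		if tag == "PROPN" {
			d.Entities = append(d.Entities, d.Tokens[i])
		}
	}
	return d
}

func main() {
	text := "Alice visited Paris yesterday"

	full := NewDoc(text)
	fmt.Println(full.Tokens, full.Tags, full.Entities)

	// Skip the last stage when only tokens and tags are needed.
	lean := NewDoc(text, WithExtraction(false))
	fmt.Println(len(lean.Tags), len(lean.Entities))
}
```

Keeping each stage behind its own option means callers who only need tokens do not pay for the later stages, which is the likely motivation for exposing the pipeline this way.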
3,069 stars. No commits in the last 6 months.
Stars: 3,069
Forks: 169
Language: Go
License: MIT
Category:
Last pushed: May 02, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/jdkato/prose"
Open to everyone: 100 requests/day with no key needed, or 1,000/day with a free key.