rekram1-node/tokenizer
Natural Language Processing (NLP) Tokenization Libary designed for English. Fast, Lean, Customizable. Tokenizes text, replaces abbreviations, replaces contractions, lowercases words, optionally you can remove stop words as well
No commits in the last 6 months.
Stars
3
Forks
—
Language
Go
License
MIT
Category
Last pushed
Feb 07, 2023
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/rekram1-node/tokenizer"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
guillaume-be/rust-tokenizers
Rust-tokenizer offers high-performance tokenizers for modern language models, including...
sugarme/tokenizer
NLP tokenizers written in Go language
elixir-nx/tokenizers
Elixir bindings for 🤗 Tokenizers
openscilab/tocount
ToCount: Lightweight Token Estimator
reinfer/blingfire-rs
Rust wrapper for the BlingFire tokenization library