taishi-i/nagisa

A Japanese tokenizer based on recurrent neural networks

/ 100

Verified

415 stars and 263,142 monthly downloads. Used by 2 other packages. Available on PyPI.

Maintenance 10 / 25

Adoption 22 / 25

Maturity 25 / 25

Community 13 / 25

Stars

415

Forks

Language

Python

License

MIT

Category

Last pushed

Feb 12, 2026

Monthly downloads

263,142

Commits (30d)

Dependencies

Reverse dependents

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/taishi-i/nagisa"

Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000/day.
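The same request can be made from Python. The following is a minimal sketch using only the standard library. It makes two assumptions not stated on this page: that the endpoint returns a JSON body, and that the path follows a category/owner/repo pattern for other projects too. The helper names `quality_url` and `fetch_quality` are illustrative, not part of any published client.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL. `repo` is in "owner/name" form.

    Assumption: other projects follow the same category/owner/name layout.
    """
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record and decode it.

    Assumption: the response body is JSON. Its fields are not documented
    on this page, so the decoded object is returned unchanged.
    """
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    data = fetch_quality("nlp", "taishi-i/nagisa")
    print(json.dumps(data, indent=2, ensure_ascii=False))
```

Unauthenticated calls count against the 100 requests/day limit, so caching the result locally is sensible when polling more than one project.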

Related tools

EmilStenstrom/conllu

A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a nested Python dictionary.

OpenPecha/Botok

🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python

zaemyung/sentsplit

A flexible sentence segmentation library using a CRF model and regex rules

natasha/razdel

Rule-based token and sentence segmentation for the Russian language

polm/cutlet

Japanese to romaji converter in Python