ku-nlp/jumanpp-jumandic

Scripts for training the Jumandic Juman++ model

Experimental

This tool helps developers working on Japanese natural language processing build a custom Juman++ model based on the Jumandic dictionary. You provide text corpora and dictionary entries, and it generates a ready-to-use Juman++ model. It is aimed at developers and NLP engineers whose applications require precise Japanese morphological analysis and text parsing.

No commits in the last 6 months.

Use this if you need to create a specialized Juman++ morphological analyzer with custom vocabulary for Japanese text processing applications.

Not ideal if you're an end-user looking for a pre-trained, ready-to-use Japanese NLP tool without custom model training.

Japanese-NLP morphological-analysis custom-dictionary text-processing language-model-training

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 4 / 25

Maturity 8 / 25

Community 0 / 25

Language: Makefile

Higher-rated alternatives

EmilStenstrom/conllu

A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a nested python dictionary.

OpenPecha/Botok

🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python

taishi-i/nagisa

A Japanese tokenizer based on recurrent neural networks

zaemyung/sentsplit

A flexible sentence segmentation library using a CRF model and regex rules

natasha/razdel

Rule-based token and sentence segmentation for the Russian language
