PyThaiNLP/attacut

A Fast and Accurate Neural Thai Word Segmenter

Score: 62 / 100 (Established)

AttaCut is built on a 3-layer dilated CNN that operates on syllable and character features. It achieves 91% word-level F1 on the BEST benchmark while running 6x faster than previous state-of-the-art approaches. It is implemented in PyTorch, offers both a command-line interface and a Python API, and supports retraining on custom datasets. Two pre-trained models are included, `attacut-sc` and `attacut-c`, which offer different accuracy-speed tradeoffs for Thai NLP pipelines.
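As a rough sketch of the Python API mentioned above, the snippet below wraps AttaCut's `Tokenizer` class in a small helper. It assumes the `attacut` package is installed from PyPI and that `Tokenizer(model=...)` and its `tokenize` method behave as described in the project README. The `segment` helper and the `KNOWN_MODELS` tuple are hypothetical names introduced here for illustration.

```python
# Hypothetical helper around AttaCut's Python API (a sketch, not the project's own code).
KNOWN_MODELS = ("attacut-sc", "attacut-c")  # the two pre-trained models named above


def segment(text, model="attacut-sc"):
    """Split Thai text into a list of words using a pre-trained AttaCut model."""
    if model not in KNOWN_MODELS:
        raise ValueError(f"unknown model {model!r}; expected one of {KNOWN_MODELS}")
    # Imported lazily so this module still loads where attacut is not installed.
    from attacut import Tokenizer  # assumed import path, per the project README

    tokenizer = Tokenizer(model=model)
    return tokenizer.tokenize(text)
```

Calling `segment("ผมรักคุณ")` should return a list of word strings. The helper builds a new tokenizer on every call, so when segmenting many texts it is probably better to construct one `Tokenizer` and reuse it.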

94 stars and 4,237 monthly downloads. Used by 1 other package. No commits in the last 6 months. Available on PyPI.

Stale: 6 months
Maintenance 0 / 25
Adoption 18 / 25
Maturity 25 / 25
Community 19 / 25
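The overall score appears to be the plain sum of the four sub-scores, each out of 25. The check below is a minimal sketch under that assumption; the page itself does not state the formula, and the `subscores` dictionary simply holds the figures listed above.

```python
# Sub-scores copied from the listing above; each is out of 25.
subscores = {"Maintenance": 0, "Adoption": 18, "Maturity": 25, "Community": 19}

# Assumption: the overall score is the unweighted sum of the sub-scores.
total = sum(subscores.values())
maximum = 25 * len(subscores)

print(f"{total} / {maximum}")  # prints "62 / 100"
```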


Stars: 94
Forks: 18
Language: Python
License: MIT
Last pushed: Jan 14, 2025
Monthly downloads: 4,237
Commits (30d): 0
Dependencies: 8
Reverse dependents: 1

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/PyThaiNLP/attacut"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
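The same request can be made from Python using only the standard library. This sketch assumes the endpoint returns JSON, which the page does not confirm. `quality_url` and `fetch_quality` are hypothetical helper names, and the `nlp` path segment is taken verbatim from the curl example.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, repo):
    """Build the endpoint URL, e.g. category='nlp', repo='PyThaiNLP/attacut'."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category, repo, timeout=10):
    """Fetch the quality record; assumes (unverified) that the body is JSON."""
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

`fetch_quality("nlp", "PyThaiNLP/attacut")` performs the network call. Given the keyless limit of 100 requests/day, caching the result locally is a sensible precaution.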