FutureComputing4AI/KiloGrams
KiloGram algorithm for finding the top-k most frequent n-grams for large values of n quickly with fixed memory.
Stars
9
Forks
4
Language
Java
License
Apache-2.0
Category
Last pushed
Oct 08, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/FutureComputing4AI/KiloGrams"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
facebookresearch/stopes
A library for preparing data for machine translation research (monolingual preprocessing,...
rkcosmos/deepcut
A Thai word tokenization library using Deep Neural Network
Droidtown/ArticutAPI
API of Articut 中文斷詞 (兼具語意詞性標記):「斷詞」又稱「分詞」,是中文資訊處理的基礎。Articut 不用機器學習,不需資料模型,只用現代白話中文語法規則,即能達到...
fukuball/jieba-php
"結巴"中文分詞:做最好的 PHP 中文分詞、中文斷詞組件。 / "Jieba" (Chinese for "to stutter") Chinese text segmentation:...
jiesutd/NCRFpp
NCRF++, a Neural Sequence Labeling Toolkit. Easy use to any sequence labeling tasks (e.g. NER,...