1000-7/xinlp
把李航老师《统计学习方法》的后几章的算法都用java实现了一遍,实现盒子与球的EM算法,扩展到去GMM训练,后来实现了HMM分词(实现了HMM分词的参数训练)和CRF分词(借用CRF++训练的参数模型),最后利用tensorFlow把BiLSTM+CRF实现了,然后为lucene包装了一个XinAnalyzer
No commits in the last 6 months.
Stars
23
Forks
11
Language
Java
License
—
Category
Last pushed
Jun 17, 2022
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/1000-7/xinlp"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
blmoistawinde/HarvestText
文本挖掘和预处理工具(文本清洗、新词发现、情感分析、实体识别链接、关键词抽取、知识抽取、句法分析等),无监督或弱监督方法
polm/unidic-py
Unidic packaged for installation via pip.
huspacy/huspacy
HuSpaCy: industrial-strength Hungarian natural language processing
BramVanroy/spacy_conll
Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and...
tanloong/neosca
L2SCA & LCA fork: cross-platform, GUI, without Java dependency