CarryChang/Real_Time_DataMining_Software

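Real-time review-mining software for Ctrip and Zhenguo homestays: a comprehensive demo covering real-time data collection, data cleaning, structured storage, UGC topic extraction, sentiment analysis, and post-structuring visualization. It is an opinion-mining project built on online homestay UGC data, combining data mining and NLP processing to handle data collection, topic extraction, and sentiment analysis. Its main aim is to overcome inconsistencies between users' ratings and their written reviews. The tool evaluates satisfaction with Ctrip and Meituan online homestays in real time, visualizes additional data, and mines and visualizes online UGC along multiple dimensions. A demo video is available at the link.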

45 / 100 (Emerging)

The pipeline uses Request-based web scraping to collect Ctrip and Meituan comments automatically. It splits multi-topic reviews into sentences using POS tagging and punctuation rules, then classifies each sentence with a topic dictionary and scores it with supervised sentiment models trained on user ratings. A Python GUI (RealTime_UGC_Analysis_GUI.py) orchestrates the workflow and stores raw data in local TXT files. It generates comparative visualizations across dimensions such as topic-specific sentiment, repeat booking rates, and temporal booking trends, which helps reconcile inconsistencies between ratings and comments.
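The segmentation and topic-dictionary steps described above can be sketched as follows. This is a minimal illustration, not the project's code: it splits on punctuation only (the POS-tagging rules are omitted), and the `TOPIC_KEYWORDS` dictionary and all function names are hypothetical.

```python
import re

# Hypothetical topic dictionary; the project's real keyword lists are not shown in this page.
TOPIC_KEYWORDS = {
    "location": ["location", "subway", "station"],
    "cleanliness": ["clean", "dirty", "dust"],
    "host": ["host", "landlord", "owner"],
}

# Clause boundaries: common Chinese and English punctuation marks.
CLAUSE_BOUNDARY = re.compile(r"[,,。.!!??;;]+")


def split_clauses(review):
    """Split a review into clauses at punctuation, dropping empty pieces."""
    return [part.strip() for part in CLAUSE_BOUNDARY.split(review) if part.strip()]


def classify_clause(clause):
    """Return every topic whose keywords appear in the clause (case-insensitive)."""
    lowered = clause.lower()
    return [
        topic
        for topic, words in TOPIC_KEYWORDS.items()
        if any(word in lowered for word in words)
    ]


def tag_review(review):
    """Pair each clause of a review with the topics it mentions."""
    return [(clause, classify_clause(clause)) for clause in split_clauses(review)]


if __name__ == "__main__":
    sample = "Great location near the subway, but the room was dirty. The host was friendly!"
    for clause, topics in tag_review(sample):
        print(topics, "<-", clause)
```

Splitting before classifying means a review that praises the location but criticizes cleanliness yields two separately labeled clauses, each of which could then be passed to a sentiment model.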

No commits in the last 6 months.

Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 9 / 25
Maturity 16 / 25
Community 20 / 25


Stars: 82
Forks: 24
Language: Python
License: Apache-2.0
Last pushed: Feb 01, 2021
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/CarryChang/Real_Time_DataMining_Software"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
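The same request can be made from Python using only the standard library. This is a sketch under assumptions: the URL pattern is inferred from the single curl example above (the `nlp` path segment is treated as a category), the response is assumed to be JSON, and the function names are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example; anything beyond that one URL is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the quality endpoint URL for a repository."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the quality record; assumes the endpoint returns a JSON body."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    data = fetch_quality("nlp", "CarryChang", "Real_Time_DataMining_Software")
    print(json.dumps(data, indent=2, ensure_ascii=False))
```

With the keyless limit of 100 requests/day, caching the result locally is advisable if the script runs repeatedly.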