Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (10): 47-57    DOI: 10.11925/infotech.2096-3467.2020.0127
Constructing Topic Graph for Weibo Users Based on LDA: Case Study of “Egypt Air Disaster”
Wang Xiwei1,2,3, Zhang Liu1, Huang Bo4, Wei Ya’nan1
1School of Management, Jilin University, Changchun 130022, China
2Research Center for Big Data Management, Jilin University, Changchun 130022, China
3Cyberspace Governance Research Center, Jilin University, Changchun 130022, China
4School of Computer Science and Technology, Jilin University, Changchun 130022, China
[Objective] This paper constructs a topic graph for Weibo users to identify the characteristics of user groups and their opinion leaders, with the aim of helping guide online public opinion and reducing monitoring costs. [Methods] First, we built a processing model for the topic graph of Weibo users based on LDA. Second, we used perplexity to determine the optimal number of topics and the users’ topic distributions. Third, we used JS divergence to measure the similarity of user topics and constructed the topic graph. Finally, we tested the proposed method on Weibo data about the “Egypt air disaster”. [Results] The topic graph generated with LDA clustered users by topic and identified the opinion leaders. [Limitations] More research is needed to determine the optimal number of LDA topics. [Conclusions] The proposed method helps identify the characteristics of different topic groups and their opinion leaders.
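The JS divergence step in the Methods can be illustrated with a minimal sketch. It computes the Jensen-Shannon divergence between two users' topic distributions and links users whose divergence falls below a cutoff. The cutoff rule, the `threshold` value, the function names, and the three example users are assumptions made for illustration; the abstract does not state how edges were derived from the divergence values.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, in [0, 1] with log base 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def build_edges(user_topics, threshold):
    """Link two users when their JS divergence is below `threshold`.

    The thresholding rule is an assumption, not the paper's stated method.
    """
    users = sorted(user_topics)
    edges = []
    for i, u in enumerate(users):
        for v in users[i + 1:]:
            d = js_divergence(user_topics[u], user_topics[v])
            if d < threshold:
                edges.append((u, v, d))
    return edges

# Made-up document-topic distributions for three hypothetical users.
user_topics = {
    "user_a": [0.7, 0.2, 0.1],
    "user_b": [0.6, 0.3, 0.1],
    "user_c": [0.1, 0.1, 0.8],
}
edges = build_edges(user_topics, threshold=0.1)
```

With these made-up inputs only `user_a` and `user_b` are linked, since both concentrate on the first topic while `user_c` concentrates on the third.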

Key words: LDA; Weibo User; Topic Map
Received: 22 February 2020      Published: 10 July 2020
Chinese Library Classification (ZTFLH): TP393
Wang Xiwei, Zhang Liu, Huang Bo, Wei Ya’nan. Constructing Topic Graph for Weibo Users Based on LDA: Case Study of “Egypt Air Disaster”. Data Analysis and Knowledge Discovery, 2020, 4(10): 47-57.

Processing Model of Weibo User Topic Map Based on LDA
Baidu Index of “Egypt Air Disaster”
Perplexity-Topic Line Chart
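As a rough illustration of how a perplexity-versus-topic-number curve can be produced, the sketch below computes perplexity as the exponential of the negative average per-token log-likelihood and then picks the candidate topic number with the lowest score. The helper name `perplexity`, the toy inputs, and the placeholder scores are assumptions for illustration; choosing the minimum is a common convention, and the page does not state the exact selection rule used.

```python
import math

def perplexity(docs, theta, phi):
    """Perplexity of a fitted topic model on `docs`.

    docs  : list of documents, each a list of word ids
    theta : theta[d][k], probability of topic k in document d
    phi   : phi[k][w], probability of word w under topic k
    """
    log_lik = 0.0
    n_tokens = 0
    for d, words in enumerate(docs):
        for w in words:
            # Word probability marginalised over topics.
            p_w = sum(theta[d][k] * phi[k][w] for k in range(len(phi)))
            log_lik += math.log(p_w)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)

# Toy check: one topic, uniform over a 4-word vocabulary.
toy_score = perplexity(docs=[[0, 1, 2, 3]], theta=[[1.0]], phi=[[0.25] * 4])

# Placeholder scores for candidate topic numbers (not values from the paper).
scores = {2: 4.0, 3: 3.5, 4: 3.8}
best_k = min(scores, key=scores.get)
```

For the uniform toy model the perplexity equals the vocabulary size, which is a convenient sanity check before scoring real models.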
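Topic 0   Prob.  Topic 1      Prob.  Topic 2      Prob.  Topic 3    Prob.  Topic 4                 Prob.  Topic 5   Prob.  Topic 6         Prob.
Ethiopia  0.042  grounding    0.038  photo        0.028  release    0.027  press conference        0.044  Boeing    0.062  family members  0.019
pilot     0.031  malfunction  0.032  passport     0.026  video      0.024  China Eastern Airlines  0.036  system    0.039  lawsuit         0.017
publish   0.021  release      0.016  accident     0.016  condition  0.021  Air China               0.036  defect    0.039  victims         0.016
crash     0.022  China        0.013  information  0.016  rummage    0.020  belongings              0.030  airliner  0.038  deceased        0.015
publish   0.021  global       0.012  employee     0.015  maneuver   0.019  girl                    0.020  aircraft  0.037  remains         0.015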
Topic High Frequency Word Distribution
Random Weibo Users’ Document-Topic Distribution
User Topic Map of “Egypt Air Disaster”
Document-Topic Average Probability
The Number of Weibo Users and Authenticated Users
User Node Distribution and Opinion Leader Identification in Topic 3
No.  User node  Degree centrality
1 凤凰网视频 62
2 安徽反邪教 54
3 S丶Rachel 53
4 高庆一 49
5 时间国际视频 26
6 眉山残联 26
7 快科技2018 22
8 火勺看点 20
9 新浪天津 11
10 潘清華005 5
User Degree Centrality in Topic 3 (Top 10)
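A ranking like the one above can be produced from an edge list with a few lines of code. The sketch below assumes an undirected graph and unnormalized degree, meaning a count of edges per node, which is consistent with the integer values in the table but is not confirmed by the page. The function names and the example edges are hypothetical.

```python
from collections import Counter

def degree_counts(edges):
    """Count how many edges touch each node in an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def top_nodes(edges, k):
    """Return the k highest-degree nodes; ties are broken by node name."""
    deg = degree_counts(edges)
    return sorted(deg.items(), key=lambda kv: (-kv[1], kv[0]))[:k]

# Made-up edges within one hypothetical topic group.
edges = [("hub", "n1"), ("hub", "n2"), ("hub", "n3"), ("n1", "n2")]
ranking = top_nodes(edges, k=2)
```

Nodes at the head of such a ranking are the natural candidates for opinion leaders within a topic group.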
