Constructing Topic Graph for Weibo Users Based on LDA: Case Study of “Egypt Air Disaster”
Wang Xiwei1,2,3, Zhang Liu1, Huang Bo4, Wei Ya’nan1
1 School of Management, Jilin University, Changchun 130022, China
2 Research Center for Big Data Management, Jilin University, Changchun 130022, China
3 Cyberspace Governance Research Center, Jilin University, Changchun 130022, China
4 School of Computer Science and Technology, Jilin University, Changchun 130022, China
[Objective] This paper constructs a topic graph for Weibo users to identify the characteristics of user groups and their opinion leaders, with the aim of supporting the guidance of online public opinion and reducing monitoring costs. [Methods] First, we built a processing model for the topic graph of Weibo users based on LDA. Second, we used perplexity to determine the optimal number of topics and the users’ topic distribution. Third, we measured the similarity of user topics with JS divergence and constructed the topic graph. Finally, we examined the proposed method on data from the “Egypt air disaster”. [Results] The topic graph built with LDA clustered the user topics and identified the opinion leaders. [Limitations] More research is needed on how to determine the optimal number of LDA topics. [Conclusions] The proposed method helps identify the characteristics of different topic groups and their opinion leaders.
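The similarity step in the methods can be illustrated with a minimal sketch. It assumes each user is already represented by an LDA topic distribution and links two users when the JS divergence between their distributions is small. The toy distributions, the `max_divergence` cutoff, and the function names are illustrative assumptions, not values or code from the paper.

```python
import math
from itertools import combinations


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and within [0, 1] for log base 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def build_topic_graph(user_topics, max_divergence=0.1):
    """Return edges (user_u, user_v, divergence) for sufficiently similar users.

    max_divergence is a hypothetical cutoff chosen for this example only.
    """
    edges = []
    for (u, p), (v, q) in combinations(sorted(user_topics.items()), 2):
        d = js_divergence(p, q)
        if d <= max_divergence:
            edges.append((u, v, d))
    return edges


# Made-up topic distributions over three topics, one per user.
user_topics = {
    "user_a": [0.80, 0.10, 0.10],
    "user_b": [0.75, 0.15, 0.10],
    "user_c": [0.05, 0.05, 0.90],
}
edges = build_topic_graph(user_topics)
for u, v, d in edges:
    print(f"{u} -- {v}: JS divergence {d:.4f}")
```

With these toy inputs only `user_a` and `user_b` are linked, since both concentrate on the first topic while `user_c` concentrates on the third. A lower divergence means more similar topic interests, so the cutoff controls how dense the resulting graph is.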
Wang Xiwei, Zhang Liu, Huang Bo, Wei Ya’nan. Constructing Topic Graph for Weibo Users Based on LDA: Case Study of “Egypt Air Disaster”[J]. Data Analysis and Knowledge Discovery, 2020, 4(10): 47-57.