Clustering User Groups of Public Opinion Events from Multi-dimensional Social Network
Wang Xiwei1,2,3, Jia Ruonan1, Wei Yanan1, Zhang Liu1
1 School of Management, Jilin University, Changchun 130022, China
2 Research Center for Big Data Management, Jilin University, Changchun 130022, China
3 Cyberspace Governance Research Center, Jilin University, Changchun 130022, China
[Objective] User groups are the main units through which public opinion spreads. This study uses clustering to identify the characteristics of user groups, which could help social network companies provide better services. [Methods] Drawing on Group Theory, we clustered users along three dimensions: influence, sentiment, and behavior. First, we collected user data from Sina Weibo. Then, we used the Canopy and K-Means algorithms to cluster users. Finally, we visualized the findings with Neo4j and Weka. [Results] Within the same public opinion event, user groups differed in sentiment, influence, and behavior, while user groups from different public opinion events shared common characteristics. [Limitations] Both public opinion events in this study took place at Chinese universities, and we collected data only from Sina Weibo. [Conclusions] The clustering results could be used to propose effective administration strategies for each user group, both within a single public opinion event and across different events.
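The Canopy and K-Means combination named in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation (the abstract indicates Weka was used); it assumes the common pattern in which Canopy pre-clustering determines the number of clusters and the initial centers, and K-Means then refines them. The feature vectors (influence, sentiment, activity, each scaled to [0, 1]), the thresholds `t1` and `t2`, and all function names are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def canopy(points, t1, t2):
    """Canopy pre-clustering with a loose threshold t1 and a tight threshold t2.

    Returns a list of (center, members) pairs. Canopies may overlap.
    """
    if not t1 > t2 > 0:
        raise ValueError("thresholds must satisfy t1 > t2 > 0")
    candidates = list(points)
    canopies = []
    while candidates:
        center = candidates[0]
        # Every point within the loose threshold joins this canopy.
        members = [p for p in points if dist(center, p) <= t1]
        canopies.append((center, members))
        # Points within the tight threshold can no longer seed a canopy.
        candidates = [p for p in candidates if dist(center, p) > t2]
    return canopies


def assign(points, centers):
    """Index of the nearest center for each point."""
    return [min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points]


def kmeans(points, centers, max_iter=100):
    """Lloyd's K-Means started from the given initial centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        updated = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old center if a cluster ends up empty.
            updated.append(
                tuple(sum(col) / len(members) for col in zip(*members))
                if members else old
            )
        if updated == centers:
            break
        centers = updated
    return assign(points, centers), centers


# Hypothetical users: (influence, sentiment, activity), each scaled to [0, 1].
users = [
    (0.90, 0.80, 0.70), (0.85, 0.75, 0.80), (0.95, 0.70, 0.75),
    (0.10, 0.20, 0.10), (0.15, 0.10, 0.20), (0.05, 0.15, 0.15),
    (0.50, 0.90, 0.10), (0.45, 0.95, 0.15),
]

# Assumed thresholds; in practice they would be tuned to the data.
initial_centers = [center for center, _ in canopy(users, t1=0.6, t2=0.3)]
labels, centers = kmeans(users, initial_centers)
print(len(centers), labels)
```

In this sketch the number of canopies fixes k, so no separate choice of k is needed; the thresholds take over that role and have to be chosen with care, since a smaller `t2` yields more canopies and therefore more clusters.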
Wang Xiwei, Jia Ruonan, Wei Yanan, Zhang Liu. Clustering User Groups of Public Opinion Events from Multi-dimensional Social Network[J]. Data Analysis and Knowledge Discovery, 2021, 5(6): 25-35.