Predicting Retweets of Government Microblogs with Deep-combined Features

doi:10.11925/infotech.2096-3467.2019.0720

Data Analysis and Knowledge Discovery

2020, Vol. 4, Issue (2/3): 18-28

Xu Yuemei, Liu Yunwen, Cai Lianqiao

School of Information Science and Technology, Beijing Foreign Studies University, Beijing 100089, China

Abstract

[Objective] This paper predicts the number of retweets of government microblogs and identifies the features that most affect retweeting and public opinion. [Methods] First, we used a Convolutional Neural Network (CNN) and a Gradient Boosting Decision Tree (GBDT) to combine user, time and content features. Then, we predicted the retweet numbers of government microblogs. Finally, we ranked the importance of each feature to find the one that matters most for retweets. [Results] The proposed model raised the accuracy of retweet prediction to 0.933. The semantic feature of the microblog text was the most important feature. [Limitations] We did not study the impact of indirect retweeting behaviors. [Conclusions] The CNN-GBDT model with deep-combined features can effectively predict retweets of government microblogs.
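
As an illustration of how an accuracy figure such as the 0.933 reported above is typically derived, the following minimal Python sketch computes accuracy, precision, recall and F1 from the counts of a binary confusion matrix. The function name `classification_metrics` and the counts are hypothetical and are not taken from the paper, whose class definitions and evaluation protocol may differ.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute standard metrics from binary confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives
    and true negatives. Empty denominators yield 0.0 instead of raising.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, for illustration only (not the paper's data).
metrics = classification_metrics(tp=80, fp=10, fn=20, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For a multi-class retweet-scale setting, the same quantities would usually be computed per class and then averaged; which averaging scheme the paper used is not stated in the abstract.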

Key words: Government Microblogs; Retweeting Scale Prediction; Convolutional Neural Network; Text Classification

Received: 20 June 2019 Published: 26 April 2020

ZTFLH: TP393

Corresponding Author: Yuemei Xu (E-mail: xuyuemei@bfsu.edu.cn)

Cite this article:

Xu Yuemei, Liu Yunwen, Cai Lianqiao. Predicting Retweets of Government Microblogs with Deep-combined Features. Data Analysis and Knowledge Discovery, 2020, 4(2/3): 18-28.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2019.0720 OR https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2020/V4/I2/3/18

Flowchart of Retweeting Scale Prediction of Government Microblogs Based on Deep-combined Features

The Procedure of Microblogs Text Semantic Calculation Based on CNN

Examples of the Raw Dataset

Parameter Settings of CNN

Examples of Input Dataset in the Retweeting-Scale-Prediction Model

Confusion Matrix

Experiment Results

Accuracy of the Four Algorithms

Recall of the Four Algorithms

Precision of the Four Algorithms

F1-value of the Four Algorithms

Performance of GBDT Model Using Different Feature Settings

Performance of SVM Model Using Different Feature Settings

Importance Ranking of Different Features Measured by GBDT
