Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (7): 28-37    DOI: 10.11925/infotech.2096-3467.2020.0324
Research on Public Policy Support Based on Character-level CNN Technology
Qiu Erli1, He Hongwei2, Yi Chengqi1, Li Huiying1
1Big Data Development Department, State Information Center, Beijing 100045, China
2Department of Information Management, Peking University, Beijing 100871, China
Abstract  

[Objective] This paper proposes a sentiment classification index for Internet users that is better suited to public policy evaluation, and explores an automatic method for detecting Internet users’ stance based on deep learning. [Methods] Three important public policies of different types and from different fields were selected as research objects. After collecting, cleaning and labeling the related Sina Weibo data, the paper analyzed online support for the three policies and built a text classification model based on a character-level convolutional neural network (CNN). It also compared and interpreted the effectiveness and efficiency of the experimental results. [Results] The model performed well on accuracy and recall across the three datasets: two datasets reached an F1 value above 0.8 and one reached an F1 value above 0.6. The model also trained much faster than the recurrent neural network (RNN) model, with training times differing by a factor of several dozen. [Limitations] The data sample size and policy coverage are limited, and the method for calculating Internet users’ support needs further study. [Conclusions] The stance classification method and the character-level CNN perform well in both effectiveness and efficiency for public policy evaluation, and may be especially useful for evaluating emergency policies.

Key words: Public Policy; Stance Detection; Convolutional Neural Network; Weibo; Big Data
Received: 16 April 2020      Published: 25 July 2020
ZTFLH:  TP391  
Corresponding Authors: Qiu Erli     E-mail: qiuerli@sic.gov.cn

Cite this article:

Qiu Erli,He Hongwei,Yi Chengqi,Li Huiying. Research on Public Policy Support Based on Character-level CNN Technology. Data Analysis and Knowledge Discovery, 2020, 4(7): 28-37.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2020.0324     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2020/V4/I7/28

Research Route
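| Policy ID | Policy name | Issuing level | Policy type | Release time | Collection window | Data volume |
|---|---|---|---|---|---|---|
| A | Children of front-line medical workers fighting COVID-19 in Hubei receive 10 bonus points on the senior high school entrance examination | Local | Emergency policy (emergency management) | 2020-2-18, around 12:00 | Within 3.5 hours of release | 4 672 posts |
| B | 2019 Labor Day holiday extended from 1 day to 4 days | National | Short-term policy (people's livelihood) | 2019-3-22 | Within 4 days of release | 18 697 posts |
| C | "Outline Development Plan for the Guangdong-Hong Kong-Macao Greater Bay Area" | National | Medium- and long-term major policy (economy) | 2019-2-18 | Within 4 days of release | 16 045 posts |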
Description of the Three Policy Datasets
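| Category | Policy A count | Policy A share | Policy B count | Policy B share | Policy C count | Policy C share |
|---|---|---|---|---|---|---|
| Support | 426 | 9.12% | 10 888 | 58.23% | 3 882 | 24.19% |
| Oppose | 3 360 | 71.92% | 1 668 | 8.92% | 88 | 0.55% |
| Neutral | 444 | 9.50% | 3 036 | 16.24% | 7 283 | 45.39% |
| Irrelevant | 361 | 7.73% | 1 309 | 7.00% | 3 118 | 19.43% |
| Empty | 81 | 1.73% | 1 796 | 9.61% | 1 674 | 10.43% |
| Internet users' support degree | 15.85 | | 78.17 | | 81.02 | |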
Labeling Results of the Three Policy Datasets
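The shares in the labeling table can be rechecked directly from the counts. The snippet below is a minimal sketch of that arithmetic; the dictionary layout and the function name `shares` are illustrative choices, not taken from the paper. It does not reproduce the "Internet users' support degree" row, because the formula for that index is not given on this page.

```python
# Labeled post counts per policy, copied from the labeling table.
COUNTS = {
    "A": {"support": 426, "oppose": 3360, "neutral": 444, "irrelevant": 361, "empty": 81},
    "B": {"support": 10888, "oppose": 1668, "neutral": 3036, "irrelevant": 1309, "empty": 1796},
    "C": {"support": 3882, "oppose": 88, "neutral": 7283, "irrelevant": 3118, "empty": 1674},
}


def shares(counts):
    """Return each label's share of a policy's posts, in percent (2 decimals)."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


for policy, counts in COUNTS.items():
    # Print the dataset size and the per-label percentages for each policy.
    print(policy, sum(counts.values()), shares(counts))
```

The totals produced this way (4 672, 18 697 and 16 045) match the data volumes listed for policies A, B and C.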
Sentiment Classification Model for Public Policy Comments Based on the Character-level CNN
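| Layer | Type | Parameter | Setting |
|---|---|---|---|
| 1 | Reconstruction layer (mapping) | Character table size | 5 000 |
| | | Sequence length | 600 |
| | | Embedding (word-vector) dimension | 64 |
| 2 | Convolutional layer | Number of kernels | 256 |
| | | Kernel length | 2 |
| | | Stride | 1 |
| | | Zero padding | 1 |
| 3 | Max pooling layer | Pooling filter length | 1 |
| | | Stride | all (text length) |
| 4 | Convolutional layer | Number of kernels | 256 |
| | | Kernel length | 3 |
| | | Stride | 1 |
| | | Zero padding | 1 |
| 5 | Max pooling layer | Pooling filter length | 1 |
| | | Stride | all (text length) |
| 6 | Convolutional layer | Number of kernels | 256 |
| | | Kernel length | 4 |
| | | Stride | 1 |
| | | Zero padding | 1 |
| 7 | Max pooling layer | Pooling filter length | 1 |
| | | Stride | all (text length) |
| 8 | Fully connected layer | Number of neurons | 128 |
| 9 | Dropout layer | Drop probability | 0.5 |
| 10 | Fully connected layer | Number of neurons | 128 |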
Parameter Settings for Each Layer of the Model
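To make the first rows of the parameter table concrete, the following sketch shows one way a post could be turned into a fixed-length sequence of character ids, and how long the output of each convolution would be. It is an illustration under stated assumptions, not the authors' code: the helper names, the toy corpus, the choice of id 0 for padding and unknown characters, and the reading of "zero padding 1" as one zero on each side are all assumptions.

```python
from collections import Counter

VOCAB_SIZE = 5000  # character table size from the parameter table
SEQ_LEN = 600      # fixed sequence length from the parameter table
PAD_ID = 0         # assumption: id 0 is shared by padding and unknown characters


def build_vocab(texts, size=VOCAB_SIZE):
    """Map the most frequent characters to ids 1..size-1 (id 0 stays reserved)."""
    freq = Counter(ch for text in texts for ch in text)
    return {ch: i + 1 for i, (ch, _) in enumerate(freq.most_common(size - 1))}


def encode(text, vocab, seq_len=SEQ_LEN):
    """Truncate or right-pad a text to seq_len character ids."""
    ids = [vocab.get(ch, PAD_ID) for ch in text[:seq_len]]
    return ids + [PAD_ID] * (seq_len - len(ids))


def conv_out_len(n, kernel, stride=1, pad=1):
    """Output length of a 1-D convolution with `pad` zeros on each side."""
    return (n + 2 * pad - kernel) // stride + 1


# Hypothetical two-post corpus, used only to exercise the helpers.
corpus = ["支持这个政策", "不支持"]
vocab = build_vocab(corpus)
x = encode(corpus[0], vocab)
print(len(x))

for k in (2, 3, 4):
    # Kernel lengths 2, 3 and 4 are the ones listed for layers 2, 4 and 6.
    print(k, conv_out_len(SEQ_LEN, k))
```

Working on characters rather than segmented words avoids a Chinese word-segmentation step. Because each pooling layer takes the maximum over the whole text length, it leaves one value per kernel, that is, 256 values per convolution, regardless of the convolution's output length.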
Classification Effectiveness Due to Different Numbers of Iterations
Contrast in Final Application Between CNN and RNN (F1)
Contrast in Final Application Between CNN and RNN (AUC)
Training Time Between CNN and RNN
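The F1 value used in the CNN and RNN comparison is the harmonic mean of precision and recall. The sketch below computes it for a single stance class; the function name and the toy labels are hypothetical, and whether the paper reports per-class or averaged F1 is not stated on this page.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall and F1 for one class treated as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical toy labels, not data from the paper.
y_true = ["support", "oppose", "neutral", "support", "oppose", "support"]
y_pred = ["support", "oppose", "support", "support", "neutral", "oppose"]

p, r, f1 = precision_recall_f1(y_true, y_pred, positive="support")
print(round(p, 3), round(r, 3), round(f1, 3))
```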