Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (7): 28-37    DOI: 10.11925/infotech.2096-3467.2020.0324
Research on Public Policy Support Based on Character-level CNN Technology
Qiu Erli1, He Hongwei2, Yi Chengqi1, Li Huiying1
1Big Data Development Department, State Information Center, Beijing 100045, China
2Department of Information Management, Peking University, Beijing 100871, China
[Objective] This paper proposes a sentiment classification index for Internet users that is better suited to public policy evaluation, and explores an automatic method for detecting Internet users' stances based on deep learning. [Methods] Three important public policies of different types and from different fields were selected as research objects. After collecting, cleaning, and labeling the related Sina Weibo data, the paper analyzed online support for the three policies and constructed a text classification model based on a character-level convolutional neural network (CNN). It also compared and interpreted the experimental results in terms of effectiveness and efficiency. [Results] The model performed well on accuracy and recall across the three datasets: two datasets reached F1 values above 0.8 and one reached an F1 value above 0.6. The model also trained far faster than a recurrent neural network (RNN) model, with training times differing by a factor of dozens. [Limitations] The data sample size and policy coverage are limited, and the method for calculating Internet users' support needs further study. [Conclusions] The stance classification method and the character-level CNN perform well in both effectiveness and efficiency for public policy evaluation, and may be especially useful for evaluating emergency policies.

Key words: Public Policy; Stance Detection; Convolutional Neural Network; Weibo; Big Data
Received: 16 April 2020      Published: 25 July 2020
Chinese Library Classification (ZTFLH): TP391
Qiu Erli, He Hongwei, Yi Chengqi, Li Huiying. Research on Public Policy Support Based on Character-level CNN Technology. Data Analysis and Knowledge Discovery, 2020, 4(7): 28-37.

Research Route
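Policy ID | Policy name | Issuing level | Policy type | Release date | Collection window | Data volume
A | Hubei: 10 bonus points on the senior high school entrance examination for children of frontline medical workers fighting COVID-19 | Local | Emergency policy | | Within 3.5 hours of release | 4,672 posts
B | 2019 Labor Day holiday extended from 1 day to 4 days | National | Short-term policy | 2019-3-22 | Within 4 days of release | 18,697 posts
C | Outline Development Plan for the Guangdong-Hong Kong-Macao Greater Bay Area | National | Medium- and long-term major policy | 2019-2-18 | Within 4 days of release | 16,045 posts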
Description of the Three Policy Datasets
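Category | Policy A count | Policy A share | Policy B count | Policy B share | Policy C count | Policy C share
Support | 426 | 9.12% | 10,888 | 58.23% | 3,882 | 24.19%
Oppose | 3,360 | 71.92% | 1,668 | 8.92% | 88 | 0.55%
Neutral | 444 | 9.50% | 3,036 | 16.24% | 7,283 | 45.39%
Irrelevant | 361 | 7.73% | 1,309 | 7.00% | 3,118 | 19.43%
Null | 81 | 1.73% | 1,796 | 9.61% | 1,674 | 10.43%
Internet users' support: Policy A 15.85; Policy B 78.17; Policy C 81.02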
Labeling Results of the Three Policy Datasets
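As a consistency check on the labeling table, the short Python sketch below recomputes each label's share from the raw counts. The dictionary layout and the function name `shares` are illustrative choices, not part of the paper. The Internet users' support index is not recomputed, because its formula is not given on this page.

```python
# Label counts copied from the labeling table (Policies A, B, C).
counts = {
    "A": {"support": 426, "oppose": 3360, "neutral": 444, "irrelevant": 361, "null": 81},
    "B": {"support": 10888, "oppose": 1668, "neutral": 3036, "irrelevant": 1309, "null": 1796},
    "C": {"support": 3882, "oppose": 88, "neutral": 7283, "irrelevant": 3118, "null": 1674},
}


def shares(label_counts):
    """Return each label's share of the policy's total, in percent (2 decimals)."""
    total = sum(label_counts.values())
    return {label: round(100 * n / total, 2) for label, n in label_counts.items()}


# Per-policy totals should match the data volumes in the dataset description table.
totals = {policy: sum(c.values()) for policy, c in counts.items()}
policy_shares = {policy: shares(c) for policy, c in counts.items()}

for policy in counts:
    print(policy, totals[policy], policy_shares[policy])
```

The totals come out to 4,672, 18,697, and 16,045 posts, which agree with the data volumes reported for the three datasets.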
Sentiment Classification Model for Public Policy Comments Based on the Character-level CNN
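To make the "character-level" input concrete, here is a minimal Python sketch of how Weibo posts might be turned into fixed-length id sequences before they reach the network. The vocabulary cap of 5,000 and the sequence length of 600 come from the model's parameter settings. The padding and unknown-character ids, the right-padding scheme, the function names, and the toy posts are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter

PAD_ID = 0  # assumed id used to pad short texts
UNK_ID = 1  # assumed id for characters outside the vocabulary


def build_vocab(texts, max_size=5000):
    """Map the most frequent characters to integer ids, capped at max_size entries."""
    freq = Counter(ch for text in texts for ch in text)
    # Two ids are reserved for padding and unknown characters.
    most_common = [ch for ch, _ in freq.most_common(max_size - 2)]
    return {ch: i + 2 for i, ch in enumerate(most_common)}


def encode(text, vocab, seq_len=600):
    """Return exactly seq_len ids: truncate long texts, right-pad short ones."""
    ids = [vocab.get(ch, UNK_ID) for ch in text[:seq_len]]
    return ids + [PAD_ID] * (seq_len - len(ids))


# Toy posts ("support this policy", "don't really support"), not from the paper's data.
posts = ["支持这个政策", "不太支持"]
vocab = build_vocab(posts)
encoded = [encode(p, vocab) for p in posts]
print(len(vocab), len(encoded[0]), encoded[1][:6])
```

Working on characters rather than segmented words avoids a word segmentation step, which is one common motivation for character-level models on short Chinese social media text.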
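Layer | Type | Parameter | Setting
1 | Reshape (mapping) layer | Character vocabulary size | 5,000
1 | Reshape (mapping) layer | Sequence length | 600
1 | Reshape (mapping) layer | Embedding dimension | 64
2 | Convolutional layer | Number of kernels | 256
2 | Convolutional layer | Kernel length | 2
2 | Convolutional layer | Stride | 1
2 | Convolutional layer | Zero padding | 1
3 | Max pooling layer | Pooling filter length | 1
3 | Max pooling layer | Stride | all (text length)
4 | Convolutional layer | Number of kernels | 256
4 | Convolutional layer | Kernel length | 3
4 | Convolutional layer | Stride | 1
4 | Convolutional layer | Zero padding | 1
5 | Max pooling layer | Pooling filter length | 1
5 | Max pooling layer | Stride | all (text length)
6 | Convolutional layer | Number of kernels | 256
6 | Convolutional layer | Kernel length | 4
6 | Convolutional layer | Stride | 1
6 | Convolutional layer | Zero padding | 1
7 | Max pooling layer | Pooling filter length | 1
7 | Max pooling layer | Stride | all (text length)
8 | Fully connected layer | Number of neurons | 128
9 | Dropout layer | Drop probability | 0.5
10 | Fully connected layer | Number of neurons | 128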
Parameter Settings for Each Layer of the Model
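The settings above imply some simple shape arithmetic, sketched below in Python. The output-length formula is the standard one for a 1-D convolution. Three points are assumptions rather than statements from the paper: that "zero padding 1" means one zero on each side of the sequence, that each convolutional layer reads the 600 x 64 embedded sequence directly (parallel branches), and that the pooled outputs are concatenated. The table lists the layers in order and does not state how they are connected.

```python
def conv1d_out_len(in_len, kernel, stride=1, pad=0):
    """Output length of a 1-D convolution with `pad` zeros added on each side."""
    return (in_len + 2 * pad - kernel) // stride + 1


def conv1d_params(in_channels, out_channels, kernel):
    """Trainable weights plus one bias per output channel."""
    return in_channels * out_channels * kernel + out_channels


# Values taken from the parameter table.
VOCAB_SIZE, SEQ_LEN, EMBED_DIM, N_KERNELS = 5000, 600, 64, 256

branches = {}
for kernel in (2, 3, 4):
    branches[kernel] = {
        "out_len": conv1d_out_len(SEQ_LEN, kernel, stride=1, pad=1),
        "params": conv1d_params(EMBED_DIM, N_KERNELS, kernel),
    }

embedding_params = VOCAB_SIZE * EMBED_DIM
# Max pooling over the whole text keeps one value per kernel, so under the
# parallel-branch assumption the concatenated feature vector has this size.
pooled_features = N_KERNELS * len(branches)

print(branches, embedding_params, pooled_features)
```

Under these assumptions the three convolutions produce sequences of length 601, 600, and 599, and pooling over the full text length reduces each to 256 values regardless of those lengths.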
Classification Effectiveness Due to Different Numbers of Iterations
Contrast in Final Application Between CNN and RNN (F1)
Contrast in Final Application Between CNN and RNN (AUC)
Training Time Between CNN and RNN
