Data Analysis and Knowledge Discovery  2023, Vol. 7 Issue (4): 101-113    DOI: 10.11925/infotech.2096-3467.2022.0379
Method for Automatically Generating Online Comments
Liu Xinran1,2, Xu Yabin1,2, Li Jixian3
1Beijing Key Laboratory of Network Culture and Digital Communication, Beijing University of Information Science and Technology, Beijing 100101, China
2School of Computer Science, Beijing University of Information Science and Technology, Beijing 100101, China
3School of Humanities and Education, Beijing Open University, Beijing 100081, China
Abstract  

[Objective] This paper proposes a Temporal Sequence Generative Adversarial Network (T-SeqGAN) for automatically generating online comments, aiming to counteract malicious information on social networks and guide public opinion in the right direction. [Methods] First, we modified the generator of the Sequence Generative Adversarial Network (SeqGAN) into a Seq2Seq structure, using a bidirectional gated recurrent unit (BiGRU) and a temporal convolutional network (TCN) as the backbone networks of the encoder and decoder, respectively. This improves the similarity of syntactic structure and semantic features between the generated comments and real online comments. Then, we replaced the discriminator of SeqGAN with a model combining TCN and attention-mechanism layers to improve the fluency of the generated comments. [Results] Compared with the baseline models, the comments generated by the proposed model achieved significantly higher BLEU-2 (0.79935), BLEU-3 (0.60396), BLEU-4 (0.47642), and KenLM (-27.67029) scores, as well as a lower PPL (0.75247) score. [Limitations] The vocabulary and language style of the generated comments are constrained by the real comments used for training, which limits the applicability of the method. [Conclusions] The comments generated by the proposed model are more correct syntactically and grammatically and more similar to real-world comments, which can help guide public opinion on social networks in the right direction.
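The TCN components used in the generator's decoder and the discriminator are built from dilated causal convolutions. The following is a minimal plain-Python sketch of a single dilated causal convolution layer, not the authors' implementation; the kernel size, dilations, and weights are illustrative:

```python
def causal_conv1d(x, weights, dilation=1):
    """Dilated causal 1-D convolution: output at step t sees only x[<= t].

    x: list of floats (input sequence)
    weights: kernel taps [w0, w1, ...]; w0 multiplies x[t], w1 multiplies
             x[t - dilation], and so on. Out-of-range taps are zero-padded.
    """
    out = []
    for t in range(len(x)):
        y = 0.0
        for k, w in enumerate(weights):
            j = t - k * dilation
            if j >= 0:
                y += w * x[j]
        out.append(y)
    return out

# Stacking layers with dilations 1, 2, 4, ... grows the receptive field
# exponentially, letting a TCN cover long comments with few layers.
x = [1.0, 0.0, 0.0, 0.0, 0.0]          # unit impulse reveals the receptive field
h = causal_conv1d(x, [0.5, 0.5], dilation=1)   # receptive field 2
h = causal_conv1d(h, [0.5, 0.5], dilation=2)   # receptive field 4
```

Because every output depends only on current and past inputs, the same block can serve both as an autoregressive decoder and as a sequence encoder for the discriminator.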

Key words: Social Network Comment Posts; SeqGAN; TCN; Seq2Seq
Received: 21 April 2022      Published: 07 June 2023
CLC number: TP391
Fund: National Natural Science Foundation of China (61672101); Key Laboratory of Network Culture and Digital Communication of Beijing (ICCD XN004); Key Laboratory of Information Network Security of the Ministry of Public Security (C18601)
Corresponding Author: Xu Yabin, ORCID: 0000-0003-2727-3773, E-mail: xyb@bistu.edu.cn

Cite this article:

Liu Xinran, Xu Yabin, Li Jixian. Method for Automatically Generating Online Comments. Data Analysis and Knowledge Discovery, 2023, 7(4): 101-113.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2022.0379     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2023/V7/I4/101

Overall Architecture of T-SeqGAN Model
Encoder Structure Diagram
Decoder Structure
Discriminator Structure
Generative Model    BLEU-2     BLEU-3     BLEU-4     PPL        KenLM
NMT                 0.70115    0.53068    0.41885    0.78167    -25.57632
CNN2CNN             0.69300    0.47664    0.32888    0.79153    -32.77272
VAE                 0.58518    0.20887    0.09822    0.84417    -30.04538
T2T                 0.75025    0.54947    0.41534    0.77286    -31.45831
B2T                 0.79599    0.60105    0.47488    0.75372    -28.36901
Evaluation Index Values of Posts Generated by Seq2Seq Models
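The BLEU-n columns above measure n-gram overlap between generated and reference comments. A simplified single-sentence illustration of modified n-gram precision, the core of BLEU (the published corpus-level scores additionally combine several n-gram orders and apply a brevity penalty):

```python
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, reference, n):
    """Fraction of candidate n-grams found in the reference, with counts
    clipped so a repeated n-gram is not credited more often than it
    appears in the reference."""
    cand = Counter(ngrams(candidate, n))
    ref = Counter(ngrams(reference, n))
    matched = sum(min(c, ref[g]) for g, c in cand.items())
    total = sum(cand.values())
    return matched / total if total else 0.0

cand = "the cat sat on the mat".split()
ref = "the cat is on the mat".split()
print(modified_precision(cand, ref, 2))  # 3 of 5 bigrams match -> 0.6
```

Higher-order precisions (BLEU-3, BLEU-4) reward longer matched spans, which is why they fall faster than BLEU-2 in the tables.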
Loss Function of T-SeqGAN Adversarial Training
Evaluation Indexes of Epoch in T-SeqGAN Adversarial Training
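The adversarial loss referenced above follows SeqGAN's policy-gradient scheme: each sampled token is an action, and the discriminator's score on the finished comment serves as the reward. A toy single-step sketch in plain Python (illustrative vocabulary size and reward; T-SeqGAN applies this over whole token sequences):

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def reinforce_grad(logits, action, reward):
    """Gradient of reward * log p(action) with respect to the logits.

    d log softmax(logits)[action] / d logits = onehot(action) - probs,
    so tokens sampled in high-reward comments get their logits pushed up.
    """
    probs = softmax(logits)
    return [reward * ((1.0 if i == action else 0.0) - p)
            for i, p in enumerate(probs)]

logits = [0.2, 1.0, -0.5]                            # scores over a toy vocabulary
grad = reinforce_grad(logits, action=1, reward=0.9)  # 0.9 = discriminator score
```

This estimator sidesteps the non-differentiability of discrete token sampling, which is what makes GAN training on text possible at all.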
Generative Model    BLEU-2     BLEU-3     BLEU-4     PPL        KenLM
SeqGAN1             0.79593    0.60169    0.47596    0.75330    -28.38992
SeqGAN2             0.79570    0.60078    0.47532    0.75365    -28.41149
SeqGAN3             0.79231    0.59786    0.47620    0.75588    -28.69424
T-SeqGAN            0.79935    0.60396    0.47642    0.75247    -27.67029
Evaluation Index Values of Posts Generated by SeqGAN Models
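The PPL column reflects how predictable the generated comments are under a language model (lower is better). For reference, the textbook per-token perplexity is computed as below; note the paper's reported PPL values are on a different scale than raw perplexity, so this is only the standard definition, not the paper's exact normalization:

```python
import math

def perplexity(token_probs):
    """Textbook perplexity: exp of the average negative log-probability
    a language model assigns to each token of a sequence."""
    n = len(token_probs)
    nll = -sum(math.log(p) for p in token_probs) / n
    return math.exp(nll)

# A model that spreads probability uniformly over 1000 tokens is maximally
# uncertain; one that concentrates mass on the observed tokens scores lower.
print(perplexity([0.001] * 10))     # -> 1000.0 (up to float rounding)
print(perplexity([0.5, 0.25, 0.5]))
```

KenLM scores in the tables are average log-probabilities from an n-gram language model, so higher (closer to zero) likewise indicates more fluent text.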