Data Analysis and Knowledge Discovery  2024, Vol. 8 Issue (3): 110-119    DOI: 10.11925/infotech.2096-3467.2022.1303
Generating Chinese Abstracts with Content and Image Features
Quan Ankun1,2, Li Honglian1, Zhang Le2, Lyu Xueqiang2
1School of Information & Communication Engineering, Beijing Information Science and Technology University, Beijing 100101, China
2Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science and Technology University, Beijing 100101, China
Abstract  

[Objective] This paper proposes a new Chinese abstract generation method that integrates content and image features. It aims to improve on existing methods that rely on text features alone. [Methods] First, we used BERT to extract text features and ResNet to extract image features. Second, we used the two sets of features to complement and validate each other. Third, we fused the two modal features with an attention mechanism. Finally, we fed the fused features into a pointer-generator network to generate higher-quality Chinese abstracts. [Results] Compared with the model relying solely on the text modality, the proposed method improved ROUGE-1, ROUGE-2, and ROUGE-L by 1.9, 1.3, and 1.4 percentage points, respectively. [Limitations] The experimental data came mainly from the news domain, and the model's effectiveness in other fields remains to be verified. [Conclusions] Incorporating image information allows the fused features to retain more of the important information. It helps the model identify key content and makes the generated abstracts more comprehensive and readable.
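
The fusion step in [Methods] can be pictured with a small sketch. The abstract only states that the two modalities are fused with an attention mechanism; the scaled dot-product scoring, the concatenation of each token with its image context, and every name below (`fuse`, `text_feats`, `image_feats`) are assumptions made for illustration, not the authors' implementation. The toy vectors stand in for BERT and ResNet outputs.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fuse(text_feats, image_feats):
    """Attend from each text token over all image regions and concatenate.

    text_feats:  list of token vectors (stand-in for BERT outputs)
    image_feats: list of region vectors (stand-in for ResNet outputs)
    Assumption: both modalities were already projected to the same dimension d.
    """
    d = len(text_feats[0])
    fused = []
    for t in text_feats:
        # Scaled dot-product score between this token and every image region.
        weights = softmax([dot(t, r) / math.sqrt(d) for r in image_feats])
        # Weighted sum of image regions = image context for this token.
        context = [sum(w * r[i] for w, r in zip(weights, image_feats))
                   for i in range(d)]
        fused.append(t + context)  # list concatenation: vector of length 2d
    return fused

# Toy inputs: 2 text tokens and 3 image regions, dimension 2.
text_feats = [[1.0, 0.0], [0.0, 1.0]]
image_feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
fused = fuse(text_feats, image_feats)
```

Each fused vector keeps the original token features in its first half and carries an image summary weighted toward the regions most similar to that token in its second half.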

Key words: Feature Fusion; BERT; ResNet; Attention Mechanism; Abstract Generation
Received: 07 December 2022      Published: 16 May 2023
ZTFLH:  TP391  
Fund: National Natural Science Foundation of China (62171043); "Diligent Talents" Training Scheme Foundation of Beijing Information Science and Technology University (QXTCP B201908)
Corresponding Author: Li Honglian, ORCID: 0000-0002-0531-3650, E-mail: lihonglian@bistu.edu.cn

Cite this article:

Quan Ankun, Li Honglian, Zhang Le, Lyu Xueqiang. Generating Chinese Abstracts with Content and Image Features. Data Analysis and Knowledge Discovery, 2024, 8(3): 110-119.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2022.1303     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2024/V8/I3/110

Framework for Chinese Abstract Generation with Content and Image Features
Layer                                 Kernel size  Channels  Padding  Stride
Convolutional layer                   3×3          3         1        1×2
Residual stack (4 residual modules)   3×3          1         1        2×2
                                      3×3          1         1        2×2
Parameter Settings of ResNet
Attention Mechanism Model
Pointer Generator Networks
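
As a rough illustration of what a pointer-generator network does at each decoding step, the sketch below mixes a vocabulary distribution with a copy distribution taken from attention over the source tokens, following the standard pointer-generator formulation. It is not taken from the authors' code: the function name, the toy probabilities, and the example tokens are all assumptions.

```python
def final_distribution(p_gen, vocab_dist, attention, source_tokens):
    """Mix generation and copy probabilities for one decoding step.

    p_gen:         probability of generating from the vocabulary (0..1)
    vocab_dist:    dict word -> generation probability (sums to 1)
    attention:     attention weights over source positions (sums to 1)
    source_tokens: source tokens aligned with `attention`
    """
    # Generation share: scale every vocabulary probability by p_gen.
    final = {w: p_gen * p for w, p in vocab_dist.items()}
    for tok, a in zip(source_tokens, attention):
        # Copy share: attention mass goes to the source token itself,
        # even when that token is outside the fixed vocabulary.
        final[tok] = final.get(tok, 0.0) + (1.0 - p_gen) * a
    return final

# Hypothetical step: "Chengdu" is not in the vocabulary but can be copied.
dist = final_distribution(
    p_gen=0.6,
    vocab_dist={"students": 0.5, "collect": 0.5},
    attention=[0.25, 0.75],
    source_tokens=["students", "Chengdu"],
)
```

The copy term is what lets a summary reproduce names, places, and dates from the source text that a fixed vocabulary might not contain.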
Parameter                         Value
Training batch size               16
Validation batch size             128
Optimizer                         Adam
Learning rate                     1e-3
Vocabulary size                   20,000
Maximum decoder sequence length   40
Beam size                         4
Parameter Settings
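
For readers who want to reuse these settings, one way to carry them in code is a small immutable configuration object. The values are copied from the table; the class name and field names are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainingConfig:
    # Values from the parameter table; field names are illustrative only.
    train_batch_size: int = 16
    eval_batch_size: int = 128
    optimizer: str = "Adam"
    learning_rate: float = 1e-3
    vocab_size: int = 20_000
    max_decoder_len: int = 40
    beam_size: int = 4

config = TrainingConfig()
```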
No.  Method            ROUGE-1/%  ROUGE-2/%  ROUGE-L/%
1    RNN Context       9.2        6.3        9.6
2    PGN               25.2       14.3       25.6
3    BERT+PGN          30.7       18.4       30.6
4    Proposed method   32.6       19.7       32.0
Experimental Results
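
The gains quoted in the abstract appear to be absolute differences between row 4 (proposed method) and row 3 (BERT+PGN) of this table. A quick check, with the scores copied from the table and the variable names chosen here for illustration:

```python
# ROUGE scores (%) copied from rows 3 and 4 of the results table.
bert_pgn = {"ROUGE-1": 30.7, "ROUGE-2": 18.4, "ROUGE-L": 30.6}
proposed = {"ROUGE-1": 32.6, "ROUGE-2": 19.7, "ROUGE-L": 32.0}

# Absolute gain in percentage points, rounded to one decimal
# to suppress floating-point noise.
gains = {m: round(proposed[m] - bert_pgn[m], 1) for m in bert_pgn}
print(gains)  # {'ROUGE-1': 1.9, 'ROUGE-2': 1.3, 'ROUGE-L': 1.4}
```

These match the 1.9, 1.3, and 1.4 reported in the abstract, so the figures are differences in percentage points rather than relative improvements.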
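Item               Example
Source text        #高考结束#6月8日,四川成都。电子科技大学大一学生和志愿者,在成都七中考点外,向考生收集学习用品…
                   他们准备在支教时带给山区小孩。还有好心人特地打电话,捐了很多本子…
                   (#Gaokao Over# June 8, Chengdu, Sichuan. Freshmen from the University of Electronic Science and Technology of China and volunteers collected school supplies from examinees outside the Chengdu No. 7 High School test site… They plan to take them to children in mountain areas when they go there as volunteer teachers. Some kind-hearted people even phoned in specially and donated many notebooks…)
Reference summary  大一学生高考点外收文具:会带给山区小孩
                   (Freshmen collect stationery outside gaokao test site: will take it to children in mountain areas)
BERT+PGN           6月8日,大一学生高考点外收文具:会带给山区小孩
                   (June 8, freshmen collect stationery outside gaokao test site: will take it to children in mountain areas)
Proposed method    6月8日,成都,大一学生和志愿者高考点外收文具:会带给山区小孩
                   (June 8, Chengdu, freshmen and volunteers collect stationery outside gaokao test site: will take it to children in mountain areas)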
Comparison of Summary Examples
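
To make the ROUGE comparison concrete, here is a minimal character-level ROUGE-N sketch applied to the reference summary and the BERT+PGN output from the example. The paper does not say how its ROUGE scores were tokenized or computed, so character-level n-grams and the function below are assumptions, not the evaluation script used by the authors.

```python
from collections import Counter

def rouge_n(reference, candidate, n=1):
    """Character-level ROUGE-N: returns (recall, precision, F1)."""
    def ngrams(text):
        return Counter(text[i:i + n] for i in range(len(text) - n + 1))
    ref, cand = ngrams(reference), ngrams(candidate)
    # Clipped overlap: an n-gram counts at most as often as it occurs in both.
    overlap = sum((ref & cand).values())
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return recall, precision, f1

# Original Chinese strings of the reference summary and the BERT+PGN output.
reference = "大一学生高考点外收文具:会带给山区小孩"
bert_pgn = "6月8日,大一学生高考点外收文具:会带给山区小孩"
recall, precision, f1 = rouge_n(reference, bert_pgn, n=1)
```

Because the BERT+PGN output contains the whole reference plus a date prefix, its unigram recall is perfect while its precision is lower. The same applies to the longer output of the proposed method, which is a reminder that extra correct detail is not always rewarded when judged against a single short reference.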
Attention Visualization of Image and Text
Ablation experiment  ROUGE-1/%  ROUGE-2/%  ROUGE-L/%
1                    30.7       18.4       30.6
2                    31.2       18.6       31.0
3                    25.3       12.0       22.1
4                    26.4       11.6       26.6
Proposed method      32.6       19.7       32.0
Results of Ablation Experiments