[Objective] This paper addresses the bias introduced by single-document input in patent text summarization, along with three related problems in abstract generation: repetitive generation, the need for concise and fluent output, and the loss of original information. [Methods] We design a patent text abstract generation model based on an improved multi-head attention mechanism (IMHAM). First, exploiting the logical structure of patent texts, we design two cosine-similarity-based algorithms to overcome the single-structure limitation and to select the most important patent document. We then build a sequence-to-sequence model with multi-head attention to learn feature representations of patent text, adding self-attention layers to both the encoder and the decoder. Next, we modify the attention function to mitigate repetitive generation. Finally, we add an improved pointer network structure to reduce the loss of original information. [Results] On a publicly available patent text dataset, the model's Rouge-1, Rouge-2, and Rouge-L scores exceed those of the MedWriter baseline by 3.3%, 2.4%, and 5.5%, respectively. [Limitations] The model is better suited to documents with multiple structures; on single-structure documents, its algorithm for selecting the most important document cannot be fully exploited. [Conclusions] The proposed model generalizes well and improves the quality of summaries generated for texts with multi-document structures.
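The [Methods] section above begins by selecting the most important document from a set of patent-text structures using cosine similarity. As a minimal illustrative sketch only (this is not the paper's actual pair of algorithms; the centrality-based scoring and all function names are assumptions), one simple way to pick the most representative document from a set of vectorized patent sections is to score each by its mean cosine similarity to the others:

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors; 0.0 if either is all-zero.
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    if na == 0.0 or nb == 0.0:
        return 0.0
    return float(np.dot(a, b) / (na * nb))

def select_most_important(doc_vectors):
    # Score each document by its mean cosine similarity to all the others,
    # then return the index of the most central (highest-scoring) document.
    n = len(doc_vectors)
    scores = []
    for i in range(n):
        sims = [cosine_similarity(doc_vectors[i], doc_vectors[j])
                for j in range(n) if j != i]
        scores.append(sum(sims) / max(len(sims), 1))
    return int(np.argmax(scores))
```

For example, given three section vectors where two point in nearly the same direction, the function returns the one most similar on average to the rest, which then serves as the primary input to the summarizer.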
Shi Guoliang, Zhou Shu, Wang Yunfeng, Shi Chunjiang, Liu Liang. Generating Patent Text Abstracts Based on Improved Multi-head Attention Mechanism. Data Analysis and Knowledge Discovery, 2023, 7(6): 61-72.
[1]
Tan J W, Wan X J, Xiao J G. Abstractive Document Summarization with a Graph-Based Attentional Neural Model[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2017: 1171-1181.
[2]
Rush A M, Chopra S, Weston J. A Neural Attention Model for Abstractive Sentence Summarization[C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 2015: 379-389.
[3]
See A, Liu P J, Manning C D. Get to the Point: Summarization with Pointer-Generator Networks[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2017: 1073-1083.
[4]
Zhu Yongqing, Zhao Peng, Zhao Feifei, et al. Survey on Abstractive Text Summarization Technologies Based on Deep Learning[J]. Computer Engineering, 2021, 47(11): 11-21, 28.
doi: 10.19678/j.issn.1000-3428.0061174
[5]
Landauer T K, Dumais S T. A Solution to Plato’s Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge[J]. Psychological Review, 1997, 104(2): 211-240.
doi: 10.1037/0033-295X.104.2.211
[6]
Landauer T K, Foltz P W, Laham D. An Introduction to Latent Semantic Analysis[J]. Discourse Processes, 1998, 25(2-3): 259-284.
doi: 10.1080/01638539809545028
[7]
Mohamed A H T, Mohamed B A, Abdelmajid B H. Computing Semantic Relatedness Using Wikipedia Features[J]. Knowledge-Based Systems, 2013, 50: 260-278.
doi: 10.1016/j.knosys.2013.06.015
[8]
Sinoara R A, Camacho-Collados J, Rossi R G, et al. Knowledge-Enhanced Document Embeddings for Text Classification[J]. Knowledge-Based Systems, 2019, 163: 955-971.
doi: 10.1016/j.knosys.2018.10.026
[9]
Sutskever I, Vinyals O, Le Q V. Sequence to Sequence Learning with Neural Networks[C]// Proceedings of the 27th International Conference on Neural Information Processing Systems-Volume 2. 2014: 3104-3112.
[10]
R J, Anami B S, Poornima B K. Text Document Summarization Using POS Tagging for Kannada Text Documents[C]// Proceedings of the 11th International Conference on Cloud Computing, Data Science & Engineering (Confluence). 2021: 423-426.
[11]
Pan M, Wang J M, Huang X J, et al. A Probabilistic Framework for Integrating Sentence-Level Semantics via BERT into Pseudo-Relevance Feedback[J]. Information Processing and Management, 2022, 59(1): 102734.
doi: 10.1016/j.ipm.2021.102734
[12]
Wang X P, Liu X X, Guo J, et al. A Deep Person re-Identification Model with Multi Visual-Semantic Information Embedding[J]. Multimedia Tools and Applications, 2021, 80(5): 6853-6870.
doi: 10.1007/s11042-020-09957-5
[13]
McMillan-Major A, Osei S, Rodriguez J D, et al. Reusable Templates and Guides for Documenting Datasets and Models for Natural Language Processing and Generation: A Case Study of the HuggingFace and GEM Data and Model Cards[C]// Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2021). 2021: 121-135.
[14]
Alami N, El Mallahi M, Amakdouf H, et al. Hybrid Method for Text Summarization Based on Statistical and Semantic Treatment[J]. Multimedia Tools and Applications, 2021, 80(13): 19567-19600.
doi: 10.1007/s11042-021-10613-9
[15]
Shao Y N, Lin J C W, Srivastava G, et al. Self-Attention-Based Conditional Random Fields Latent Variables Model for Sequence Labeling[J]. Pattern Recognition Letters, 2021, 145(C): 157-164.
[16]
Badanidiyuru A, Karbasi A, Kazemi E, et al. Submodular Maximization Through Barrier Functions[C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. 2020: 524-534.
[17]
Faris M R, Ibrahim H M, Abdulrahman K Z, et al. Fuzzy Logic Model for Optimal Operation of Darbandikhan Reservoir, Iraq[J]. International Journal of Design & Nature and Ecodynamics, 2021, 16(4): 335-343.
[18]
Ghumade T G, Deshmukh R A. A Document Classification Using NLP and Recurrent Neural Network[J]. International Journal of Engineering and Advanced Technology, 2019, 8(6): 633-636.
[19]
Shi T, Keneshloo Y, Ramakrishnan N, et al. Neural Abstractive Text Summarization with Sequence-to-Sequence Models[J]. ACM/IMS Transactions on Data Science, 2021, 2(1): 1-37.
[20]
Sultana M, Chakraborty P, Choudhury T. Bengali Abstractive News Summarization Using Seq2Seq Learning with Attention[C]// Proceedings of Cyber Intelligence and Information Retrieval. 2021: 279-289.
[21]
Yang M, Qu Q, Shen Y, et al. Cross-Domain Aspect/Sentiment-Aware Abstractive Review Summarization by Combining Topic Modeling and Deep Reinforcement Learning[J]. Neural Computing and Applications, 2020, 32(11): 6421-6433.
doi: 10.1007/s00521-018-3825-2
[22]
Chen Y B, Ma Y, Mao X D, et al. Multi-Task Learning for Abstractive and Extractive Summarization[J]. Data Science and Engineering, 2019, 4(1): 14-23.
doi: 10.1007/s41019-019-0087-7
[23]
Choi H, Cho K, Bengio Y. Context-Dependent Word Representation for Neural Machine Translation[J]. Computer Speech & Language, 2017, 45: 149-160.
[24]
Yun H, Hwang Y, Jung K. Improving Context-Aware Neural Machine Translation Using Self-Attentive Sentence Embedding[C]// Proceedings of the AAAI Conference on Artificial Intelligence. 2020, 34: 9498-9506.
[25]
Munir K, Zhao H, Li Z C. Adaptive Convolution for Semantic Role Labeling[J]. IEEE/ACM Transactions on Audio, Speech and Language Processing, 2021, 29: 782-791.
[26]
Hou K K, Hou T T, Cai L L. Public Attention about COVID-19 on Social Media: An Investigation Based on Data Mining and Text Analysis[J]. Personality and Individual Differences, 2021, 175: 110701.
doi: 10.1016/j.paid.2021.110701
[27]
Vaswani A, Shazeer N, Parmar N, et al. Attention is All You Need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017: 6000-6010.
[28]
Tao C Y, Gao S, Shang M Y, et al. Get the Point of My Utterance! Learning Towards Effective Responses with Multi-Head Attention Mechanism[C]// Proceedings of the 27th International Joint Conference on Artificial Intelligence. 2018: 4418-4424.
[29]
Jean S, Cho K, Memisevic R, et al. On Using Very Large Target Vocabulary for Neural Machine Translation[C]// Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2015: 1-10.
[30]
Kumar A, Seth S, Gupta S, et al. Sentic Computing for Aspect-Based Opinion Summarization Using Multi-Head Attention with Feature Pooled Pointer Generator Network[J]. Cognitive Computation, 2022, 14(1): 130-148.
doi: 10.1007/s12559-021-09835-8
[31]
Gill H S, Khehra B S, Singh A, et al. Teaching-Learning-Based Optimization Algorithm to Minimize Cross Entropy for Selecting Multilevel Threshold Values[J]. Egyptian Informatics Journal, 2018, 20(1): 11-25.
doi: 10.1016/j.eij.2018.03.006
[32]
Mihalcea R, Tarau P. TextRank: Bringing Order into Texts[C]// Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing. 2004: 404-411.
[33]
Nallapati R, Zhai F F, Zhou B W. SummaRuNNer: A Recurrent Neural Network Based Sequence Model for Extractive Summarization of Documents[C]// Proceedings of the 31st AAAI Conference on Artificial Intelligence. 2017: 3075-3081.
[34]
Pan Y C, Chen Q C, Peng W H, et al. MedWriter: Knowledge-Aware Medical Text Generation[C]// Proceedings of the 28th International Conference on Computational Linguistics. 2020: 2363-2368.
[35]
Lin C Y. ROUGE: A Package for Automatic Evaluation of Summaries[C]// Proceedings of the Workshop on Text Summarization Branches Out, Post-Conference Workshop of ACL 2004. 2004: 74-81.