Data Analysis and Knowledge Discovery  2023, Vol. 7 Issue (6): 61-72    DOI: 10.11925/infotech.2096-3467.2022.0530
Generating Patent Text Abstracts Based on Improved Multi-head Attention Mechanism
Shi Guoliang1, Zhou Shu1,2, Wang Yunfeng2, Shi Chunjiang2, Liu Liang2
1Business School, Hohai University, Nanjing 211100, China
2Bank of Jiangsu, Nanjing 210006, China
Abstract  

[Objective] This paper addresses the bias caused by relying on a single input structure in patent text summarization, along with the problems of repetitive generation, the need for concise and fluent abstracts, and the loss of original information during abstract generation. [Methods] We designed a patent text abstract generation model based on an improved multi-head attention mechanism (IMHAM). First, to address the single-structure issue, we designed two cosine-similarity-based algorithms that exploit the logical structure of patent texts to select the most important patent document. Then, we established a sequence-to-sequence model with a multi-head attention mechanism to learn feature representations of patent text, adding self-attention layers at both the encoder and the decoder. Next, we modified the attention function to alleviate repetitive generation. Finally, we added an improved pointer network structure to reduce the loss of original information. [Results] On a publicly available patent text dataset, the Rouge-1, Rouge-2, and Rouge-L scores of the proposed model were 3.3%, 2.4%, and 5.5% higher, respectively, than those of the MedWriter baseline model. [Limitations] The proposed model is better suited to documents with multiple structures; for single-structured documents, the most-important-document selection algorithm cannot be fully exploited. [Conclusions] The proposed model generalizes well and improves the quality of abstract generation for texts with multi-document structures.
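To make the most-important-document selection step concrete, the following is a minimal, illustrative Python sketch, not the authors' code, of choosing the most important patent document by cosine similarity. The TF-IDF representation, the function name, and the use of title-plus-abstract as the reference text are assumptions for illustration; the paper describes two cosine-similarity-based algorithms over the patent's logical structure.

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def choose_most_important(reference, candidates):
    """Return the candidate section (e.g., specification text or claims)
    most similar to the reference text under cosine similarity."""
    # For Chinese text, pre-tokenize (e.g., with jieba) and join tokens
    # with spaces before vectorizing; TfidfVectorizer splits on whitespace.
    matrix = TfidfVectorizer().fit_transform([reference] + candidates).toarray()
    ref_vec = matrix[0]
    best_idx, best_sim = 0, -1.0
    for i, vec in enumerate(matrix[1:]):
        denom = np.linalg.norm(ref_vec) * np.linalg.norm(vec)
        sim = float(ref_vec @ vec) / denom if denom > 0 else 0.0
        if sim > best_sim:
            best_idx, best_sim = i, sim
    return candidates[best_idx]

For example, choose_most_important(title + abstract, [specification_text, claims]) would pick between the two structural sections shown in the patent data table below.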

Key words: Patent Text; Abstract Generation; Multi-head Attention; Pointer Network
Received: 25 May 2022; Published: 09 August 2023
Chinese Library Classification (ZTFLH): TP391; G350
Fund: Fundamental Research Funds for the Central Universities (B200207036)
Corresponding Author: Shi Guoliang, ORCID: 0000-0001-8672-9342, E-mail: shigl@hhu.edu.cn.

Cite this article:

Shi Guoliang, Zhou Shu, Wang Yunfeng, Shi Chunjiang, Liu Liang. Generating Patent Text Abstracts Based on Improved Multi-head Attention Mechanism. Data Analysis and Knowledge Discovery, 2023, 7(6): 61-72.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2022.0530     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2023/V7/I6/61

Figure: Architecture of the Model
Figure: Details of the Encoder and Decoder
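As a rough guide to the architecture sketched in the figures above, the following is a minimal PyTorch reconstruction, not the authors' implementation, of a sequence-to-sequence model with self-attention layers at both the encoder and the decoder, as described in the abstract. The layer sizes (hidden size 256, 4 heads) follow the best-performing settings in the hyperparameter tables below; the improved attention function, document selection, and pointer network are omitted here.

import torch.nn as nn

class AttnSeq2Seq(nn.Module):
    def __init__(self, vocab_size, embed_dim=256, hidden=256, heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden, batch_first=True)
        self.enc_self_attn = nn.MultiheadAttention(hidden, heads)
        self.decoder = nn.LSTM(embed_dim, hidden, batch_first=True)
        self.dec_self_attn = nn.MultiheadAttention(hidden, heads)
        self.cross_attn = nn.MultiheadAttention(hidden, heads)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, src, tgt):
        enc, _ = self.encoder(self.embed(src))       # (B, S, H)
        enc = enc.transpose(0, 1)                    # (S, B, H) for attention
        enc, _ = self.enc_self_attn(enc, enc, enc)   # encoder-side self-attention
        dec, _ = self.decoder(self.embed(tgt))       # (B, T, H)
        dec = dec.transpose(0, 1)
        # decoder-side self-attention (a causal mask is omitted for brevity)
        dec, _ = self.dec_self_attn(dec, dec, dec)
        ctx, _ = self.cross_attn(dec, enc, enc)      # decoder attends to encoder
        return self.out(ctx.transpose(0, 1))         # (B, T, vocab) logits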
Regular Expression | Purpose
<script[^>]*?>[\\s\\S]*?<\\/script> | Handles web page tags and similar markup
<style[^>]*?>[\\s\\S]*?<\\/style> |
<(?!div|/div|p|/p|br)[^>]*> |
<tr>(.*?)</tr> |
<th>(.*?)</th> |
<td>(.*?)</td> |
(?<=<title>).*?(?=</title>) | Handles the title
<a.*?href=.*?<\/a> | Handles hyperlink markup such as image references
\\s*|\t|\r|\n | Handles redundant spaces, line breaks, and other whitespace
Regular Expression Processing Types and Their Expressions
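Below is a minimal sketch of the cleaning step the table above implies, applying the listed regular expressions to raw patent web pages in Python. The function name and pattern ordering are illustrative assumptions; the tag-stripping patterns keep the tags' inner text, and the final whitespace rule is applied as a collapse-and-strip so that patterns matching empty strings do not insert spaces.

import re

PATTERNS = [
    r'<script[^>]*?>[\s\S]*?</script>',   # script blocks
    r'<style[^>]*?>[\s\S]*?</style>',     # style blocks
    r'<a.*?href=.*?</a>',                 # hyperlinks such as image references
    r'<tr>|</tr>|<th>|</th>|<td>|</td>',  # table tags (inner text is kept)
    r'<(?!div|/div|p|/p|br)[^>]*>',       # remaining tags except div/p/br
    r'\t|\r|\n',                          # tabs and line breaks
]

def clean_patent_html(raw):
    """Strip markup and redundant whitespace from patent page text."""
    text = raw
    for pattern in PATTERNS:
        text = re.sub(pattern, ' ', text)
    return re.sub(r'\s{2,}', ' ', text).strip()

# e.g., clean_patent_html('<script>x()</script><td>雷达安装座</td>') -> '雷达安装座'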
Field | Content
Title | A radar mount assembly for an artificial intelligence educational robot
Publication Number | CN211333277U
Abstract | This utility model discloses a radar mount assembly for an artificial intelligence educational robot, belonging to the technical field of robotics; by…
Specification Text | The purpose of this utility model is to solve at least one of the technical problems existing in the prior art by providing a radar mount assembly that, through the operation of a single infrared detection radar, expands the working range of the infrared monitoring radar, reduces blind zones, and scans and evaluates obstacles ahead over a wider range…
Claims | 1. A radar mount assembly for an artificial intelligence educational robot, comprising a vehicle body (1), characterized in that a radar mount (2) is formed at the front end of the vehicle body (1), the radar mount (2) comprising a first mounting panel (3) and a second mounting panel…
Patent Data Presentation Form
Environment Item | Configuration
OS | CentOS Linux release 7.4.1708
GPU | Tesla T4 (16GB) × 2
CPU | Intel(R) Xeon(R) Gold 6130 CPU @ 3.5GHz
CPU cores | 32
CPU threads | 64
Memory | 256GB
CUDA | 10.0
Python | 3.6.5
PyTorch | 1.1.0
Specific Parameters and Configurations of the Experimental Environment
Model | Description
TextRank[32] | A graph-based text processing model with two innovative unsupervised methods for extracting keywords and key sentences
SummaRuNNer[33] | A recurrent neural network based sequence model for extractive summarization
Baseline-Multi-Encoder | Based on the above pipeline, applies the improved multi-head attention mechanism only to the encoder
Baseline-Multi-Decoder | Based on the above pipeline, applies the improved multi-head attention mechanism only to the decoder
Baseline-Multi-Encoder-NoChoosen | Same as Baseline-Multi-Encoder, but without the semantic-similarity selection of the most important document
Baseline-Multi-Decoder-NoChoosen | Same as Baseline-Multi-Decoder, but without the semantic-similarity selection of the most important document
MedWriter[34] | A knowledge-aware text generation model capable of learning graph-level representations
IMHAM (this paper) | Applies the improved multi-head attention to both the encoder and the decoder, with most-important-document semantic-similarity selection and pointer network optimization
Models and Corresponding Descriptions
Data Category | Algorithm A (%) | Algorithm B (%)
Water Conservancy | 45.3 | 54.7
Artificial Intelligence | 44.9 | 55.1
Optical Fiber | 37.2 | 62.8
Agriculture | 39.9 | 60.1
Finance | 41.8 | 58.2
Selection Ratio of the Patent Data Selection Algorithms
Data Category | Specification Text (%) | Claims (%)
Water Conservancy | 83.3 | 16.7
Artificial Intelligence | 79.3 | 20.7
Optical Fiber | 86.2 | 13.8
Agriculture | 76.9 | 23.1
Finance | 72.1 | 27.9
Proportion of Specification Text and Claims Selected
Figure: Iteration Times and Loss Values of Models
Model | Rouge-1 | Rouge-2 | Rouge-L
TextRank | 0.432 | 0.235 | 0.367
SummaRuNNer | 0.482 | 0.293 | 0.393
Baseline-Multi-Encoder-NoChoosen | 0.488 | 0.303 | 0.401
Baseline-Multi-Encoder | 0.491 | 0.312 | 0.407
Baseline-Multi-Decoder-NoChoosen | 0.509 | 0.329 | 0.411
Baseline-Multi-Decoder | 0.515 | 0.335 | 0.416
MedWriter | 0.518 | 0.339 | 0.419
IMHAM | 0.535 | 0.347 | 0.442
The Performance of the Models on the Patent Text
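For reference, the Rouge-1 values above are unigram-overlap F1 scores between generated and human abstracts; the standard tool is Lin's ROUGE package [35]. A minimal self-contained sketch of the computation follows, with Chinese summaries often compared at the character level.

from collections import Counter

def rouge_1_f1(candidate, reference):
    """Unigram-overlap F1 between two token lists."""
    cand, ref = Counter(candidate), Counter(reference)
    overlap = sum((cand & ref).values())     # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# e.g., rouge_1_f1(list('本实用新型公开'), list('本实用新型涉及'))
# compares character unigrams of a generated and a reference summary.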
Model | Summary Content
Original text (most important document selected: "specification text") | Water conservancy projects undertake the tasks of retaining, storing, and discharging water, so hydraulic structures have special requirements for stability, pressure bearing, seepage resistance, scour resistance, wear resistance, freeze resistance, and crack resistance, and specialized construction methods and measures must be adopted in accordance with the technical specifications of water conservancy engineering to ensure project quality. In actual use, however, after being struck by tides over long periods the dam body easily loosens, which is quite dangerous, and large amounts of floating garbage gather on the water surface at the dam, making retrieval very inconvenient for workers. Content of the utility model: 1. Technical problem to be solved: in view of the deficiencies of the prior art, this utility model discloses an anti-scour dam protection device for water conservancy engineering to solve the problems raised in the background art above. 2. Technical solution: to achieve the above purpose, this utility model is realized through the following technical solution: an anti-scour dam protection device for water conservancy engineering…
Human-written abstract | This utility model discloses an anti-scour dam protection device for water conservancy engineering, relating to the technical field of water conservancy engineering. The device comprises a protection plate, two support rods, two second connecting plates, and two first connecting plates; the two support rods are fixedly arranged at the two ends of the top of the protection plate. By providing a protection and cleaning mechanism, the reinforcing plate slides in the chute so that its position can be adjusted to suit different water levels, and the winding reel rotates so that the position of the collection plate can be adjusted, making it convenient for workers to collect garbage on the water surface. The overall design is simple and the structure compact; the device protects the dam while also facilitating the cleaning of floating garbage, and has good practicality.
Baseline-Multi-Encoder | This utility model relates to the technical field of water conservancy engineering, in particular to a dam protection device for water conservancy engineering, comprising a protection plate, two support rods, and two second connecting plates; the two support rods are fixedly arranged at the two ends of the top of the protection plate. The two second connecting plates are fixedly arranged on the top of the protection plate, and the two first connecting plates are fixedly arranged on one side of the top of the support rods; two chutes are formed on the surface of the protection plate, and a protection and cleaning mechanism is arranged on the surface of the protection plate. The protection and cleaning mechanism comprises a reinforcing plate; the support plate is arranged on the surface of the protection plate, able to adjust adjust adjust adjust adjust adjust adjust adjust adjust
Baseline-Multi-Decoder | This utility model relates to the technical field of water conservancy engineering, in particular to a dam protection device for water conservancy engineering; the moving assembly comprises a cart plate and four moving wheels, comprising a protection plate, two support rods, and two second connecting plates; the two support rods are fixedly arranged at the two ends of the top of the protection plate. The device has a cleaning part that facilitates adjusting the reinforcing plate; the scrubbing element is vertically arranged on the cart plate, and the spray-washing element is arranged at the bottom of the rear end of the cart plate. The device is ingeniously designed, can protect the dam, and is useful.
IMHAM | This utility model relates to the technical field of water conservancy engineering, in particular to a dam protection device for water conservancy engineering, comprising a protection plate, two support rods, and two second connecting plates; the two support rods are fixedly arranged at the two ends of the top of the protection plate. The device has a cleaning part that facilitates adjusting the reinforcing plate, adapting to the water level, rotating the winding reel, adjusting positions, and collecting garbage. The device is ingeniously designed, can protect the dam, and is practical.
Summary Examples of Different Models
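The contrast above, between Baseline-Multi-Encoder's degenerate repetition and IMHAM's retention of source details, is what the pointer network component targets. As a hedged sketch of the underlying pointer-generator idea (following See et al. [3]; the paper's improved variant is not reproduced here), each decoding step mixes generating from the vocabulary with copying source tokens through the attention weights:

import torch

def final_distribution(p_gen, vocab_dist, attn_weights, src_ids):
    """p_gen: (B, 1) generation probability; vocab_dist: (B, V);
    attn_weights: (B, S) over source positions; src_ids: (B, S) source
    token ids. Returns the mixed output distribution of shape (B, V)."""
    gen_part = p_gen * vocab_dist                 # generate from vocabulary
    copy_part = (1.0 - p_gen) * attn_weights      # copy from the source
    # add copy probabilities onto the vocabulary ids of the source tokens
    return gen_part.scatter_add(1, src_ids, copy_part)

Copying through attention is what lets generated abstracts keep original-text specifics such as component names and reference numerals.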
Model | Encoder_Hidden_Layer=128 | Encoder_Hidden_Layer=256 | Encoder_Hidden_Layer=512
Baseline-Multi-Encoder-NoChoosen | 0.483 | 0.485 | 0.481
Baseline-Multi-Encoder | 0.484 | 0.488 | 0.485
Baseline-Multi-Decoder-NoChoosen | 0.503 | 0.507 | 0.506
Baseline-Multi-Decoder | 0.509 | 0.513 | 0.508
IMHAM | 0.518 | 0.529 | 0.522
The Effect of Hidden Layer Size of the Encoder on Rouge-1
Model | Decoder_Hidden_Layer=128 | Decoder_Hidden_Layer=256 | Decoder_Hidden_Layer=512
Baseline-Multi-Encoder-NoChoosen | 0.484 | 0.487 | 0.483
Baseline-Multi-Encoder | 0.485 | 0.490 | 0.486
Baseline-Multi-Decoder-NoChoosen | 0.505 | 0.508 | 0.506
Baseline-Multi-Decoder | 0.510 | 0.514 | 0.510
IMHAM | 0.522 | 0.531 | 0.526
The Effect of Hidden Layer Size of the Decoder on Rouge-1
Model | Multihead_Count=2 | Multihead_Count=3 | Multihead_Count=4 | Multihead_Count=5
Baseline-Multi-Encoder-NoChoosen | 0.483 | 0.486 | 0.488 | 0.487
Baseline-Multi-Encoder | 0.485 | 0.489 | 0.491 | 0.490
Baseline-Multi-Decoder-NoChoosen | 0.503 | 0.506 | 0.509 | 0.508
Baseline-Multi-Decoder | 0.509 | 0.511 | 0.515 | 0.513
IMHAM | 0.521 | 0.532 | 0.535 | 0.533
The Effect of Numbers of Multi-head Count on Rouge-1
[1] Tan J W, Wan X J, Xiao J G. Abstractive Document Summarization with A Graph-based Attentional Neural Model[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1:Long Papers). 2017: 1171-1181.
[2] Rush A M, Chopra S, Weston J. A Neural Attention Model for Abstractive Sentence Summarization[C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 2015: 379-389.
[3] See A, Liu P J, Manning C D. Get to the Point: Summarization with Pointer-Generator Networks[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1:Long Papers). 2017: 1073-1083.
[4] Zhu Yongqing, Zhao Peng, Zhao Feifei, et al. Survey on Abstractive Text Summarization Technologies Based on Deep Learning[J]. Computer Engineering, 2021, 47(11): 11-21, 28. (in Chinese)
doi: 10.19678/j.issn.1000-3428.0061174
[5] Landauer T K, Dumais S T. A Solution to Plato’s Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge[J]. Psychological Review, 1997, 104(2): 211-240.
doi: 10.1037/0033-295X.104.2.211
[6] Landauer T K, Foltz P W, Laham D. An Introduction to Latent Semantic Analysis[J]. Discourse Processes, 1998, 25(2-3): 259-284.
doi: 10.1080/01638539809545028
[7] Mohamed A H T, Mohamed B A, Abdelmajid B H. Computing Semantic Relatedness Using Wikipedia Features[J]. Knowledge-Based Systems, 2013, 50: 260-278.
doi: 10.1016/j.knosys.2013.06.015
[8] Sinoara R A, Camacho-Collados J, Rossi R G, et al. Knowledge-Enhanced Document Embeddings for Text Classification[J]. Knowledge-Based Systems, 2019, 163: 955-971.
doi: 10.1016/j.knosys.2018.10.026
[9] Sutskever I, Vinyals O, Le Q V. Sequence to Sequence Learning with Neural Networks[C]// Proceedings of the 27th International Conference on Neural Information Processing Systems-Volume 2. 2014: 3104-3112.
[10] R J, Anami B S, Poornima B K. Text Document Summarization Using POS Tagging for Kannada Text Documents[C]// Proceedings of the 11th International Conference on Cloud Computing, Data Science & Engineering (Confluence). 2021: 423-426.
[11] Pan M, Wang J M, Huang X J, et al. A Probabilistic Framework for Integrating Sentence-Level Semantics via BERT into Pseudo-Relevance Feedback[J]. Information Processing and Management, 2022, 59(1): 102734.
doi: 10.1016/j.ipm.2021.102734
[12] Wang X P, Liu X X, Guo J, et al. A Deep Person re-Identification Model with Multi Visual-Semantic Information Embedding[J]. Multimedia Tools and Applications, 2021, 80(5): 6853-6870.
doi: 10.1007/s11042-020-09957-5
[13] McMillan-Major A, Osei S, Rodriguez J D, et al. Reusable Templates and Guides for Documenting Datasets and Models for Natural Language Processing and Generation: A Case Study of the HuggingFace and GEM Data and Model Cards[C]// Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2021). 2021: 121-135.
[14] Alami N, El Mallahi M, Amakdouf H, et al. Hybrid Method for Text Summarization Based on Statistical and Semantic Treatment[J]. Multimedia Tools and Applications, 2021, 80(13): 19567-19600.
doi: 10.1007/s11042-021-10613-9
[15] Shao Y N, Lin J C W, Srivastava G, et al. Self-Attention-Based Conditional Random Fields Latent Variables Model for Sequence Labeling[J]. Pattern Recognition Letters, 2021, 145(C): 157-164.
[16] Badanidiyuru A, Karbasi A, Kazemi E, et al. Submodular Maximization Through Barrier Functions[C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. 2020: 524-534.
[17] Faris M R, Ibrahim H M, Abdulrahman K Z, et al. Fuzzy Logic Model for Optimal Operation of Darbandikhan Reservoir, Iraq[J]. International Journal of Design & Nature and Ecodynamics, 2021, 16(4): 335-343.
[18] Ghumade T G, Deshmukh R A. A Document Classification Using NLP and Recurrent Neural Network[J]. International Journal of Engineering and Advanced Technology, 2019, 8(6): 633-636.
[19] Shi T, Keneshloo Y, Ramakrishnan N, et al. Neural Abstractive Text Summarization with Sequence-to-Sequence Models[J]. ACM/IMS Transactions on Data Science, 2021, 2(1): 1-37.
[20] Sultana M, Chakraborty P, Choudhury T. Bengali Abstractive News Summarization Using Seq2Seq Learning with Attention[C]// Proceedings of Cyber Intelligence and Information Retrieval. 2021: 279-289.
[21] Yang M, Qu Q, Shen Y, et al. Cross-Domain Aspect/Sentiment-Aware Abstractive Review Summarization by Combining Topic Modeling and Deep Reinforcement Learning[J]. Neural Computing and Applications, 2020, 32(11): 6421-6433.
doi: 10.1007/s00521-018-3825-2
[22] Chen Y B, Ma Y, Mao X D, et al. Multi-Task Learning for Abstractive and Extractive Summarization[J]. Data Science and Engineering, 2019, 4(1): 14-23.
doi: 10.1007/s41019-019-0087-7
[23] Choi H, Cho K, Bengio Y. Context-Dependent Word Representation for Neural Machine Translation[J]. Computer Speech & Language, 2017, 45: 149-160.
[24] Yun H, Hwang Y, Jung K. Improving Context-Aware Neural Machine Translation Using Self-Attentive Sentence Embedding[C]// Proceedings of the AAAI Conference on Artificial Intelligence. 2020, 34: 9498-9506.
[25] Munir K, Zhao H, Li Z C. Adaptive Convolution for Semantic Role Labeling[J]. IEEE/ACM Transactions on Audio, Speech and Language Processing, 2021, 29: 782-791.
doi: 10.1109/TASLP.6570655
[26] Hou K K, Hou T T, Cai L L. Public Attention about COVID-19 on Social Media: An Investigation Based on Data Mining and Text Analysis[J]. Personality and Individual Differences, 2021, 175: 110701.
doi: 10.1016/j.paid.2021.110701
[27] Vaswani A, Shazeer N, Parmar N, et al. Attention is All You Need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017: 6000-6010.
[28] Tao C Y, Gao S, Shang M Y, et al. Get the Point of My Utterance! Learning Towards Effective Responses with Multi-Head Attention Mechanism[C]// Proceedings of the 27th International Joint Conference on Artificial Intelligence. 2018: 4418-4424.
[29] Jean S, Cho K, Memisevic R, et al. On Using Very Large Target Vocabulary for Neural Machine Translation[C]// Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1:Long Papers). 2015: 1-10.
[30] Kumar A, Seth S, Gupta S, et al. Sentic Computing for Aspect-Based Opinion Summarization Using Multi-Head Attention with Feature Pooled Pointer Generator Network[J]. Cognitive Computation, 2022, 14(1): 130-148.
doi: 10.1007/s12559-021-09835-8
[31] Gill H S, Khehra B S, Singh A, et al. Teaching-Learning-Based Optimization Algorithm to Minimize Cross Entropy for Selecting Multilevel Threshold Values[J]. Egyptian Informatics Journal, 2018, 20(1): 11-25.
doi: 10.1016/j.eij.2018.03.006
[32] Mihalcea R, Tarau P. TextRank: Bringing Order into Texts[C]// Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing. 2004: 404-411.
[33] Nallapati R, Zhai F F, Zhou B W. SummaRuNNer: A Recurrent Neural Network Based Sequence Model for Extractive Summarization of Documents[C]// Proceedings of the 31st AAAI Conference on Artificial Intelligence. 2017: 3075-3081.
[34] Pan Y C, Chen Q C, Peng W H, et al. MedWriter: Knowledge-Aware Medical Text Generation[C]// Proceedings of the 28th International Conference on Computational Linguistics. 2020: 2363-2368.
[35] Lin C Y. Rouge: A Package for Automatic Evaluation of Summaries[C]// Proceedings of Workshop on Text Summarization Branches Out, Post-Conference Workshop of ACL 2004. 2004: 74-81.