Data Analysis and Knowledge Discovery
Research On Patent Text Abstract Generation Based On Improved Multi-Head Attention Mechanism
Guoliang Shi, Shu Zhou, Yunfeng Wang, Chunjiang Shi, Liang Liu
(Business School, Hohai University, Nanjing 211100, China) (Bank of Jiangsu, Nanjing 210006, China)
Abstract  

[Objective] Patent abstract generation currently suffers from two problems. Because patent text is fed to the model as a single structure, the generated abstracts are biased toward that one structure. In addition, the generated abstracts as a whole contain repetition, are not concise or fluent enough, and lose information from the original text.

[Methods] First, to address the single-structure problem, we design two cosine-similarity-based algorithms that select the most important patent documents according to the logical structure of the patent text. Second, we design a new sequence-to-sequence model with an improved multi-head attention mechanism (IMHAM) to better learn feature representations of the patent text, and we add self-attention layers to the encoder and decoder to solve the repeated-generation problem. Finally, we add an improved pointer network structure to solve the problem of missing original information.
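
The abstract does not spell out the two selection algorithms, so the following is only a minimal sketch of the general idea of ranking the structural parts of a patent by cosine similarity. The bag-of-words vectors, the function names `cosine_similarity` and `rank_sections`, and the use of a short reference text as the comparison target are all illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_sections(sections, reference):
    """Rank named sections by similarity to a reference text.

    `sections` maps a hypothetical section name (e.g. "claims") to its text;
    `reference` is the text to compare against. Both are assumptions made
    for this sketch; whitespace tokenization stands in for real preprocessing.
    """
    ref_vec = Counter(reference.lower().split())
    scored = [
        (name, cosine_similarity(Counter(text.lower().split()), ref_vec))
        for name, text in sections.items()
    ]
    # Highest similarity first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In such a setup, the top-ranked sections would be passed on to the abstract generator and the rest dropped; how many to keep is a tuning choice the abstract does not specify.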

[Results] On a publicly available patent text dataset, our model scores 3.2%, 2.3%, and 5.4% higher than the MedWriter model on the ROUGE-1, ROUGE-2, and ROUGE-L metrics, respectively.
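
For readers unfamiliar with the metrics, ROUGE-N measures n-gram overlap between a generated abstract and a reference abstract. The sketch below computes a recall-oriented ROUGE-N with clipped counts. Whether the reported scores are recall, precision, or F-measure is not stated in the abstract, and the helper names `ngrams` and `rouge_n_recall` are assumptions for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Recall-oriented ROUGE-N on whitespace-tokenized text.

    Overlap is clipped: an n-gram counts at most as often as it
    appears in the reference.
    """
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0  # reference too short to contain any n-gram
    overlap = sum((cand & ref).values())  # Counter & takes the minimum counts
    return overlap / total
```

For example, with the candidate "the cat sat" and the reference "the cat sat on the mat", three of the six reference unigrams are matched, so ROUGE-1 recall is 0.5.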

[Limitations] The model is best suited to document systems with multiple structures, such as patents. For text from a single-structure system, the important-document extraction step may offer little room for improvement.

[Conclusions] The proposed model generalizes well to improving the quality of text summary generation in other domains whose documents, like patents, have a multi-document structure.

Key words: Patent Text, Abstract Generation, Multi-head Attention, Pointer Network
Published: 11 November 2022
ZTFLH:  TP391,G350  

Cite this article:

Guoliang Shi, Shu Zhou, Yunfeng Wang, Chunjiang Shi, Liang Liu. Research On Patent Text Abstract Generation Based On Improved Multi-Head Attention Mechanism. Data Analysis and Knowledge Discovery, 0, (): 1-.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2022-0530     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y0/V/I/1

[1] Quan Ankun, Li Honglian, Zhang Le, Lyu Xueqiang. Generating Chinese Abstracts with Content and Image Features[J]. 数据分析与知识发现, 2024, 8(3): 110-119.
[2] Lyu Xueqiang, Yang Yuting, Xiao Gang, Li Yuxian, You Xindong. Extracting Long Terms from Sparse Samples[J]. 数据分析与知识发现, 2024, 8(1): 135-145.
[3] Zhai Dongsheng, Lou Ying, Kan Huimin, He Xijun, Liang Guoqiang, Ma Zifei. Constructing TCM Knowledge Graph with Multi-Source Heterogeneous Data[J]. 数据分析与知识发现, 2023, 7(9): 146-158.
[4] Shi Guoliang, Zhou Shu, Wang Yunfeng, Shi Chunjiang, Liu Liang. Generating Patent Text Abstracts Based on Improved Multi-head Attention Mechanism[J]. 数据分析与知识发现, 2023, 7(6): 61-72.
[5] Xu Kang, Yu Shengnan, Chen Lei, Wang Chuandong. Linguistic Knowledge-Enhanced Self-Supervised Graph Convolutional Network for Event Relation Extraction[J]. 数据分析与知识发现, 2023, 7(5): 92-104.
[6] Han Pu, Zhong Yule, Lu Haojie, Ma Shiwen. Identifying Named Entities of Adverse Drug Reaction with Adversarial Transfer Learning[J]. 数据分析与知识发现, 2023, 7(3): 131-141.
[7] Hu Jiming, Zheng Xiang. Abstracting Interactive Contents from New Media for Government Affairs Based on Topic Clustering[J]. 数据分析与知识发现, 2022, 6(6): 95-104.
[8] Tong Xinyu, Zhao Ruijie, Lu Yonghe. Multi-label Patent Classification with Pre-training Model[J]. 数据分析与知识发现, 2022, 6(2/3): 129-137.