Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (7): 59-69    DOI: 10.11925/infotech.2096-3467.2021.0089
RLCPAR: A Rewriting Model for Chinese Patent Abstracts Based on Reinforcement Learning
Zhang Le1, Leng Jidong1, Lv Xueqiang1, Cui Zhuo2, Wang Lei1, You Xindong1
1Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science & Technology University, Beijing 100101, China
2School of Information & Communication Engineering, Beijing Information Science & Technology University, Beijing 100101, China
[Objective] This paper proposes RLCPAR, a rewriting model for Chinese patent abstracts based on reinforcement learning, to address sentence redundancy and low accuracy in rewriting multi-sentence abstracts. [Methods] First, RLCPAR extracts key sentences from patent descriptions with the help of a patent term dictionary and reinforcement learning. Then, it generates candidate abstracts with a Transformer deep neural network. Finally, it merges the candidate abstracts with the original patent abstracts and applies semantic de-duplication and sorting to obtain the rewritten abstracts. [Results] The proposed model completes the end-to-end rewriting of patent abstracts and scores 56.95%, 37.21%, and 51.24% on ROUGE-1, ROUGE-2, and ROUGE-L, respectively. [Limitations] The experimental data, which mainly concern Chinese medicinal materials, need to be expanded to other fields. [Conclusions] The RLCPAR model outperforms the other sequence generation methods compared and improves the rewriting quality of Chinese patent abstracts.
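The final merge step named in [Methods] can be illustrated with a small sketch. It is not the authors' implementation: the character-bigram Jaccard overlap is only a stand-in for the semantic similarity measure, the names `char_bigrams`, `jaccard`, and `merge_abstracts` and the 0.6 threshold are hypothetical, and the sorting step is reduced to keeping the original sentences first.

```python
def char_bigrams(sentence):
    """Return the set of character bigrams of a sentence (a crude similarity proxy)."""
    s = sentence.strip()
    # Fall back to the whole string when it is too short to form a bigram.
    return {s[i:i + 2] for i in range(len(s) - 1)} or {s}


def jaccard(a, b):
    """Jaccard overlap between the bigram sets of two sentences."""
    x, y = char_bigrams(a), char_bigrams(b)
    return len(x & y) / len(x | y)


def merge_abstracts(original_sents, candidate_sents, threshold=0.6):
    """Append each candidate sentence unless it nearly duplicates a kept one.

    Assumption: a candidate counts as a duplicate when its overlap with any
    sentence already kept reaches `threshold` (a hypothetical value).
    """
    merged = list(original_sents)
    for cand in candidate_sents:
        if all(jaccard(cand, kept) < threshold for kept in merged):
            merged.append(cand)
    return merged


if __name__ == "__main__":
    original = ["a decoction for treating recurrent oral ulcers"]
    candidates = [
        "a decoction for treating recurrent oral ulcers.",  # near-duplicate
        "it clears heat and reduces swelling",               # new information
    ]
    for sentence in merge_abstracts(original, candidates):
        print(sentence)
```

A sentence-embedding cosine similarity could replace `jaccard` without changing the merge loop.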

Key words: Patent Abstract; Automatic Rewriting; Reinforcement Learning; Neural Network; Text Generation
Received: 27 January 2021      Published: 11 August 2021
Chinese Library Classification (ZTFLH): TP391
Fund: National Natural Science Foundation of China (61671070); Open Project Fund of the Tibetan Information Processing and Machine Translation Key Laboratory/the Key Laboratory of Tibetan Information Processing, Ministry of Education (2019Z002)
Corresponding Author: You Xindong, ORCID: 0000-0002-3351-4599

Zhang Le, Leng Jidong, Lv Xueqiang, Cui Zhuo, Wang Lei, You Xindong. RLCPAR: A Rewriting Model for Chinese Patent Abstracts Based on Reinforcement Learning. Data Analysis and Knowledge Discovery, 2021, 5(7): 59-69.

Rewriting Framework of Chinese Patent Abstract
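Original sentence: 临床研究显示,枸杞多糖可明显改善脂肪肝患者的临床症状。 (Clinical studies show that Lycium barbarum polysaccharides can significantly improve the clinical symptoms of patients with fatty liver.)
Word segmentation and POS tagging: 临床/JJ 研究/NN 显示/VV ,/PU 枸杞多糖/NN 可/VV 明显/AD 改善/VV 脂肪肝/NN 患者/NN 的/DEG 临床/JJ 症状/NN 。/PU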
Examples of Sentence Preprocessing
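The tagged form in this example can be read back into (word, tag) pairs with a few lines of Python. This is a sketch under assumptions: the function names `parse_tagged` and `nouns` are hypothetical, the tags appear to follow the Penn Chinese Treebank convention (NN for nouns, VV for verbs, AD for adverbs), and treating nouns as candidate patent terms is only an illustration, not the paper's dictionary-building procedure.

```python
def parse_tagged(line):
    """Split 'word/TAG word/TAG ...' into a list of (word, tag) pairs."""
    pairs = []
    for token in line.split():
        # rpartition keeps any '/' inside the word itself intact.
        word, sep, tag = token.rpartition("/")
        if not sep:  # token carries no tag at all
            word, tag = token, None
        pairs.append((word, tag))
    return pairs


def nouns(pairs):
    """Keep the words tagged NN, e.g. as candidate domain terms (an assumption)."""
    return [word for word, tag in pairs if tag == "NN"]


if __name__ == "__main__":
    fragment = "枸杞多糖/NN 可/VV 明显/AD"
    pairs = parse_tagged(fragment)
    print(pairs)
    print(nouns(pairs))
```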
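Field: Example
Patent application number: CN201410640231
Manual abstract: A blueberry-kiwifruit jam and its preparation method. The raw materials include strawberry jam, blueberry, kiwifruit, cherry, mangosteen, hazelnut kernel, hawthorn, Sophora tonkinensis root (山豆根), caoheche (草河车), Pulsatilla root (白头翁), Andrographis (穿心莲), citric acid, honey, and xylitol. The preparation method is as follows: peel the kiwifruit and mangosteen, and wash the blueberries and cherries; boil the hazelnut kernels with the kiwifruit, mangosteen, blueberries, and cherries in water, then crush all the materials to obtain a mixed fruit pulp; decoct the hawthorn, Sophora tonkinensis root, caoheche, Pulsatilla root, and Andrographis in water, filter, and spray-dry the filtrate to obtain a Chinese herbal powder; add the citric acid, xylitol, honey, and herbal powder to the mixed fruit pulp and mix well to obtain a mixture; bring it to a boil over low heat, stirring continuously to evaporate the water until it is fairly viscous, then mix it with the strawberry jam, and fill, seal, and sterilize to obtain the product. The jam is simple to prepare, is rich in trace elements such as calcium, potassium, selenium, zinc, and germanium and in the 17 amino acids needed by the human body, and is also rich in vitamins, grape acid (葡萄酸), fructose, citric acid, malic acid, fat, and so on. It has a fine, smooth taste, and Chinese herbs such as hawthorn and Pulsatilla root are added, so that it has beauty benefits and also helps, to some extent, to aid digestion, clear heat, and detoxify.
Abstract: The invention discloses a blueberry-kiwifruit jam and its preparation method. The blueberry-kiwifruit jam consists of the following raw materials by weight: strawberry jam, blueberry, kiwifruit, cherry, mangosteen, hazelnut kernel, hawthorn, Sophora tonkinensis root, caoheche, Pulsatilla root, Andrographis, citric acid, honey, and xylitol. The blueberry-kiwifruit jam prepared by the invention is simple to prepare, is rich in trace elements such as calcium, potassium, selenium, zinc, and germanium and in the 17 amino acids needed by the human body, is also rich in vitamins, grape acid, fructose, citric acid, malic acid, fat, and so on, and has a fine, smooth taste. Chinese herbs such as hawthorn and Pulsatilla root are added, so that the invention has beauty benefits and also helps, to some extent, to aid digestion, clear heat, and detoxify.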
Examples of Patent Data
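Parameter                               Value
Maximum sentence length                 100
Word embedding dimension                128
Hidden layer size                       256
Batch size                              64
Learning rate                           0.0001
Early stopping                          5
Learning rate decay                     0.5
Reinforcement learning discount factor  0.95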
Parameter Setting
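The settings in this table can be collected in a small configuration object, and the discount factor can be illustrated with a discounted-return computation. This is a sketch under stated assumptions, not the authors' training code: the class and field names are hypothetical, "early stopping 5" is read as a patience of 5 evaluations, "learning rate decay 0.5" is read as a multiplicative factor, and the reward sequence is invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Values copied from the parameter table; field names are hypothetical."""
    max_sentence_length: int = 100
    embedding_dim: int = 128
    hidden_size: int = 256
    batch_size: int = 64
    learning_rate: float = 1e-4
    early_stopping_patience: int = 5   # assumed meaning of "early stopping 5"
    lr_decay: float = 0.5              # assumed to be a multiplicative factor
    discount_factor: float = 0.95


def discounted_returns(rewards, gamma):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    running = 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


if __name__ == "__main__":
    cfg = TrainConfig()
    # Toy episode: a single reward arrives only after the last extraction step.
    print(discounted_returns([0.0, 0.0, 1.0], cfg.discount_factor))
```

With a discount factor of 0.95, a reward received two steps later still contributes about 90% of its value to the first step, so early sentence selections remain strongly tied to the final reward.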
Model          ROUGE-1/%  ROUGE-2/%  ROUGE-L/%
Baseline       53.42      32.25      48.27
TextRank       37.60      18.21      31.21
PGN+RL         40.35      23.45      32.67
Top6+Seq2Seq   41.00      24.19      36.27
FASRS          51.68      32.45      44.88
RLCPAR         52.76      33.64      45.58
RLCPAR+Text    56.63      36.38      50.87
Experimental Results (Text of Instruction)
Contrast Test Results
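For readers unfamiliar with the metric, ROUGE-1 and ROUGE-2 measure unigram and bigram overlap between a generated abstract and a reference. The sketch below is a minimal n-gram version, not the evaluation script used in the paper: the function names are hypothetical, and the page does not say whether the reported scores are recall or F1, so all three values are returned.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n):
    """Return (precision, recall, F1) of clipped n-gram overlap."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    # Counter intersection clips each n-gram count to the smaller of the two.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    candidate = "the cat sat".split()
    reference = "the cat sat down".split()
    print(rouge_n(candidate, reference, 1))
    print(rouge_n(candidate, reference, 2))
```

For Chinese text the token lists could be characters (`list(text)`) or segmented words; which one the paper uses is not stated here.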
Model          ROUGE-1/%  ROUGE-2/%  ROUGE-L/%
Baseline       53.42      32.25      48.27
TextRank       41.59      22.52      34.18
PGN+RL         44.85      26.00      36.15
Top6+Seq2Seq   45.09      27.33      39.33
FASRS          54.84      36.48      48.21
RLCPAR         55.89      36.96      49.73
RLCPAR+Text    56.95      37.21      51.24
Experimental Results (Original Abstract + Text of Instruction)
Contrast Test Results (Original Abstract + Text of Instruction)
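ROUGE-L, the third column of these results, is based on the longest common subsequence (LCS) between a generated abstract and a reference. The following sketch shows one common formulation; it is an illustration only, with hypothetical function names, and the balanced F1 (rather than a recall-weighted F-measure) is an assumption because the page does not state which variant was reported.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences (row-wise DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """Return (precision, recall, F1) based on the LCS length."""
    lcs = lcs_length(candidate, reference)
    if lcs == 0:
        return 0.0, 0.0, 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    candidate = ["a", "b", "c", "d"]
    reference = ["a", "c", "d", "e"]
    print(rouge_l(candidate, reference))
```

Unlike ROUGE-2, the LCS does not require matched tokens to be adjacent, which is why ROUGE-L rewards abstracts that keep the reference's sentence order even when wording is inserted between matches.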
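Item: Content
Original abstract: A decoction for treating recurrent oral ulcers and its preparation method, relating to a Chinese herbal formula for treating recurrent oral ulcers. The medicine is made from the following raw materials by weight: 8-10 g each of Saposhnikovia root (防风), gardenia fruit (栀子), agastache (藿香), processed ginger (炮姜), Ophiopogon root (麦冬), and forsythia fruit (连翘); 15-18 g of gypsum (石膏); and 4-5 g each of stir-fried Atractylodes rhizome (炒苍术) and lotus leaf (荷叶). The invention is characterized by readily available materials, convenient preparation, low cost, and fast effect.
Rewritten abstract: A decoction containing Chinese herbal ingredients such as Saposhnikovia root, gardenia fruit, agastache, and processed ginger, and its preparation method. The preparation method is: first place the proportioned amounts, Saposhnikovia root, gardenia fruit, agastache, processed ginger, Ophiopogon root, forsythia fruit, gypsum, stir-fried Atractylodes rhizome, lotus leaf, and other Chinese herbs in a decocting vessel, add clean water, and soak the herbs for half an hour before decocting so that they are thoroughly moistened and the medicinal juice can be fully extracted. The medicine has the effects of clearing heat, purging fire, reducing swelling, and expelling pus, and can be used to treat recurrent oral ulcers.
Manual abstract: A medicine for treating recurrent oral ulcers. Saposhnikovia root, gardenia fruit, agastache, processed ginger, Ophiopogon root, forsythia fruit, gypsum, stir-fried Atractylodes rhizome, and lotus leaf are soaked in water and then decocted, and the dregs are removed to obtain a decoction. The medicine has the effects of clearing heat and purging fire, and of reducing swelling and expelling pus, and is used to treat recurrent oral ulcers.
Analysis: The rewritten abstract adds the preparation method and the efficacy to the content of the original abstract.

Original abstract: A Chinese medicine for treating lumbar disc herniation, characterized by the following ingredients and weights: processed Aconitum kusnezoffii root (制草乌) 9 g, processed Aconitum carmichaelii root (制川乌) 9 g, ephedra (麻黄) 9 g, Angelica dahurica root (白芷) 9 g, Atractylodes rhizome (仓术) 9 g, tougucao (透骨草) 9 g, earthworm (地龙) 9 g, ground beetle (土元) 9 g, Chinese angelica root (当归) 10 g, white peony root (白芍) 10 g, astragalus root (黄芪) 10 g, Atractylodes macrocephala rhizome (白术) 10 g, Codonopsis root (党参) 10 g, Polygala root (远志) 10 g, albizia flower (合欢花) 10 g, Asarum (细辛) 4 g, Chinese quince fruit (木瓜) 10 g, and Angelica pubescens root (独活) 9 g. Patients with lumbar disc herniation who are treated with this medicine need no surgery or hospitalization; their condition is cured quickly and their suffering is relieved. The treatment is inexpensive and fast-acting, is particularly suitable for rural areas short of medical care and for wage earners, is a good medicine that benefits humankind, and has great value for wider application.
Rewritten abstract: A Chinese medicine for treating lumbar disc herniation. A Chinese medicine composition whose raw materials are: Atractylodes macrocephala rhizome, Codonopsis root, Polygala root, albizia flower, Asarum, Chinese quince fruit, Angelica pubescens root 5.45%, with all of the above herbs accurate to 1%. The medicine has the effects of benefiting the liver, kidney, and qi, activating blood, and relieving pain, and can be used to treat qi deficiency and blood damage, external invasion of wind, cold, and dampness, cold congealing in the sinews and vessels, and qi stagnation and blood stasis with dampness congealing into phlegm.
Manual abstract: A Chinese medicine composition for treating lumbar disc herniation. The composition consists of Chinese herbs including processed Aconitum kusnezoffii root, processed Aconitum carmichaelii root, ephedra, Angelica dahurica root, Atractylodes rhizome, tougucao, earthworm, ground beetle, Chinese angelica root, white peony root, astragalus root, Atractylodes macrocephala rhizome, Codonopsis root, Polygala root, albizia flower, Asarum, Chinese quince fruit, and Angelica pubescens root. The composition is soaked in 50-degree sorghum liquor for one month to obtain the finished product. The Chinese medicine extract can treat patients with lumbar disc herniation without surgery or hospitalization, at low cost and with fast recovery.
Analysis: The rewritten abstract describes the efficacy in more detail and more comprehensively than the original abstract.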
Examples of Patent Rewriting Results
