Research on Automatic Entities Generation of Patent Technology Function Matrix based on ChatGPT+Prompt

doi:10.11925/infotech.2096-3467.2023.0737

Data Analysis and Knowledge Discovery

Bai Rujiang; Chen Qiming; Zhang Yujie; Yang Chao

(Institute of Information Management, Shandong University of Technology, Zibo 255000, China) (School of Public Affairs, Zhejiang University, Hangzhou 310058, China)


Abstract:

[Objective] This paper addresses the difficult problem of automatically identifying and extracting technology and function entities from patents. It generates the key technologies and functions described in patent documents to support high-quality construction of the patent technology-function matrix. [Methods] We propose applying ChatGPT to the patent technology-function entity extraction task, using a ChatGPT+Prompt approach to identify, extract, and generate patent technology words, function words, and technology-function pairs. [Results] We identified and generated patent technology-function entities in four domains and three languages. Experiments comparing domains, languages, and the number of prompt examples, evaluated with ROUGE, show that the method identifies technology-function pairs fairly accurately. By ROUGE-1, the new energy vehicle domain and English patents perform best, cross-domain and cross-language capabilities are pronounced, and providing a one-shot example significantly improves model performance. [Limitations] The method still faces open issues: the lack of standards for prompts, repetitiveness in the generated content, and the choice between single-round and multi-round Q&A. [Conclusions] The proposed method is reasonable and feasible. It effectively reduces the labor cost and entry threshold of generating technology-function entities, expands the application scenarios of AIGC, and unlocks the potential of ChatGPT for patent document mining. The paper also offers reflections and suggestions to help users understand how ChatGPT can assist with generation tasks.
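To make the one-shot setup and the ROUGE-1 comparison mentioned in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. The prompt wording, the function names `build_one_shot_prompt` and `rouge1`, and the whitespace tokenization are all assumptions for illustration; the paper's actual prompts and ROUGE configuration are not given on this page.

```python
from collections import Counter


def build_one_shot_prompt(example_text, example_pairs, target_text):
    """Assemble a one-shot prompt: one worked example, then the target text.

    The instruction wording is hypothetical, not the paper's prompt.
    """
    shown = "; ".join(f"({tech}, {func})" for tech, func in example_pairs)
    return (
        "Extract (technology, function) pairs from the patent text.\n"
        f"Example text: {example_text}\n"
        f"Example pairs: {shown}\n"
        f"Text: {target_text}\n"
        "Pairs:"
    )


def rouge1(candidate, reference):
    """Unigram-overlap ROUGE-1 as (precision, recall, F1).

    Assumes whitespace tokenization; Chinese or Japanese text would
    need a different tokenizer (for example, character-level).
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

In this sketch, the string returned by `build_one_shot_prompt` would be sent to the model, and the generated pair text would then be scored against a manually annotated reference with `rouge1`.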

Keywords: Patent Technology Function Matrix, Technology Words, Function Words, Entity Recognition, Generative Models, ChatGPT, Prompt

Publication date: 2024-03-15

CLC number: G250, TP391

Cite this article:

Bai Rujiang, Chen Qiming, Zhang Yujie, Yang Chao. Research on Automatic Entities Generation of Patent Technology Function Matrix based on ChatGPT+Prompt [J]. Data Analysis and Knowledge Discovery, doi:10.11925/infotech.2096-3467.2023.0737.

Link to this article:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2023.0737 or https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y0/V/I/1

