Data Analysis and Knowledge Discovery
Research on Medical Named Entity Recognition with Word Information
Ben Yanyan, Pang Xueqin
(School of Mathematics and Statistics, Huazhong University of Science and Technology, Wuhan 430074, China) (Archives of Wuhan University of Science and Technology, Wuhan 430081, China)

Abstract

[Objective] To address the difficulty of identifying named entity boundaries, this paper integrates word information to improve the recognition and inference of key clinical features in online consultation records.

[Methods] The model is built on MacBERT and a conditional random field (CRF). Word information, such as word position and part of speech, is incorporated through positional "soft" embeddings, and speaker role embeddings introduce information about the dialogue text. In addition, a weighted multi-class cross-entropy loss is used to address the imbalance among entity categories.
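The abstract does not spell out how the "soft" embeddings are computed. One common way to feed such features into a BERT-style encoder is to add a learned embedding per feature to each token embedding, and the sketch below illustrates only that assumed reading. The table sizes, the B/M/E/S word-position scheme, the role ids, and all function names are hypothetical and are not taken from the paper.

```python
import random

DIM = 8  # hypothetical embedding size, kept small for readability


def make_table(num_rows, dim, seed):
    """Build a randomly initialised lookup table (stand-in for learned weights)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_rows)]


# Hypothetical lookup tables; sizes are illustrative assumptions.
token_table = make_table(100, DIM, seed=0)    # character/token vocabulary
word_pos_table = make_table(4, DIM, seed=1)   # position in word: B, M, E, S
pos_tag_table = make_table(30, DIM, seed=2)   # part-of-speech tags
role_table = make_table(2, DIM, seed=3)       # speaker role: 0 = patient, 1 = doctor


def embed(token_ids, word_pos_ids, pos_tag_ids, role_ids):
    """Return one vector per token: the element-wise sum of its four embeddings."""
    vectors = []
    for t, w, p, r in zip(token_ids, word_pos_ids, pos_tag_ids, role_ids):
        rows = zip(token_table[t], word_pos_table[w], pos_tag_table[p], role_table[r])
        vectors.append([a + b + c + d for a, b, c, d in rows])
    return vectors


# Two tokens forming one word (B then E), same POS tag, spoken by the patient.
inputs = embed([1, 2], [0, 2], [5, 5], [0, 0])
```

Summing keeps the input dimension unchanged, so under this assumption the extra features could be added without altering the encoder itself.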

[Results] In an empirical study on online consultation records from Chunyu Doctor, the proposed model achieved an F1 score of 74.35% on the named entity recognition task, nearly 2% higher than using the MacBERT model directly.

[Limitations] No model was designed specifically for Chinese word segmentation.

[Conclusions] Compared with using the MacBERT model directly, incorporating features from additional dimensions, such as word information, effectively improves the model's ability to recognize key features of clinical findings.
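The weighted multi-class cross-entropy named in the Methods can be illustrated with a short sketch. The abstract does not say how the class weights were chosen, so the inverse-frequency weighting below is an assumption, and the function names are hypothetical. The loss is averaged by the sum of the weights of the observed labels, which is one common convention.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(logits_batch, labels, class_weights):
    """Weighted mean of -log p(y): rarer classes can be given larger weights."""
    total_loss = 0.0
    total_weight = 0.0
    for logits, y in zip(logits_batch, labels):
        probs = softmax(logits)
        w = class_weights[y]
        total_loss += -w * math.log(probs[y])
        total_weight += w
    return total_loss / total_weight


def inverse_frequency_weights(labels, num_classes):
    """Assumed weighting scheme: n / (num_classes * count), 0 for unseen classes."""
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    n = len(labels)
    return [n / (num_classes * c) if c > 0 else 0.0 for c in counts]


# Toy example: class 0 (e.g. the "O" tag) dominates, class 1 is rare.
train_labels = [0, 0, 0, 1]
weights = inverse_frequency_weights(train_labels, num_classes=2)
loss = weighted_cross_entropy([[2.0, 0.5], [0.1, 1.5]], [0, 1], weights)
```

With these weights, a mistake on the rare class contributes more to the loss than a mistake on the frequent class, which is the usual motivation for weighting when entity categories are imbalanced.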


Key words: Chinese named entity recognition; Online medical consultation; Word information embedding; MacBERT; Weighted cross entropy
Publication date: 2022-07-29
CLC number: TP393, G250
Cite this article:
Ben Yanyan, Pang Xueqin. Research on Medical Named Entity Recognition with Word Information [J]. Data Analysis and Knowledge Discovery, DOI: 10.11925/infotech.2096-3467.2022-0547.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2022-0547 or https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y0/V/I/1
[1] Ben Yanyan, Pang Xueqin. Research on Medical Named Entity Recognition with Word Information [J]. Data Analysis and Knowledge Discovery, 2023, 7(5): 123-132.
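Copyright © 2015 Editorial Office of Data Analysis and Knowledge Discovery
Address: No. 33 Beisihuan Xilu, Zhongguancun, Haidian District, Beijing; Postal code: 100190
Tel/Fax: (010)82626611-6626, 82624938
E-mail: jishu@mail.las.ac.cn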