Data Analysis and Knowledge Discovery, 2022, Vol. 6, Issue (8): 110-121     https://doi.org/10.11925/infotech.2096-3467.2021.1167
Research Paper
Text Semantic Representation with Structure-Function and Entity Recognition: Case Study of Medical Records*
Hu Jiming1,2, Qian Wei1,2, Wen Peng3, Lv Xiaoguang4
1School of Information Management, Wuhan University, Wuhan 430072, China
2Information Retrieval and Knowledge Mining Laboratory, Wuhan University, Wuhan 430072, China
3School of Marxism, Wuhan University, Wuhan 430072, China
4Renmin Hospital of Wuhan University, Wuhan 430060, China
Abstract

[Objective] This paper incorporates the structure-function information of Chinese medical records into text representation, in order to enrich the semantics of record texts and to improve the accuracy of the representation and of downstream text mining. [Methods] Based on the structure-function features of Chinese medical records, we propose a new semantic representation strategy. A BiLSTM-CRF model performs structure-based named entity recognition, entity and structure information is introduced at the word-vector level, and a TextCNN model then extracts local context features to produce a semantically richer vector representation. [Results] In the named entity recognition experiment, structure-based medical entity recognition reached a precision of 93.20%, a recall of 95.19% and an F value of 94.19%. In the classification experiment used to validate the representation, the proposed method reached an accuracy of 92.12%. [Limitations] The method still needs to be validated on more types of text, and the structure recognition process needs to be refined. [Conclusions] Introducing structure-function information into medical record representation improves the accuracy of named entity recognition and enriches the semantic content of the resulting text representation.

Key words: Chinese Medical Records; Text Structure and Function; Named Entity Recognition; Text Semantic Representation; BiLSTM-CRF Model
Received: 2021-10-14      Published: 2022-09-23
Chinese Library Classification (ZTFLH):  TP391
Funding: *This work is one of the research outcomes of the General Program of the National Natural Science Foundation of China (71874125) and the Hubei Province Young Top-notch Talent Cultivation Program.
Corresponding author: Wen Peng, ORCID: 0000-0002-0278-7391     E-mail: wenpeng@whu.edu.cn
Cite this article:
Hu Jiming, Qian Wei, Wen Peng, Lv Xiaoguang. Text Semantic Representation with Structure-Function and Entity Recognition: Case Study of Medical Records. Data Analysis and Knowledge Discovery, 2022, 6(8): 110-121.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2021.1167      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2022/V6/I8/110
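| Scholar | Research perspective | Research approach |
|---|---|---|
| Lu et al.[33] | Text blocks | |
| This paper | Structure-function | |

Table 1  Comparison of research methods for text representation based on structural information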
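Fig.1  Framework for medical record text representation based on structure-function and entity recognition
Fig.2  Structure-function-based named entity recognition model (CSF-BiLSTM-CRF)
Fig.3  TextCNN text representation model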
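The framework introduces entity and structure information at the word-vector level. The page does not say how the three pieces are fused, so the sketch below assumes plain concatenation of a word vector with one-hot indicators for the entity type and the structural module. The names `ENTITY_TYPES`, `STRUCTURE_MODULES`, `one_hot` and `augment_token` are hypothetical, and the English module names are translations chosen for illustration.

```python
from typing import List

# Entity labels as listed in Table 3, plus "O" for tokens outside any entity.
ENTITY_TYPES = ["O", "SYMPTOM", "BODY", "TEST&EXAMINATION", "DISEASE",
                "SIGN", "TREATMENT", "DRUG"]
# The five structural modules of Table 2 (English names are translations).
STRUCTURE_MODULES = ["admission_condition", "admission_diagnosis",
                     "treatment_course", "discharge_condition",
                     "discharge_diagnosis"]


def one_hot(value: str, vocabulary: List[str]) -> List[float]:
    """Return a one-hot list marking the position of `value` in `vocabulary`."""
    return [1.0 if item == value else 0.0 for item in vocabulary]


def augment_token(word_vector: List[float], entity: str, module: str) -> List[float]:
    """Concatenate a word vector with entity-type and structure-module indicators.

    Concatenation is an assumption made for illustration only.
    """
    return (word_vector
            + one_hot(entity, ENTITY_TYPES)
            + one_hot(module, STRUCTURE_MODULES))


# Toy 4-dimensional embedding standing in for a real word vector.
toy_vector = [0.1, -0.2, 0.3, 0.0]
augmented = augment_token(toy_vector, "DISEASE", "admission_diagnosis")
print(len(augmented))  # 4 + 8 + 5 = 17
```

With this layout, the same word receives a different vector depending on where in the record it occurs and which entity type it belongs to, which is the intuition behind adding structure information.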
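| No. | Structural module | Functional content |
|---|---|---|
| 1 | Condition on admission | Chief complaint, past medical history, physical examination findings, main auxiliary examinations |
| 2 | Admission diagnosis | Disease |
| 3 | Course of treatment | Admission examinations, treatment methods, drugs, pathological examination |
| 4 | Condition at discharge | Chief complaint, physical examination findings |
| 5 | Discharge diagnosis | Disease |

Table 2  Text structure of Chinese medical records and its functional content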
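As a rough illustration of how a record could be divided into the five modules of Table 2, the sketch below splits a text on the module headings (入院情况, 入院诊断, 治疗经过, 出院情况, 出院诊断). It assumes each heading appears literally and is followed by a colon, which the page does not confirm. The function name `split_by_structure` and the toy record are invented for this example.

```python
import re
from typing import Dict

# Module headings from Table 2, in the order they appear in a record.
MODULE_HEADINGS = ["入院情况", "入院诊断", "治疗经过", "出院情况", "出院诊断"]


def split_by_structure(record: str) -> Dict[str, str]:
    """Split a record into {heading: text}, assuming a colon follows each heading."""
    # The capturing group makes re.split keep the matched headings in the result.
    pattern = re.compile("(" + "|".join(MODULE_HEADINGS) + ")[::]")
    pieces = pattern.split(record)
    # pieces = [preamble, heading1, text1, heading2, text2, ...]
    return {heading: text.strip()
            for heading, text in zip(pieces[1::2], pieces[2::2])}


# Invented toy record, not taken from the study's data.
toy_record = "入院情况:腹痛3天。入院诊断:胃癌。治疗经过:化疗。出院情况:腹痛缓解。出院诊断:胃癌。"
sections = split_by_structure(toy_record)
print(sections["入院诊断"])
```

Real records are unlikely to be this regular, so a rule of this kind would be only a starting point for the structure recognition step.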
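| Entity type | Definition | Examples | Label |
|---|---|---|---|
| Symptom | Symptoms described subjectively by the patient, found in the chief complaint | Abdominal pain (腹痛), vomiting (呕吐), abdominal distension (腹胀) | SYMPTOM |
| Body part | Anatomical parts or organs of the body | Abdomen (腹), stomach (胃), liver (肝) | BODY |
| Tests and examinations | Tests mainly refer to laboratory indicators for blood, stool and urine; examinations mainly refer to imaging, nuclear medicine and similar results | T (body temperature), gastroscopy (胃镜), CT | TEST&EXAMINATION |
| Disease | Medical names and abbreviations of diseases, found in the past disease history and in the admission and discharge diagnoses | Gastric cancer (胃癌), ulcer (溃疡), hypertension (高血压) | DISEASE |
| Sign | Objective abnormal findings on physical examination | Tenderness (压痛), rebound tenderness (反跳痛), respiration (呼吸) | SIGN |
| Treatment | Hemostasis, nutritional support and names of specific surgical procedures | Chemotherapy (化疗), surgery (手术), nutrition (营养) | TREATMENT |
| Drug | Drug names, found in the past disease history, drug allergy history and course of treatment | Oxaliplatin (奥沙利铂), tegafur-gimeracil-oteracil, S-1 (替吉奥), Weikangda (维康达) | DRUG |

Table 3  Entity types in Chinese medical records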
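Sequence labelers such as BiLSTM-CRF emit one tag per token, and the entities are then read off the tag sequence. The page does not state the tagging scheme, so the sketch below assumes character-level BIO tags built from the Table 3 labels (for example `B-DISEASE`, `I-DISEASE`, `O`). The function name `decode_bio` and the toy sentence are hypothetical.

```python
from typing import List, Optional, Tuple


def decode_bio(chars: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity_text, entity_type) pairs from a BIO-tagged character sequence."""
    entities: List[Tuple[str, str]] = []
    current_text = ""
    current_type: Optional[str] = None
    for char, tag in zip(chars, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any entity that is still open.
            if current_type is not None:
                entities.append((current_text, current_type))
            current_text, current_type = char, tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_text += char
        else:
            # "O", or an I- tag that does not continue the open entity, closes it.
            if current_type is not None:
                entities.append((current_text, current_type))
            current_text, current_type = "", None
    if current_type is not None:
        entities.append((current_text, current_type))
    return entities


# Invented toy sentence: "胃癌伴腹痛" (gastric cancer with abdominal pain).
chars = list("胃癌伴腹痛")
tags = ["B-DISEASE", "I-DISEASE", "O", "B-SYMPTOM", "I-SYMPTOM"]
print(decode_bio(chars, tags))
```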
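| Parameter | Value |
|---|---|
| Initial learning rate | 1.0 |
| Dropout | 0.5 |
| Hidden layer size | 300 |
| Number of iterations | 50 |
| Batch_size | 32 |

Table 4  Parameter settings of the CSF-BiLSTM-CRF model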
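| Model | P/% | R/% | F value/% |
|---|---|---|---|
| HMM | 86.02 | 73.52 | 79.28 |
| CRF | 82.17 | 85.88 | 83.99 |
| BiLSTM | 81.42 | 78.21 | 79.78 |
| BiLSTM-CRF | 92.39 | 92.51 | 92.48 |
| CSF-BiLSTM-CRF | 93.20 | 95.19 | 94.19 |

Table 5  Entity recognition results of different models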
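The page does not define the F value. Assuming it is the usual F1 score, the harmonic mean of precision (P) and recall (R), it can be recomputed from the other two columns of Table 5. The function name `f_value` is chosen for this example.

```python
def f_value(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given in percent."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall of the HMM row in Table 5.
print(round(f_value(86.02, 73.52), 2))  # 79.28
```

Recomputing from the rounded P and R values can differ from the reported F value by a few hundredths, for example in the BiLSTM-CRF row. This may reflect rounding or a different averaging scheme.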
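| Parameter | Value |
|---|---|
| Text dimension | 800 |
| Word dimension | 100 |
| Convolution kernel sizes | 3, 4, 5 |
| Dropout | 0.5 |
| Batch_size | 64 |
| Number of iterations | 50 |

Table 6  Parameter settings of the TextCNN model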
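To make the role of the kernel sizes concrete, the sketch below implements one convolution filter followed by max-over-time pooling in plain Python. It reads the "text dimension" of 800 as the sequence length, and it assumes no padding, stride 1 and a ReLU activation. None of these details is confirmed by the page, and the names `conv_max_pool` and `feature_map_length` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def conv_max_pool(text: Matrix, kernel: Matrix, bias: float = 0.0) -> float:
    """Slide one filter over consecutive word vectors, apply ReLU, keep the maximum."""
    width = len(kernel)  # number of consecutive words the filter covers
    best = 0.0  # ReLU outputs are never below zero
    for start in range(len(text) - width + 1):
        window = text[start:start + width]
        activation = bias + sum(
            w * x
            for kernel_row, word_vector in zip(kernel, window)
            for w, x in zip(kernel_row, word_vector)
        )
        best = max(best, activation)
    return best


def feature_map_length(text_length: int, kernel_size: int) -> int:
    """Number of positions a filter visits (no padding, stride 1)."""
    return text_length - kernel_size + 1


# With the Table 6 settings: 800 positions and kernel sizes 3, 4 and 5.
print([feature_map_length(800, k) for k in (3, 4, 5)])  # [798, 797, 796]

# Toy input: 4 words with 2-dimensional vectors, one filter of width 3.
toy_text = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
toy_kernel = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(conv_max_pool(toy_text, toy_kernel))  # 4.0
```

Each filter contributes a single pooled value, so the final representation has one entry per filter regardless of kernel size.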
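| No. | Text representation method | Acc/% | Class | P/% | R/% | F value/% |
|---|---|---|---|---|---|---|
| 1 | Doc2Vec + structure (Baseline) | 74.55 | Adenocarcinoma | 72.58 | 64.29 | 68.18 |
| | | | Gastric cancer | 75.73 | 82.11 | 78.79 |
| 2 | Text vector only | 55.76 | Adenocarcinoma | 58.57 | 48.24 | 52.90 |
| | | | Gastric cancer | 53.68 | 63.75 | 58.29 |
| 3 | Text vector + entity structure information | 56.36 | Adenocarcinoma | 58.90 | 50.59 | 54.43 |
| | | | Gastric cancer | 54.35 | 62.50 | 58.14 |
| 4 | Text vector only (TextCNN) | 87.27 | Adenocarcinoma | 84.81 | 88.16 | 86.45 |
| | | | Gastric cancer | 89.53 | 86.52 | 88.00 |
| 5 | Text vector + ordinary entities (TextCNN) | 90.30 | Adenocarcinoma | 90.54 | 88.16 | 89.33 |
| | | | Gastric cancer | 90.11 | 92.13 | 91.11 |
| 6 | Text vector + entity structure information (TextCNN) | 92.12 | Adenocarcinoma | 95.00 | 89.41 | 92.12 |
| | | | Gastric cancer | 89.41 | 95.00 | 92.12 |

Table 7  Classification results under different text representation methods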
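The accuracy and the per-class precision and recall in Table 7 are all derived from one confusion matrix over the two classes (腺癌, adenocarcinoma; 胃癌, gastric cancer). The sketch below shows the arithmetic. The counts are hypothetical: they are not reported on the page and were chosen only because they reproduce the percentages of row 6.

```python
from typing import Dict, Tuple

# Hypothetical counts keyed by (true class, predicted class); not from the source.
counts: Dict[Tuple[str, str], int] = {
    ("adenocarcinoma", "adenocarcinoma"): 76,
    ("adenocarcinoma", "gastric cancer"): 9,
    ("gastric cancer", "gastric cancer"): 76,
    ("gastric cancer", "adenocarcinoma"): 4,
}


def precision(label: str) -> float:
    """Share of records predicted as `label` that truly belong to it, in percent."""
    predicted = sum(n for (_, pred), n in counts.items() if pred == label)
    return 100 * counts[(label, label)] / predicted


def recall(label: str) -> float:
    """Share of records truly in `label` that were predicted as it, in percent."""
    actual = sum(n for (true, _), n in counts.items() if true == label)
    return 100 * counts[(label, label)] / actual


def accuracy() -> float:
    """Share of all records whose predicted class equals the true class, in percent."""
    correct = sum(n for (true, pred), n in counts.items() if true == pred)
    return 100 * correct / sum(counts.values())


for label in ("adenocarcinoma", "gastric cancer"):
    print(label, round(precision(label), 2), round(recall(label), 2))
print("accuracy", round(accuracy(), 2))
```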
References
[1] Du Lin, Cao Dong, Lin Shuyuan, et al. Extraction and Automatic Classification of TCM Medical Records Based on Attention Mechanism of BERT and Bi-LSTM[J]. Computer Science, 2020, 47(S2): 416-420.
[2] Chinese Information Processing Development Report (2016)[R]. Beijing: Chinese Information Processing Society of China, 2016.
[3] Zhou Zhaotao, Bu Dongbo, Cheng Xueqi. Towards Graph-Based Text Representation[J]. Journal of Chinese Information Processing, 2005, 19(2): 36-43.
[4] Wang Qian, Zeng Jin, Liu Jiawei, et al. Structure Function Recognition of Academic Text Paragraph Based on Deep Learning[J]. Information Science, 2020, 38(3): 64-69.
[5] Ribeiro S, Yao J T, Rezende D A. Discovering IMRaD Structure with Different Classifiers[C]// Proceedings of the 2018 IEEE International Conference on Big Knowledge. 2018: 200-204.
[6] General Administration of Quality Supervision, Inspection and Quarantine of the People's Republic of China, Standardization Administration of the People's Republic of China. Format Specification for Electronic Official Document of Party and Government Organs, Part 1: Official Document Structure: GB/T 33476.1-2016[S]. Beijing: Standards Press of China, 2016.
[7] Li Fanshu, Yao Dengfeng. Text Representation and Language Model in Natural Language Processing[C]// Proceedings of the 24th Annual Conference on New Network Technologies and Applications. 2020.
[8] Zhang Y, Jin R, Zhou Z H. Understanding Bag-of-Words Model: A Statistical Framework[J]. International Journal of Machine Learning and Cybernetics, 2010, 1(1-4): 43-52.
doi: 10.1007/s13042-010-0001-0
[9] Salton G, Wong A, Yang C S. A Vector Space Model for Automatic Indexing[J]. Communications of the ACM, 1975, 18(11): 613-620.
doi: 10.1145/361219.361220
[10] McMahon J, Smith F J. A Review of Statistical Language Processing Techniques[J]. Artificial Intelligence Review, 1998, 12: 347-391.
doi: 10.1023/A:1006517723917
[11] Bengio Y, Ducharme R, Vincent P, et al. A Neural Probabilistic Language Model[J]. The Journal of Machine Learning Research, 2003, 3:1137-1155.
[12] Mikolov T, Chen K, Corrado G, et al. Efficient Estimation of Word Representations in Vector Space[OL]. arXiv Preprint, arXiv: 1301.3781.
[13] Mikolov T, Sutskever I, Chen K, et al. Distributed Representations of Words and Phrases and Their Compositionality[C]// Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2. 2013: 3111-3119.
[14] Devlin J, Chang M, Lee K, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding[C]// Proceedings of the 17th Conference of the North American Chapter of the Association for Computational Linguistics. 2019: 4171-4186.
[15] Le Q V, Mikolov T. Distributed Representations of Sentences and Documents[C]// Proceedings of the 31st International Conference on International Conference on Machine Learning. 2014: 1188-1196.
[16] Zhou C T, Sun C L, Liu Z Y, et al. A C-LSTM Neural Network for Text Classification[OL]. arXiv Preprint, arXiv: 1511.08630.
[17] Shen D H, Min M R, Li Y T, et al. Learning Context-Sensitive Convolutional Filters for Text Processing[C]// Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018: 1839-1848.
[18] Wu Hanyu, Yan Jiang, Huang Shaobin, et al. CNN_BiLSTM_Attention Hybrid Model for Text Classification[J]. Computer Science, 2020, 47(S2): 23-27, 34.
[19] Pa T L, Kumari M, Singh T, et al. Semantic Representations in Text Data[J]. International Journal of Grid and Distributed Computing, 2018, 11(9): 65-80.
[20] Nie Weimin, Chen Yongzhou, Ma Jing. A Text Vector Representation Model Merging Multi-granularity Information[J]. Data Analysis and Knowledge Discovery, 2019, 3(9): 45-52.
[21] Yu Yan, Chen Lei, Jiang Jinde, et al. Measuring Patent Similarity with Word Embedding and Statistical Features[J]. Data Analysis and Knowledge Discovery, 2019, 3(9): 53-59.
[22] Liu W F, Liu P Y, Yang Y Z, et al. A Embedding Model for Text Classification[J]. Expert Systems, 2019, 36(6): e12460.
[23] Jiang Z L, Gao S, Chen L C. Study on Text Representation Method Based on Deep Learning and Topic Information[J]. Computing, 2020, 102(3): 623-642.
doi: 10.1007/s00607-019-00755-y
[24] Yang Chunxia, Wu Jiajun, Li Xinxu. Text Classification Model Based on Recurrent Neural Network with Entity Information[J]. Journal of Chinese Computer Systems, 2020, 41(12): 2516-2521.
[25] Huang Lu, Zhou Enguo, Li Daifeng. Text Representation Learning Model Based on Attention Mechanism with Task-Specific Information[J]. Data Analysis and Knowledge Discovery, 2020, 4(9): 111-122.
[26] Qin Chenglei, Zhang Chengzhi. Recognizing Structure Functions of Academic Articles with Hierarchical Attention Network[J]. Data Analysis and Knowledge Discovery, 2020, 4(11): 26-42.
[27] Lu Wei, Huang Yong, Cheng Qikai. The Structure Function of Academic Text and Its Classification[J]. Journal of the China Society for Scientific and Technical Information, 2014, 33(9): 979-985.
[28] Huang Yong, Lu Wei, Cheng Qikai. The Structure Function Recognition of Academic Text: Chapter Content Based Recognition[J]. Journal of the China Society for Scientific and Technical Information, 2016, 35(3): 293-300.
[29] Huang Yong, Lu Wei, Cheng Qikai, et al. The Structure Function Recognition of Academic Text: Paragraph-Based Recognition[J]. Journal of the China Society for Scientific and Technical Information, 2016, 35(5): 530-538.
[30] Hu Jiming, Qian Wei, Li Yuwei, et al. Topic Mining and Structured Parse of Policy Text Based on LDA2Vec[J]. Information Science, 2021, 39(10): 11-17.
[31] Laddha A, Joshi S, Shaikh S, et al. Joint Distributed Representation of Text and Structure of Semi-Structured Documents[C]// Proceedings of the 29th on Hypertext and Social Media. 2018: 25-32.
[32] Che Lei, Yang Xiaoping, Wang Liang, et al. Text Structure Oriented Hybrid Hierarchical Attention Networks for Topic Classification[J]. Journal of Chinese Information Processing, 2019, 33(5): 93-102, 112.
[33] Lu Y H, Zhai Y Y, Luo J Y, et al. MLPV: Text Representation of Scientific Papers Based on Structural Information and Doc2Vec[J]. American Journal of Information Science and Technology, 2019, 3(3): 62.
doi: 10.11648/j.ajist.20190303.12
[34] Sun Zhen, Wang Huilin. Overview on the Advance of the Research on Named Entity Recognition[J]. New Technology of Library and Information Service, 2010(6): 42-47.
[35] Goyal A, Gupta V, Kumar M. Recent Named Entity Recognition and Classification Techniques: A Systematic Review[J]. Computer Science Review, 2018, 29: 21-43.
doi: 10.1016/j.cosrev.2018.06.001
[36] Wang Ruojia, Wei Siyi, Wang Jimin. Applied Research on Named Entity Recognition in Chinese Electronic Medical Record Based on BiLSTM-CRF Model[J]. Journal of Library and Data, 2019, 1(2): 53-66.
[37] Lafferty J D, McCallum A, Pereira F C N. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data[C]// Proceedings of the 18th International Conference on Machine Learning. 2001: 282-289.
[38] Hochreiter S, Schmidhuber J. Long Short-Term Memory[J]. Neural Computation, 1997, 9(8): 1735-1780.
pmid: 9377276
[39] Yi Shixiang, Yin Hongpeng, Zheng Hengyi. Public Security Event Trigger Identification Based on Bidirectional LSTM[J]. Chinese Journal of Engineering, 2019, 41(9): 1201-1207.
[40] Yu Chuanming, Wang Manyi, Lin Hongjun, et al. A Comparative Study of Word Representation Models Based on Deep Learning[J]. Data Analysis and Knowledge Discovery, 2020, 4(8): 28-40.
[41] Zhang J, Chang D. Semi-Supervised Patient Similarity Clustering Algorithm Based on Electronic Medical Records[J]. IEEE Access, 2019, 7: 90705-90714.
doi: 10.1109/ACCESS.2019.2923333
[42] Kim Y. Convolutional Neural Networks for Sentence Classification[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. 2014: 1746-1751.
[43] Roberts A, Gaizauskas R, Hepple M, et al. Building a Semantically Annotated Corpus of Clinical Texts[J]. Journal of Biomedical Informatics, 2009, 42(5): 950-966.
doi: 10.1016/j.jbi.2008.12.013 pmid: 19535011
[44] China Conference on Knowledge Graph and Semantic Computing. CCKS 2020: Medical Entity and Event Extraction for Chinese Electronic Medical Records (1) Medical Named Entity Recognition[EB/OL]. [2021-04-10]. https://www.biendata.net/competition/ccks_2020_2_1.
[45] Wang Lulu, Aishan Wumaier, Tuergen Yibulayin, et al. Uyghur Named Entity Recognition Based on Deep Neural Network[J]. Journal of Chinese Information Processing, 2019, 33(3): 64-70.
[46] Chen Peixin. The Research of Semantic Vector Representations and Modeling Approaches for Text[D]. Hefei: University of Science and Technology of China, 2018.
[47] Jieba[EB/OL]. [2020-08-25]. https://pypi.org/project/jieba/.
[48] Řehůřek R. Word2Vec Embeddings[EB/OL]. [2020-08-25]. https://radimrehurek.com/gensim/models/word2vec.html.
[49] Lv Lucheng, Han Tao, Zhou Jian, et al. Research on the Method of Chinese Patent Automatic Classification Based on Deep Learning[J]. Library and Information Service, 2020, 64(10): 75-85.
doi: 10.13266/j.issn.0252-3116.2020.10.009
[50] Hu Jiming, Zheng Xiang, Cheng Qikai, et al. Public Opinion Extraction and Focus Presentation in Government Microblog Based on BiLSTM-CRF[J]. Information Studies: Theory & Application, 2021, 44(1): 174-179, 137.
[51] Kowsari K, Meimandi K J, Heidarysafa M, et al. Text Classification Algorithms: A Survey[J]. Information, 2019, 10(4): 150-218.
doi: 10.3390/info10040150