Data Analysis and Knowledge Discovery  2019, Vol. 3 Issue (12): 61-69    DOI: 10.11925/infotech.2096-3467.2019.0684
Identifying Entities of Online Questions from Cancer Patients Based on Transfer Learning
Meishan Chen, Chenxi Xia
School of Medicine and Health Management, Huazhong University of Science and Technology, Wuhan 430073, China
Abstract  

[Objective] This study uses an annotated corpus together with a pre-trained model to identify entities from a corpus with only limited annotations. [Methods] First, we collected online questions posted by patients with lung or liver cancer. We then developed a KNN-BERT-BiLSTM-CRF framework that combines instance transfer and parameter transfer to recognize named entities with a small amount of labeled data. [Results] The best performance was achieved when the k value of the instance transfer was set to 3: the F value reached 96.10%, 1.98 percentage points higher than that of the model without instance transfer. [Limitations] The proposed method still needs to be examined on entities of other diseases. [Conclusions] The cross-domain transfer learning method can improve the performance of entity identification.
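A minimal sketch of the instance-transfer idea may help: sentences from both domains are embedded as vectors, and for each target-domain (liver cancer) sentence the k nearest source-domain (lung cancer) sentences are copied into the training data before the BERT-BiLSTM-CRF tagger is trained. The parameter table later on this page lists Doc2Vec settings, so the sketch assumes Doc2Vec sentence vectors; the library choices (gensim, scikit-learn) and the function name are illustrative assumptions, not the authors' code.

```python
# Illustrative sketch of KNN-based instance transfer (assumed implementation).
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from sklearn.neighbors import NearestNeighbors

def select_transfer_instances(source_sents, target_sents, k=3):
    """Return indices of source-domain sentences to add to the target training set."""
    docs = [TaggedDocument(words, [i])
            for i, words in enumerate(source_sents + target_sents)]
    # Doc2Vec settings taken from the parameter table: DM, window 5, dimension 100,
    # learning rate decayed from 0.025 to 0.001, minimum word frequency 5.
    d2v = Doc2Vec(docs, dm=1, vector_size=100, window=5, min_count=5,
                  alpha=0.025, min_alpha=0.001)
    src_vecs = [d2v.infer_vector(s) for s in source_sents]
    tgt_vecs = [d2v.infer_vector(s) for s in target_sents]
    # For every target sentence, pick its k nearest source sentences (k=3 performed best).
    _, idx = NearestNeighbors(n_neighbors=k).fit(src_vecs).kneighbors(tgt_vecs)
    return sorted({int(i) for row in idx for i in row})

# Usage with tokenised sentences:
# extra = select_transfer_instances(lung_cancer_sents, liver_cancer_sents, k=3)
```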

Key words: BERT; BiLSTM; Named Entity Recognition; Transfer Learning
Received: 14 June 2019      Published: 25 December 2019
Chinese Library Classification (ZTFLH): TP391
Corresponding Authors: Chenxi Xia     E-mail: xcxxdy@hust.edu.cn

Cite this article:

Meishan Chen, Chenxi Xia. Identifying Entities of Online Questions from Cancer Patients Based on Transfer Learning. Data Analysis and Knowledge Discovery, 2019, 3(12): 61-69.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2019.0684     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2019/V3/I12/61

Entity types and annotation counts in the target domain (liver cancer) and the source domain (lung cancer):

Entity type | Definition | Examples | Target-domain annotations | Source-domain annotations
Body part | Organs, body parts, and tissues | 头部 (head), 颈部 (neck) | 1,359 | 6,876
Cellular entity | Cells, molecules, and cell-level anatomical entities | 血红蛋白 (hemoglobin), 巨细胞 (giant cell) | 130 | 398
Diagnostic procedure | Tests and biopsy procedures used for diagnosis | 活检 (biopsy), CT, B超 (B-mode ultrasound), 铁含量 (iron level) | 156 | 1,102
Drug | Substances used for therapeutic purposes | 华蟾素胶囊 (Huachansu capsule), 吗啡 (morphine) | 259 | 1,805
Measurement | A core attribute of a named entity, such as a drug dose | 10 mg, 2% | 78 | 257
Individual | Individuals (gender, age, etc.) and population groups | 父亲 (father), 女性 (female), 16岁 (16 years old) | 1,188 | 2,506
Problem | Diseases, symptoms, abnormalities, and complications | 疼痛 (pain), 破裂 (rupture), 肺癌 (lung cancer), 肿瘤 (tumor) | 4,975 | 25,427
Treatment procedure | Procedures, medicines, or devices used for treatment, including implants and unspecified preventive or surgical interventions | 肾镜切除 (nephroscopic resection), 植入 (implantation), 化疗 (chemotherapy) | 1,003 | 4,169
Cancer stage | A means of describing how far a cancer has developed and spread | 早期 (early stage), 前期 (pre-stage), 晚期 (advanced stage) | 1,142 | 4,304
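To make the annotation scheme concrete, here is a constructed example question labelled character by character; the paper does not show its exact tagging format, so the BIO prefixes and English tag names below are assumptions for illustration only.

```python
# Constructed example: "肺癌晚期能化疗吗" ("Can advanced lung cancer be treated with
# chemotherapy?"), with assumed character-level BIO labels drawn from the entity types above.
chars  = ["肺", "癌", "晚", "期", "能", "化", "疗", "吗"]
labels = ["B-Problem", "I-Problem",                        # 肺癌 -> Problem
          "B-CancerStage", "I-CancerStage",                # 晚期 -> Cancer stage
          "O",                                             # 能   -> outside any entity
          "B-TreatmentProcedure", "I-TreatmentProcedure",  # 化疗 -> Treatment procedure
          "O"]                                             # 吗   -> outside any entity
```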
Dataset | Cancer type | Size (sentences) | Annotation status
Source-domain dataset | Lung cancer | 11,822 | Annotated
Target-domain dataset | Liver cancer | 2,000 | Annotated
Hyperparameter settings of each component:

Network layer | Parameter | Value
Doc2Vec | Algorithm | DM
Doc2Vec | Window size | 5
Doc2Vec | Minimum word frequency | 5
Doc2Vec | Learning rate | Decayed from 0.025 to 0.001
Doc2Vec | Vector dimension | 100
BERT | Batch size | 32
BERT | Learning rate | 2e-5
BERT | Maximum sequence length | 128
BERT | Epochs | 10
BERT | Optimizer | Adam
BiLSTM | L2 regularization | 0.001
BiLSTM | Epochs | 10
BiLSTM | Dropout | 0.5
Word2Vec | Algorithm | Skip-gram
Word2Vec | Window size | 5
Word2Vec | Learning rate | Decayed from 0.025 to 0.001
Word2Vec | Minimum word frequency | 3
Word2Vec | Vector dimension | 100
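Taken together, the BERT, BiLSTM, and CRF settings above describe a fairly standard sequence-tagging architecture. The PyTorch sketch below shows one plausible way to assemble it, assuming the Hugging Face transformers and pytorch-crf packages; the LSTM hidden size and the Chinese BERT checkpoint are illustrative assumptions, not details reported in the paper.

```python
# Plausible assembly of a BERT-BiLSTM-CRF tagger (assumed, not the authors' code).
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF

class BertBiLSTMCRF(nn.Module):
    def __init__(self, num_tags, lstm_hidden=128, dropout=0.5):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-chinese")  # parameter transfer
        self.lstm = nn.LSTM(self.bert.config.hidden_size, lstm_hidden,
                            batch_first=True, bidirectional=True)
        self.dropout = nn.Dropout(dropout)                  # dropout 0.5, as in the table
        self.fc = nn.Linear(2 * lstm_hidden, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, tags=None):
        hidden = self.bert(input_ids, attention_mask=attention_mask)[0]
        feats = self.fc(self.dropout(self.lstm(hidden)[0]))
        mask = attention_mask.bool()
        if tags is not None:
            return -self.crf(feats, tags, mask=mask)        # negative log-likelihood loss
        return self.crf.decode(feats, mask=mask)            # best tag sequence per sentence
```

Per the table, training would then use Adam with a learning rate of 2e-5, batch size 32, sequences truncated to 128 tokens, 10 epochs, and an L2 penalty of 0.001 on the BiLSTM parameters.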
Results of the models without instance transfer:

Model | P (%) | R (%) | F (%)
Word2Vec-BiLSTM-CRF | 85.98 | 86.55 | 86.26
BERT-BiLSTM-CRF | 92.91 | 95.36 | 94.12
Results (%) of instance transfer for different values of k:

Model | Metric | k=0 | k=1 | k=2 | k=3 | k=4 | k=5 | k=6
KNN-BERT-BiLSTM-CRF | P | 92.91 | 93.54 | 94.89 | 95.74 | 95.40 | 94.73 | 94.60
KNN-BERT-BiLSTM-CRF | R | 95.36 | 95.74 | 96.51 | 96.75 | 96.24 | 96.30 | 95.68
KNN-BERT-BiLSTM-CRF | F | 94.12 | 94.63 | 95.69 | 96.10 | 95.82 | 95.51 | 95.14
KNN-Word2Vec-BiLSTM-CRF | P | 85.98 | 88.73 | 90.45 | 91.48 | 91.65 | 91.03 | 90.77
KNN-Word2Vec-BiLSTM-CRF | R | 86.55 | 89.57 | 91.30 | 92.48 | 92.62 | 92.05 | 91.90
KNN-Word2Vec-BiLSTM-CRF | F | 86.26 | 89.15 | 90.87 | 91.98 | 92.13 | 91.54 | 91.33
Comparison of the models at their best k values:

Model | P (%) | R (%) | F (%)
Word2Vec-BiLSTM-CRF | 85.98 | 86.55 | 86.26
KNN-Word2Vec-BiLSTM-CRF (k=4) | 91.65 | 92.62 | 92.13
BERT-BiLSTM-CRF | 92.91 | 95.36 | 94.12
KNN-BERT-BiLSTM-CRF (k=3) | 95.47 | 96.75 | 96.10
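For reference, the P, R, and F values reported above follow the standard entity-level definitions (precision, recall, and their harmonic mean). A minimal helper, not taken from the paper, that computes them from entity counts:

```python
def prf(num_correct, num_predicted, num_gold):
    """Entity-level precision, recall, and F value, returned as percentages."""
    p = num_correct / num_predicted if num_predicted else 0.0
    r = num_correct / num_gold if num_gold else 0.0
    f = 2 * p * r / (p + r) if (p + r) else 0.0
    return 100 * p, 100 * r, 100 * f
```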