Recognition Model of Patient Reviews Based on Mixed Sampling and Transfer Learning

doi:10.11925/infotech.2096-3467.2019.0549

Data Analysis and Knowledge Discovery, 2020, Vol. 4, Issue 2/3: 39-47


Xiang Fei, Xie Yaotan

School of Medicine and Health Management, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China


Abstract

[Objective] This study proposes a new convolutional neural network model for handling imbalanced data in online patient reviews. [Methods] First, we built the model using mixed sampling and transfer learning. Then we used an end-to-end deep learning architecture based on Word2Vec and a convolutional neural network for distributed representation, feature extraction, and topic classification of online patient reviews. [Results] Compared with traditional machine learning algorithms, represented by SVM, and with a standalone convolutional neural network, the proposed model significantly improved accuracy, recall, and F1 values. [Limitations] The imbalanced data in this study came only from online patient reviews. [Conclusions] The proposed model can effectively improve recognition results on imbalanced data.
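The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of one common reading of "mixed sampling": randomly under-sampling the larger class and randomly over-sampling the smaller class until both reach the same size. The function name mixed_sample, the target_per_class parameter, and the toy data are assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import Counter


def mixed_sample(texts, labels, target_per_class, seed=0):
    """Resample so every class ends up with exactly `target_per_class` items.

    Classes larger than the target are randomly under-sampled (without
    replacement); smaller classes are randomly over-sampled (with replacement).
    This is a generic illustration, not the authors' exact procedure.
    """
    rng = random.Random(seed)

    # Group the texts by their class label.
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        if len(items) >= target_per_class:
            # Under-sample: keep a random subset of the majority class.
            chosen = rng.sample(items, target_per_class)
        else:
            # Over-sample: keep all originals and add random duplicates.
            extra = rng.choices(items, k=target_per_class - len(items))
            chosen = items + extra
        out_texts.extend(chosen)
        out_labels.extend([label] * target_per_class)
    return out_texts, out_labels


# Hypothetical toy data: 8 reviews without a given topic label, 2 with it.
texts = [f"review {i}" for i in range(10)]
labels = [0] * 8 + [1] * 2
balanced_texts, balanced_labels = mixed_sample(texts, labels, target_per_class=5)
print(Counter(balanced_labels))
```

Duplicating minority samples is the simplest form of over-sampling; synthetic methods such as SMOTE, which the paper's reference list includes, generate new minority points instead and would be applied to vector representations rather than raw text.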

Key words: Mixed Sampling; Transfer Learning; Imbalanced Data; Convolutional Neural Network; Patient Reviews Recognition

Received: 24 May 2019 Published: 26 April 2020

ZTFLH: TP393

Corresponding Authors: Fei Xiang E-mail: xiangfei@hust.edu.cn


Cite this article:

Xiang Fei, Xie Yaotan. Recognition Model of Patient Reviews Based on Mixed Sampling and Transfer Learning. Data Analysis and Knowledge Discovery, 2020, 4(2/3): 39-47.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2019.0549 OR https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2020/V4/I2/3/39

Recognition Framework of Multi-label Data Based on Mixed Sampling and Transfer Learning

Mixed Sampling Process

Patient Reviews Recognition Model Based on End-to-End CNN

Skip-Gram Model

Description of Experimental Data Set

Co-occurrence of Topic Labels

Parameters of Word2Vec Training

Parameters of CNN

Accuracy of Classification Models for Different Topic Datasets

Recall of Classification Models for Different Topic Datasets

F1 Value of Classification Models for Different Topic Datasets


