Automatic Triage of Online Doctor Services Based on Machine Learning
Ruojia Wang1,2, Lu Zhang1, Jimin Wang1
1 Department of Information Management, Peking University, Beijing 100871, China
2 Institute of Ocean Research, Peking University, Beijing 100871, China
[Objective] This paper compares the performance of several machine learning algorithms for automatic triage and analyzes misclassified samples to improve their effectiveness. [Methods] First, we retrieved 33,073 real patient questions from the online consultation website “Chunyu Doctor”. Then, we compared the accuracy of two text vectorization methods and six classification models. Finally, we analyzed the misclassified samples and extracted new features to improve model performance. [Results] The best automatic triage model used TF-IDF for text vectorization and a support vector machine for classification. After adding age and gender features, its classification accuracy reached 76.3%. Accuracy was lowest for the surgery department because of how the platform defines its department categories. [Limitations] We assumed that the department each patient selected was correct. [Conclusions] Machine learning techniques could improve the automatic triage services of online health consultation platforms.
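The best-performing configuration reported in the abstract (TF-IDF text features, a support vector machine, and added age and gender features) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the six toy English questions, whitespace tokenization (Chinese questions would need a word segmenter), the scaling of age by 100, the binary gender flag, and the small hand-written one-vs-rest linear SVM trained by subgradient descent on the hinge loss, which stands in for whatever SVM implementation the authors used.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: whitespace tokens; Chinese questions would need a word segmenter.
    return text.lower().split()

def fit_idf(texts):
    """Smoothed inverse document frequency for each term in the training texts."""
    n = len(texts)
    df = Counter(term for text in texts for term in set(tokenize(text)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}

def featurize(text, age, gender, idf, vocab):
    """L2-normalized TF-IDF vector followed by scaled age and a binary gender flag."""
    tf = Counter(tokenize(text))
    vec = [tf[term] * idf[term] for term in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    # Assumption: age / 100 and F -> 1.0 are illustrative encodings only.
    return [v / norm for v in vec] + [age / 100.0, 1.0 if gender == "F" else 0.0]

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_binary_svm(X, y, epochs=200, lr=0.1, reg=0.01):
    """Linear SVM with labels +1/-1, fitted by subgradient descent on the hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * (dot(w, xi) + b) < 1
            w = [wj * (1 - lr * reg) for wj in w]  # L2 shrinkage on the weights
            if violated:  # the hinge term contributes only inside the margin
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def train_one_vs_rest(X, labels):
    """One binary SVM per department."""
    return {dept: train_binary_svm(X, [1 if lab == dept else -1 for lab in labels])
            for dept in sorted(set(labels))}

def predict(models, x):
    """Department whose SVM gives the highest decision score."""
    return max(models, key=lambda dept: dot(models[dept][0], x) + models[dept][1])

# Hypothetical toy records: (question, age, gender, department chosen by the patient).
records = [
    ("my child has a fever and cough", 4, "M", "pediatrics"),
    ("baby keeps crying with a rash", 1, "F", "pediatrics"),
    ("irregular period and abdominal pain", 28, "F", "gynecology"),
    ("pregnancy test positive what next", 31, "F", "gynecology"),
    ("chest pain and high blood pressure", 62, "M", "internal medicine"),
    ("dizzy with high blood sugar", 58, "F", "internal medicine"),
]

texts = [r[0] for r in records]
labels = [r[3] for r in records]
idf = fit_idf(texts)
vocab = sorted(idf)
X = [featurize(text, age, gender, idf, vocab) for text, age, gender, _ in records]
models = train_one_vs_rest(X, labels)

# Route a new, unseen question to a department.
query = featurize("child with fever and rash", 3, "M", idf, vocab)
print(predict(models, query))
```

Appending age and gender to the text vector is only one way to combine the two kinds of features; with a library implementation, separate scaling of the numeric columns would usually be worth trying.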
Ruojia Wang, Lu Zhang, Jimin Wang. Automatic Triage of Online Doctor Services Based on Machine Learning[J]. Data Analysis and Knowledge Discovery, 2019, 3(9): 88-97.