Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (4): 34-43    DOI: 10.11925/infotech.2096-3467.2019.0815
Recommending Online Medical Experts with Labeled-LDA Model
Pan Youneng, Ni Xiuli
School of Public Affairs, Zhejiang University, Hangzhou 310058, China

[Objective] This paper modifies the existing recommendation model for online medical experts so that health-related inquiries are routed to suitable doctors more effectively. [Methods] First, we identified the latent topics of online health questions with the Labeled-LDA model. Then, we defined the doctors’ specialties and matched them with the questions. Finally, we evaluated the new model with data from [Results] The precision, recall and answer adoption rate of the proposed method were 40.4%, 44.0% and 22.9%, respectively, compared with the website’s existing figures of 20.4%, 29.7% and 6.8%. [Limitations] Our method did not consider factors such as doctors’ response time and their résumés, and it could not identify the expertise of newly joined doctors who had answered few questions. [Conclusions] The proposed model can effectively recommend physicians to patients who ask questions online.

Key words: Labeled-LDA; Expert Recommendation; Topic Model; Online Healthcare
Received: 12 July 2019      Published: 01 June 2020
ZTFLH:  G350  
Corresponding Author: Pan Youneng

Cite this article:

Pan Youneng, Ni Xiuli. Recommending Online Medical Experts with Labeled-LDA Model. Data Analysis and Knowledge Discovery, 2020, 4(4): 34-43.


Framework of the Recommendation Model for Online Medical Experts
Sample of the Topic Distribution of Physicians
Part of the Health Topic Distribution
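
The matching step itself is not reproduced on this page. As a minimal sketch, assuming each question and each physician is already described by a probability distribution over the same health topics (for example, one produced by a Labeled-LDA model), physicians could be ranked by how similar their distribution is to the question's. The cosine measure, the function names, the topic labels and every number below are illustrative assumptions, not the authors' method or data.

```python
import math

def cosine(p, q):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0

def rank_physicians(question_topics, physician_topics, top_k=3):
    """Return the ids of the top_k physicians, most similar first."""
    scored = [(cosine(question_topics, dist), pid)
              for pid, dist in physician_topics.items()]
    # Sort by descending similarity; break ties by id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [pid for _, pid in scored[:top_k]]

# Hypothetical topic order: [hypertension, diabetes, gastritis]
physicians = {
    "dr_a": [0.80, 0.15, 0.05],
    "dr_b": [0.10, 0.75, 0.15],
    "dr_c": [0.05, 0.15, 0.80],
}
question = [0.70, 0.20, 0.10]

ranking = rank_physicians(question, physicians, top_k=2)
print(ranking)  # ['dr_a', 'dr_b']
```

Other measures over probability distributions, such as Jensen-Shannon divergence, would fit the same interface; cosine is used here only because it is simple to verify by hand.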
Group               Precision   Recall   MRR
Group 1             42%         41%      0.325
Group 2             37%         38%      0.301
Group 3             43%         42%      0.283
Group 4             35%         34%      0.247
Group 5             46%         44%      0.362
Group 6             43%         41%      0.344
Group average       41%         40%      0.312
Overall test set    40%         44%      0.314
Results of the Recommendation for Online Medical Experts
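
The page reports precision, recall and MRR without restating how they are computed. The sketch below uses the standard definitions: precision and recall over one recommendation list, and mean reciprocal rank (MRR) as the average of 1/rank of the first relevant item per question, counting 0 when none is found. Whether the paper applies exactly these definitions is an assumption, and the doctor ids and relevance sets are hypothetical toy data.

```python
def precision_recall(recommended, relevant):
    """Precision and recall of one recommendation list against a relevant set."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def mean_reciprocal_rank(rankings, relevant_sets):
    """Average of 1/rank of the first relevant item; 0 for a list with no hit."""
    total = 0.0
    for ranking, relevant in zip(rankings, relevant_sets):
        for position, item in enumerate(ranking, start=1):
            if item in relevant:
                total += 1.0 / position
                break
    return total / len(rankings) if rankings else 0.0

# Hypothetical recommendations for three questions.
rankings = [["d1", "d2", "d3"], ["d4", "d5", "d6"], ["d7", "d8", "d9"]]
relevant = [{"d2"}, {"d4", "d6"}, {"d0"}]

mrr = mean_reciprocal_rank(rankings, relevant)      # (1/2 + 1 + 0) / 3
p, r = precision_recall(rankings[1], relevant[1])   # 2 of 3 hit; both relevant found
print(round(mrr, 3), round(p, 3), r)  # 0.5 0.667 1.0
```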
Group                            Group 1   Group 2   Group 3   Group 4   Group 5   Group 6   Group average   Overall
Number of best recommendations   273       240       280       228       300       279       267             1,588
Results of the Best Recommendations
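Metric                                               Website's existing figures   Expert recommendation method
Total internal-medicine health questions             407,189                      6,000
Adopted answers from internal-medicine doctors       27,726                       1,588
Total answers from all doctors                       1,371,877                    14,022
Total answers from internal-medicine doctors         407,949                      6,940
Precision                                            20.4%                        40.4%
Recall                                               29.7%                        44.0%
Answer adoption rate                                 6.8%                         22.9%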
Comparison of the Recommendation Methods
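
The table does not state how the answer adoption rate is derived, but the reported values are consistent with dividing the number of adopted answers from internal-medicine doctors by the total number of answers from internal-medicine doctors. The short check below reproduces both percentages from the counts in the comparison table above; the formula is an inference from those numbers, and the function name is illustrative.

```python
def adoption_rate(adopted_answers, total_answers):
    """Share of answers that were adopted, as a percentage."""
    return 100.0 * adopted_answers / total_answers

# Counts taken from the comparison table.
existing = adoption_rate(27_726, 407_949)   # website's existing figures
proposed = adoption_rate(1_588, 6_940)      # expert recommendation method

print(f"{existing:.1f}% vs {proposed:.1f}%")  # 6.8% vs 22.9%
```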