Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (6): 118-128     https://doi.org/10.11925/infotech.2096-3467.2019.1156
Research Paper
Collaborative Tagging for Doctors in Online Medical Community*
Ye Jiaxin (1), Xiong Huixiang (1), Tong Zhaoli (1, 2), Meng Qiuqing (1)
(1) School of Information Management, Central China Normal University, Wuhan 430079, China
(2) Hubei Communication Technical College, Wuhan 430079, China
Abstract

[Objective] This paper mines text features to find doctors similar to a given doctor, then tags that doctor based on the characteristics of the similar doctors, enriching the description of doctors' characteristics. [Methods] We used the Word2Vec word vector model to build vector representations of each doctor's consulting texts, article titles and service scopes, and identified similar doctors from these representations. We then analyzed the characteristics of the similar doctors and collaboratively tagged the target doctor. [Results] The accuracy of tagging based on consulting texts, article titles and service scopes was 0.667, 0.252 and 0.708, respectively. The accuracy of tagging based on mixed texts was 1.000. [Limitations] The semantic features of the texts were not mined in sufficient depth, and the accuracy and recall of tagging based on a single type of text need to be improved. [Conclusions] Tags generated from consulting texts are closely related to the immediate needs of patients, while tags generated from article titles are strongly related to doctors' interests. Tags obtained from service scopes and from mixed texts are more accurate. Collaborative tagging of doctors based on text mining can, to some extent, recommend suitable tags.

Key words: Word2Vec; Collaborative Tagging; Physician Tagging; Tag Recommendations
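The abstract describes representing each doctor's texts as vectors and then finding similar doctors. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a doctor's text vector is the average of its word vectors and that similarity is cosine similarity; neither choice is stated on this page. The names `WORD_VECTORS`, `text_vector`, `cosine` and `most_similar`, the three-dimensional vectors and the doctor IDs are all hypothetical; a trained Word2Vec model would supply the real word vectors.

```python
import math

# Hypothetical toy word vectors (assumption); a trained Word2Vec model
# would provide real, higher-dimensional vectors.
WORD_VECTORS = {
    "diabetes": [0.9, 0.1, 0.0],
    "insulin": [0.8, 0.2, 0.1],
    "thyroid": [0.7, 0.3, 0.0],
    "asthma": [0.1, 0.9, 0.2],
    "cough": [0.0, 0.8, 0.3],
}


def text_vector(words):
    """Average the vectors of the known words in a tokenized text (assumed pooling)."""
    known = [WORD_VECTORS[w] for w in words if w in WORD_VECTORS]
    if not known:
        return None
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(target, doctors, k=2):
    """Return the IDs of the k doctors whose text vectors are closest to the target's."""
    target_vec = text_vector(doctors[target])
    if target_vec is None:
        return []
    scores = []
    for doc_id, words in doctors.items():
        if doc_id == target:
            continue
        vec = text_vector(words)
        if vec is not None:
            scores.append((cosine(target_vec, vec), doc_id))
    scores.sort(reverse=True)
    return [doc_id for _, doc_id in scores[:k]]


# Hypothetical doctors, each described by a tokenized text.
doctors = {
    "A": ["diabetes", "insulin"],
    "B": ["diabetes", "thyroid"],
    "C": ["asthma", "cough"],
}
similar_to_a = most_similar("A", doctors, k=1)
print(similar_to_a)
```

In this toy data, doctor B shares endocrine vocabulary with doctor A, so B ranks above C. The same routine could be run separately on consulting texts, article titles and service scopes to obtain one set of similar doctors per text type.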
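Received: 2019-10-22      Published: 2020-07-07
Chinese Library Classification (ZTFLH):  G206
Funding: *This work is one of the research outcomes of the major humanities and social sciences project "Research on Mining and Recommendation of Online Health Information Based on the Semantic Web" (CCNU19Z02004), supported by the Fundamental Research Funds for the Central Universities at Central China Normal University, and of the Central China Normal University Excellent Doctoral Dissertation Cultivation Program (2019YBZZ096).
Corresponding author: Xiong Huixiang     E-mail: hxxiong@mail.ccnu.edu.cn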
Cite this article:
Ye Jiaxin, Xiong Huixiang, Tong Zhaoli, Meng Qiuqing. Collaborative Tagging for Doctors in Online Medical Community[J]. Data Analysis and Knowledge Discovery, 2020, 4(6): 118-128.
Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2019.1156      or      http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2020/V4/I6/118
Table 1  Sample text data for the 800 doctors
Table 2  Sample training texts for the 596 doctors
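Table 3  Patient votes for the 596 doctors
Item | Lung cancer | Pulmonary nodules | Lung disease | Pneumonia | Diabetes | Infertility | Respiratory failure
Rank | 1 | 2 | 3 | 4 | 5 | 6 | 204
Frequency | 119 | 112 | 108 | 92 | 89 | 84 | 1
Probability | 0.200 | 0.188 | 0.181 | 0.154 | 0.149 | 0.141 | 0.002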
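The probabilities in Table 3 are consistent with dividing each disease's vote frequency by the 596 doctors in the set. A short check of that arithmetic, with the English disease names used here only as illustrative dictionary keys:

```python
TOTAL_DOCTORS = 596  # number of doctors named in the Table 3 caption

# Vote frequencies copied from Table 3.
vote_counts = {
    "lung cancer": 119,
    "pulmonary nodules": 112,
    "lung disease": 108,
    "pneumonia": 92,
    "diabetes": 89,
    "infertility": 84,
    "respiratory failure": 1,
}

# Baseline probability of each disease: frequency divided by the number of doctors.
baseline_probability = {
    disease: round(count / TOTAL_DOCTORS, 3)
    for disease, count in vote_counts.items()
}

for disease, prob in baseline_probability.items():
    print(f"{disease}: {prob:.3f}")
```

These baseline values reappear in the "original probability" column of Table 7.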
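Fig.1  Line chart of vote frequencies for the 204 diseases (schematic)
Fig.2  Framework of the collaborative tagging model
Table 4  Word vectors of selected terms
Table 5  Test doctor 10 and the similar doctors
Table 6  Number of doctors meeting the tagging criteria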
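Table 7  Votes of the doctors similar to test doctor 10 (based on service scopes)
Vote | Frequency | Probability | Original probability | Original probability × 2
Diabetes | 4 | 0.500 | 0.149 | 0.298
Hypertension | 3 | 0.375 | 0.134 | 0.268
Hyperthyroidism | 3 | 0.375 | 0.129 | 0.258
Hypothyroidism | 3 | 0.375 | 0.112 | 0.224
Endocrine disease | 2 | 0.250 | 0.020 | 0.040
Infertility | 1 | 0.125 | 0.141 | 0.282
Hepatitis B | 1 | 0.125 | 0.134 | 0.268
In vitro fertilization | 1 | 0.125 | 0.119 | 0.238
Infection | 1 | 0.125 | 0.015 | 0.030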
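The "original probability × 2" column of Table 7 suggests, although this page does not state it, that a disease becomes a candidate tag when its share among the similar doctors exceeds twice its share among all doctors. The sketch below applies that assumed rule to the Table 7 figures. The count of 8 similar doctors is inferred from the table (4 votes correspond to 0.500), and `select_tags` and its `factor` parameter are hypothetical names. The paper's actual criteria may include further conditions, so the output is not necessarily the final tag set for test doctor 10.

```python
N_SIMILAR = 8  # inferred from Table 7 (4 votes -> 0.500); an assumption

# Votes among the similar doctors and baseline probabilities, copied from Table 7.
votes = {
    "diabetes": 4,
    "hypertension": 3,
    "hyperthyroidism": 3,
    "hypothyroidism": 3,
    "endocrine disease": 2,
    "infertility": 1,
    "hepatitis B": 1,
    "in vitro fertilization": 1,
    "infection": 1,
}
baseline = {
    "diabetes": 0.149,
    "hypertension": 0.134,
    "hyperthyroidism": 0.129,
    "hypothyroidism": 0.112,
    "endocrine disease": 0.020,
    "infertility": 0.141,
    "hepatitis B": 0.134,
    "in vitro fertilization": 0.119,
    "infection": 0.015,
}


def select_tags(votes, baseline, n_similar, factor=2.0):
    """Keep diseases whose local share exceeds factor x the baseline share (assumed rule)."""
    selected = []
    for disease, count in votes.items():
        local_share = count / n_similar
        if local_share > factor * baseline[disease]:
            selected.append(disease)
    return selected


candidate_tags = select_tags(votes, baseline, N_SIMILAR)
print(candidate_tags)
```

Under this assumed rule, common diseases such as infertility and hepatitis B are filtered out because a single vote among the similar doctors falls below twice their baseline, while a rare disease such as infection passes on a single vote.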
Table 8  Doctor tagging results based on different texts
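Table 9  Doctor tagging results based on mixed texts
Test doctor | Tags
13 | Hypertension; coronary heart disease; heart disease; atrial fibrillation
21 | Null
28 | Pneumonia; cough; asthma; bronchitis; bronchiectasis
35 | Diabetes; hyperthyroidism; hypothyroidism; thyroid disease
38 | Null
63 | Asthma; allergy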
Table 10  Evaluation of tagging performance
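The abstract reports accuracy and recall for the tagging results, but this page does not give the formulas. As a hedged illustration, the sketch below computes set-based precision and recall for one doctor, assuming the reported accuracy corresponds to precision measured against patient votes. The function name `precision_recall` and both tag sets are hypothetical and are not taken from the paper's data.

```python
def precision_recall(predicted, relevant):
    """Set-based precision and recall of predicted tags against reference tags."""
    predicted, relevant = set(predicted), set(relevant)
    hits = len(predicted & relevant)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: tags produced for one doctor versus that doctor's patient votes.
predicted_tags = ["hypertension", "coronary heart disease", "heart disease"]
patient_votes = ["hypertension", "coronary heart disease", "arrhythmia", "heart failure"]

precision, recall = precision_recall(predicted_tags, patient_votes)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

How doctors with no tags (the "Null" rows) enter the evaluation is not stated here, so the sketch simply returns 0.0 for an empty set.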
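Copyright © 2015 Editorial Office of Data Analysis and Knowledge Discovery
Address: No. 33 Beisihuan Xilu, Zhongguancun, Haidian District, Beijing     Postal code: 100190
Tel/Fax: (010)82626611-6626, 82624938
E-mail: jishu@mail.las.ac.cn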