Data Analysis and Knowledge Discovery, 2016, Vol. 32, Issue 3: 50-57     https://doi.org/10.11925/infotech.1003-3513.2016.03.07
Research Paper
Latent Semantic Analysis of Electronic Medical Record Text for Clinical Decision Making*
Li Guolei (李国垒)1, Chen Xianlai (陈先来)1,2,3, Xia Dong (夏冬)4, Yang Rong (杨荣)5
1 Information Security and Big Data Research Institute, Central South University, Changsha 410013, China
2 Key Laboratory of Medical Information Research (Central South University), College of Hunan Province, Changsha 410013, China
3 Hunan Province Cooperative Innovation Center of Medical Big Data, Changsha 410013, China
4 Chengdu Documentation and Information Center, Chinese Academy of Sciences, Chengdu 610041, China
5 Xiangya Hospital, Central South University, Changsha 410078, China
Abstract

[Objective] This study applies semantic analysis to key texts in electronic medical records to extract decision knowledge that supports the choice of clinical treatment plans, thereby giving electronic medical records a clinical decision support function. [Methods] We segmented the discharge record texts in the training samples with a word segmentation algorithm that combines a dictionary with statistical methods, and extracted clinical terms and treatment plans from the segmented text. We then applied latent semantic analysis to uncover latent semantic associations between clinical terms and treatment plans, and built a latent semantic model to assist the selection of gastric cancer treatment plans. [Results] We evaluated the model on test samples. In a three-dimensional semantic space, the treatment plan was correctly inferred from the description of clinical symptoms for 605 of 1,000 test samples, an accuracy of 60.5%. [Limitations] Only discharge record texts were studied; other types of medical record text were not segmented. [Conclusions] Latent semantic analysis can effectively process clinical text and assist doctors' clinical decision making, which is significant for the development and application of electronic medical records.

Keywords: Electronic medical record; Chinese text segmentation; Latent semantic analysis; Gastric cancer; Clinical decision support; Selection of treatment plans
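
The pipeline described in the abstract (a term-by-record matrix, a low-rank latent space, and similarity between observed clinical terms and treatment plans) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the term and plan names, the toy matrix values, the use of power iteration in place of a full SVD, the choice of two latent dimensions (the study reports a three-dimensional space), and the rule of summing term vectors to form a query.

```python
import math

# Toy term-by-record incidence matrix; every value is made up for illustration.
# Rows: clinical terms followed by treatment plans. Columns: training records.
TERMS = ["early_stage", "local_lesion", "metastasis", "advanced_stage"]
PLANS = ["PLAN:surgery", "PLAN:chemotherapy"]
ROWS = TERMS + PLANS
A = [
    [1, 1, 0, 0, 0, 0, 0],  # early_stage
    [1, 0, 1, 0, 0, 0, 0],  # local_lesion
    [0, 0, 0, 1, 1, 0, 1],  # metastasis
    [0, 0, 0, 1, 0, 1, 1],  # advanced_stage
    [1, 1, 1, 0, 0, 0, 0],  # PLAN:surgery
    [0, 0, 0, 1, 1, 1, 1],  # PLAN:chemotherapy
]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def matvec(M, v):
    return [dot(row, v) for row in M]


def right_singular_vectors(M, k, iters=200):
    """Approximate the top-k right singular vectors of M by power iteration
    on M^T M with deflation (a simple stand-in for a full SVD).
    Assumes M has rank >= k and well-separated leading singular values."""
    Mt = [list(col) for col in zip(*M)]
    found = []
    for j in range(k):
        v = [1.0 / (i + 1 + j) for i in range(len(Mt))]  # deterministic start
        for _ in range(iters):
            for p in found:  # keep v orthogonal to vectors already found
                c = dot(v, p)
                v = [a - c * b for a, b in zip(v, p)]
            w = matvec(Mt, matvec(M, v))
            norm = math.sqrt(dot(w, w))
            if norm == 0:
                raise ValueError("matrix rank is lower than k")
            v = [x / norm for x in w]
        found.append(v)
    return found


def latent_vectors(M, k):
    """Give each row k latent coordinates: row . v_j equals sigma_j * u_j[row]."""
    V = right_singular_vectors(M, k)
    return [[dot(row, v) for v in V] for row in M]


def cosine(x, y):
    nx, ny = math.sqrt(dot(x, x)), math.sqrt(dot(y, y))
    return dot(x, y) / (nx * ny) if nx and ny else 0.0


def rank_plans(observed_terms, k=2):
    """Rank treatment plans by cosine similarity to the summed term vectors."""
    vecs = dict(zip(ROWS, latent_vectors(A, k)))
    query = [sum(vecs[t][j] for t in observed_terms) for j in range(k)]
    scores = [(plan, cosine(query, vecs[plan])) for plan in PLANS]
    return sorted(scores, key=lambda s: s[1], reverse=True)


ranking = rank_plans(["early_stage", "local_lesion"])
print(ranking[0][0])
```

In this toy setup the two groups of records share no terms, so each latent dimension aligns with one treatment plan; real discharge records overlap far more, which is one reason accuracy on real data would be well below what a clean example suggests.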
Received: 2015-09-28      Published: 2016-04-12
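Funding: *This work is one of the research outcomes of the National Social Science Fund of China project "Latent Semantic Analysis of Electronic Medical Records for Clinical Decision Making and Its Application" (Grant No. 13BTQ052).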
Cite this article:
Li Guolei, Chen Xianlai, Xia Dong, Yang Rong. Latent Semantic Analysis of Electronic Medical Record Text for Clinical Decision Making. Data Analysis and Knowledge Discovery, 2016, 32(3): 50-57.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2016.03.07      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2016/V32/I3/50