Data Analysis and Knowledge Discovery  2016, Vol. 32 Issue (3): 50-57    DOI: 10.11925/infotech.1003-3513.2016.03.07
Original Article
Latent Semantic Analysis of Electronic Medical Record Text for Clinical Decision Making
Li Guolei1, Chen Xianlai1,2,3, Xia Dong4, Yang Rong5
1Information Security and Big Data Research Institute, Central South University, Changsha 410013, China
2Key Laboratory of Medical Information Research (Central South University), College of Hunan Province, Changsha 410013, China
3Hunan Province Cooperative Innovation Center of Medical Big Data, Changsha 410013, China
4Chengdu Documentation and Information Center, Chinese Academy of Sciences, Chengdu 610041, China
5Xiangya Hospital, Central South University, Changsha 410078, China

[Objective] This study extracts knowledge that supports clinical decision-making from electronic medical records by means of semantic analysis. [Methods] We first extracted clinical terms from the training samples using a word segmentation algorithm supported by a custom dictionary and statistical methods. We then applied latent semantic analysis to uncover latent correlations between clinical terms and treatment plans, and built a latent semantic model to support the selection of gastric cancer treatments. [Results] Using discharge summary texts, the model successfully extracted 605 treatment plans from 1000 test samples. [Limitations] Only discharge summary texts were examined in this study. [Conclusions] Latent semantic analysis can effectively process electronic medical records to assist doctors' clinical decision-making, which benefits the development of electronic medical record applications.
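The latent semantic analysis step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy records, the term names, the rank k = 2, and the helper names (build_term_doc_matrix, fit_lsa, fold_in, suggest_plan) are all assumptions made for illustration, and the terms are assumed to be already segmented. The sketch builds a term-by-record count matrix, truncates its singular value decomposition, folds a new record into the latent space, and returns the treatment plan of the most similar training record by cosine similarity.

```python
import numpy as np

# Hypothetical, already-segmented training records with treatment-plan labels.
# These toy terms and labels are illustrative assumptions, not data from the study.
DOCS = [
    (["gastric_cancer", "early_stage", "antrum", "resectable"], "surgery"),
    (["gastric_cancer", "antrum", "resectable"], "surgery"),
    (["gastric_cancer", "liver_metastasis", "unresectable"], "chemotherapy"),
    (["gastric_cancer", "liver_metastasis", "ascites", "unresectable"], "chemotherapy"),
]


def build_term_doc_matrix(docs):
    """Return a (terms x records) count matrix and a term -> row index map."""
    vocab = sorted({t for terms, _ in docs for t in terms})
    index = {t: i for i, t in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, (terms, _) in enumerate(docs):
        for t in terms:
            A[index[t], j] += 1.0
    return A, index


def fit_lsa(A, k):
    """Truncated SVD A ~ U_k diag(s_k) V_k^T; rows of V_k are record coordinates."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :].T


def fold_in(terms, index, U_k, s_k):
    """Project a new record into the latent space: q^T U_k diag(s_k)^-1."""
    q = np.zeros(U_k.shape[0])
    for t in terms:
        if t in index:  # terms unseen in training are ignored
            q[index[t]] += 1.0
    return (q @ U_k) / s_k


def suggest_plan(terms, docs, index, U_k, s_k, V_k):
    """Return the plan of the most similar training record and all similarities."""
    q_hat = fold_in(terms, index, U_k, s_k)
    norms = np.linalg.norm(V_k, axis=1) * np.linalg.norm(q_hat) + 1e-12
    sims = (V_k @ q_hat) / norms  # cosine similarity to each training record
    return docs[int(np.argmax(sims))][1], sims


A, index = build_term_doc_matrix(DOCS)
U_k, s_k, V_k = fit_lsa(A, k=2)

plan, sims = suggest_plan(["antrum", "early_stage"], DOCS, index, U_k, s_k, V_k)
print(plan)
```

In a real setting the count matrix would normally be weighted (for example with TF-IDF) and k chosen empirically; both choices are left out here to keep the sketch short.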

Key words: Electronic medical record; Chinese text segmentation; Latent semantic analysis; Gastric cancer; Clinical decision support; Selection of treatment plans
Received: 28 September 2015      Published: 12 April 2016

Cite this article:

Li Guolei, Chen Xianlai, Xia Dong, Yang Rong. Latent Semantic Analysis of Electronic Medical Record Text for Clinical Decision Making. Data Analysis and Knowledge Discovery, 2016, 32(3): 50-57.

