[Objective] This paper explores strategies for reducing the dimensionality of electronic medical record data, aiming to improve knowledge discovery. [Methods] First, we performed a preliminary dimension reduction based on a literature review. We then applied three methods for a second round of reduction: retaining factors with eigenvalues greater than 1, retaining factors up to a cumulative contribution rate above 85%, and retaining factors that showed significant differences. Finally, we compared the results of the three methods in an empirical study. [Results] The three methods extracted 8, 17, and 14 attributes, respectively. After qualitative and quantitative evaluation, principal component analysis with the eigenvalue-greater-than-1 criterion yielded the best result. [Limitations] The sample size needs to be expanded for more in-depth analysis. [Conclusions] The proposed method can effectively reduce the dimensionality of electronic medical record data.
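The two eigenvalue-based retention rules named in the abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' implementation, so everything here is an assumption made for illustration: the function name `pca_component_counts`, the use of the correlation matrix, and the toy data matrix (which is not electronic medical record data). The third rule, based on significant differences, is not sketched because the abstract does not name the test used.

```python
import numpy as np


def pca_component_counts(X, eigen_threshold=1.0, cumulative_threshold=0.85):
    """Count components kept by two common PCA retention rules.

    Hypothetical helper, not the authors' code.
    Returns (n_eigen, n_cumulative, eigenvalues).
    """
    X = np.asarray(X, dtype=float)
    # Assumption: PCA is run on the correlation matrix, so each
    # standardized variable contributes variance 1 (eigenvalues sum to p).
    corr = np.corrcoef(X, rowvar=False)
    # eigvalsh returns ascending eigenvalues of a symmetric matrix; reverse them.
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]

    # Rule 1: keep components whose eigenvalue exceeds the threshold (here 1).
    n_eigen = int(np.sum(eigenvalues > eigen_threshold))

    # Rule 2: keep the smallest number of leading components whose
    # cumulative contribution rate exceeds the threshold (here 85%).
    ratios = np.cumsum(eigenvalues) / eigenvalues.sum()
    n_cumulative = int(np.argmax(ratios > cumulative_threshold) + 1)

    return n_eigen, n_cumulative, eigenvalues


# Toy data: column 2 is an exact linear function of column 1,
# column 3 is only partly correlated with column 1.
toy = np.array([
    [1.0, 3.0, 3.0],
    [2.0, 5.0, 0.0],
    [3.0, 7.0, 1.0],
    [4.0, 9.0, 6.0],
])
n_eigen, n_cumulative, eigenvalues = pca_component_counts(toy)
print(n_eigen, n_cumulative, np.round(eigenvalues, 3))
```

The two rules can retain different numbers of components on the same data, which is consistent with the abstract reporting different attribute counts per method. Which rule suits a given data set is a judgment call that the sketch does not settle.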
Mu Dongmei, Wang Ping, Zhao Danning. Reducing Data Dimension of Electronic Medical Records: An Empirical Study[J]. Data Analysis and Knowledge Discovery, 2018, 2(1): 88-98.