Data Analysis and Knowledge Discovery  2017, Vol. 1 Issue (10): 94-104    DOI: 10.11925/infotech.2096-3467.2017.0641
Original Article
Extracting Disease-Gene-Drug Correlations Based on Data Cube
Wei Xing1,2, Hu Dehua1, Yi Minhan1, Zhu Qizhen1, Zhu Wenjie2
1Institute of Information Security and Big Data, Central South University, Changsha 410083, China
2School of Basic Courses, Bengbu Medical College, Bengbu 233003, China
Abstract  

[Objective] This study aims to construct a disease-gene-drug correlation network for diabetes mellitus (DM). [Methods] We proposed a data cube-based approach for constructing the disease-gene-drug correlation network for DM, and then measured the associations among the biological entities. [Results] Using data retrieved from the PubMed database, we constructed three 1-D vertex cubes, three 2-D square cubes, and one 3-D disease-gene-drug network, which revealed 411 associations among 14 subclasses of DM, 23 genes, and 24 drugs. We also constructed 8 optimal disease-gene-drug subnetworks of DM. [Limitations] The data analysis involved some subjective judgments, and changes in user behavior may also influence the results. [Conclusions] The proposed algorithm is better than the existing ones and provides new directions for research on customized medical treatments.
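
The cube structure named in the abstract can be illustrated with a minimal sketch. It assumes each input record is one (disease, gene, drug) co-occurrence, and it uses support and confidence as the association measures because "Association Rules" is among the keywords; the abstract does not state the measure the authors used. The records, the names `build_cube`, `support`, and `confidence`, and all entity labels are hypothetical placeholders, not data from the study.

```python
from collections import Counter
from itertools import combinations

# Hypothetical co-occurrence records (placeholder labels, not study data):
# one (disease, gene, drug) tuple per document mentioning all three.
records = [
    ("disease_A", "gene_X", "drug_P"),
    ("disease_A", "gene_X", "drug_P"),
    ("disease_A", "gene_Y", "drug_P"),
    ("disease_B", "gene_X", "drug_Q"),
]

DIMS = ("disease", "gene", "drug")


def build_cube(rows):
    """Count every group-by over the three dimensions (a full data cube).

    The result holds one apex cuboid (), three 1-D cuboids, three 2-D
    cuboids and one 3-D cuboid, keyed by the dimension names they keep.
    """
    cube = {}
    for k in range(len(DIMS) + 1):
        for axes in combinations(range(len(DIMS)), k):
            name = tuple(DIMS[i] for i in axes)
            cube[name] = Counter(tuple(row[i] for i in axes) for row in rows)
    return cube


def support(cube, dims, key):
    """Fraction of all records that contain the entity combination `key`."""
    total = cube[()][()]
    return cube[dims][key] / total


def confidence(cube, lhs_dims, lhs_key, all_dims, all_key):
    """Assumed rule measure: count(lhs and rhs) / count(lhs)."""
    return cube[all_dims][all_key] / cube[lhs_dims][lhs_key]


cube = build_cube(records)
pair_support = support(cube, ("disease", "gene"), ("disease_A", "gene_X"))
rule_conf = confidence(
    cube,
    ("disease",), ("disease_A",),
    ("disease", "gene"), ("disease_A", "gene_X"),
)
```

With three dimensions the full cube has eight cuboids; dropping the apex leaves the three 1-D, three 2-D, and one 3-D structures that the abstract counts.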

Key words: Disease; Gene; Drug; Data Cube; Association Rules; Correlations Network
Received: 03 July 2017      Published: 08 November 2017
ZTFLH:  TP391 G202  

Cite this article:

Wei Xing, Hu Dehua, Yi Minhan, Zhu Qizhen, Zhu Wenjie. Extracting Disease-Gene-Drug Correlations Based on Data Cube. Data Analysis and Knowledge Discovery, 2017, 1(10): 94-104.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2017.0641     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2017/V1/I10/94

| Under nutritional diseases: English name | Chinese name | Under endocrine system diseases: English name | Chinese name | Diabetes complications: English name | Chinese name |
| --- | --- | --- | --- | --- | --- |
| Diabetes Mellitus, Experimental | 实验性糖尿病 | Diabetes Complications | 糖尿病并发症 | Diabetic Angiopathies | 糖尿病性血管病 |
| Diabetes Mellitus, Type 1 | 1型糖尿病 | Diabetes, Gestational | 妊娠糖尿病 | Diabetic Cardiomyopathies | 糖尿病性心肌病 |
| Diabetes Mellitus, Type 2 | 2型糖尿病 | Diabetes Mellitus, Experimental | 实验性糖尿病 | Diabetic Coma | 糖尿病性昏迷 |
| Diabetes, Gestational | 妊娠糖尿病 | Diabetes Mellitus, Type 1 | 1型糖尿病 | Diabetic Ketoacidosis | 糖尿病性酮症酸中毒 |
| Diabetic Ketoacidosis | 糖尿病酮症酸中毒 | Diabetes Mellitus, Type 2 | 2型糖尿病 | Diabetic Nephropathies | 糖尿病性肾病 |
| Donohue Syndrome | 多诺霍综合症 | Donohue Syndrome | 多诺霍综合症 | Diabetic Neuropathies | 糖尿病性神经病 |
| Prediabetic State | 糖尿病前期 | Prediabetic State | 糖尿病前期 | Fetal Macrosomia | 巨大胎儿(症) |
| Relation | EN 1 | Description 1 | EN 2 | Description 2 |
| --- | --- | --- | --- | --- |
| Disease-Gene | Diabetic Neuropathies | 糖尿病性神经病 | IPF1 | transcription factor 1 |
| Disease-Gene | Diabetic Neuropathies | 糖尿病性神经病 | SUMO4 | small ubiquitin-like modifier 4 |
| Disease-Gene | Diabetic Nephropathies | 糖尿病性肾病 | IPF1 | transcription factor 1 |
| Disease-Gene | Diabetic Nephropathies | 糖尿病性肾病 | SUMO4 | small ubiquitin-like modifier 4 |
| Disease-Drug | Iron Dextran | 右旋糖酐铁 | Diabetic Angiopathies | 糖尿病性血管病 |
| Disease-Drug | GFT505 | Potential novel candidate drug for metabolic syndrome (MS)-related dyslipidemia and dysglycemia | T2DM | 2型糖尿病 |
| Disease-Drug | Telmisartan | 替米沙坦 | Diabetic Neuropathies | 糖尿病性神经病 |
| Disease-Drug | Aleglitazar | 阿格列扎 | Diabetic Nephropathies | 糖尿病性肾病 |
| Gene-Drug | IRS2 | insulin receptor substrate 2 | Icosapent | 二十碳五烯酸 |
| Gene-Drug | PPARG | peroxisome proliferator-activated receptor gamma | Icosapent | 二十碳五烯酸 |
| Gene-Drug | IRS2 | insulin receptor substrate 2 | Levosimendan | 左西孟旦 |
| Gene-Drug | GCK | glucokinase (hexokinase 4) | Levosimendan | 左西孟旦 |
| Gene-Drug | ENPP1 | ectonucleotide pyrophosphatase/phosphodiesterase 1 | Myristic Acid | 肉豆蔻酸 |
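
As a small illustration of how the pairwise relations in the table above could be combined into one network, the sketch below loads the thirteen listed pairs into an undirected adjacency structure and counts each node's neighbours. The pairs are transcribed from the table; treating them as unweighted, undirected edges and the variable names `pairs`, `graph`, and `degree` are assumptions made for illustration, not the authors' construction.

```python
from collections import defaultdict

# (relation type, entity 1, entity 2), transcribed from the table above.
pairs = [
    ("Disease-Gene", "Diabetic Neuropathies", "IPF1"),
    ("Disease-Gene", "Diabetic Neuropathies", "SUMO4"),
    ("Disease-Gene", "Diabetic Nephropathies", "IPF1"),
    ("Disease-Gene", "Diabetic Nephropathies", "SUMO4"),
    ("Disease-Drug", "Iron Dextran", "Diabetic Angiopathies"),
    ("Disease-Drug", "GFT505", "T2DM"),
    ("Disease-Drug", "Telmisartan", "Diabetic Neuropathies"),
    ("Disease-Drug", "Aleglitazar", "Diabetic Nephropathies"),
    ("Gene-Drug", "IRS2", "Icosapent"),
    ("Gene-Drug", "PPARG", "Icosapent"),
    ("Gene-Drug", "IRS2", "Levosimendan"),
    ("Gene-Drug", "GCK", "Levosimendan"),
    ("Gene-Drug", "ENPP1", "Myristic Acid"),
]

# Assumption: every listed pair becomes one undirected, unweighted edge.
graph = defaultdict(set)
for _relation, first, second in pairs:
    graph[first].add(second)
    graph[second].add(first)

# Number of distinct neighbours per entity.
degree = {node: len(neighbours) for node, neighbours in graph.items()}
```

In this listing the disease-gene and gene-drug rows share no gene, so the diseases and drugs shown here connect only through the direct disease-drug rows.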
  Copyright © 2016 Data Analysis and Knowledge Discovery   Tel/Fax:(010)82626611-6626,82624938   E-mail:jishu@mail.las.ac.cn