Data Analysis and Knowledge Discovery  2017, Vol. 1 Issue (10): 94-104    DOI: 10.11925/infotech.2096-3467.2017.0641
Original Article
Extracting Disease-Gene-Drug Correlations Based on Data Cube
Wei Xing1,2, Hu Dehua1, Yi Minhan1, Zhu Qizhen1, Zhu Wenjie2
1Institute of Information Security and Big Data, Central South University, Changsha 410083, China
2School of Basic Courses, Bengbu Medical College, Bengbu 233003, China

[Objective] This study aims to construct a disease-gene-drug correlation network for diabetes mellitus (DM). [Methods] First, we proposed a new data cube-based approach for constructing the disease-gene-drug correlation network for DM. Then, we measured the associations among the biological entities. [Results] We retrieved the required data from the PubMed database and constructed three 1-D vertex cubes, three 2-D square cubes, and one 3-D disease-gene-drug network, which revealed 411 associations among 14 subclasses of DM, 23 genes, and 24 drugs. We also constructed 8 optimal disease-gene-drug subnetworks of DM. [Limitations] Parts of the data analysis involved subjective judgment. Changes in user behavior may also influence the results. [Conclusions] The proposed algorithm performs better than existing ones and provides new directions for research on customized medical treatments.
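
The abstract does not spell out the algorithm, so the following is only a minimal sketch of how co-occurrence counts could be aggregated into the three 1-D cuboids, three 2-D cuboids, and one 3-D cuboid it describes. The toy records, the function names (build_cuboid, build_cube, pair_measures), and the choice of support and confidence as association measures are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter
from itertools import combinations, product

# Hypothetical toy records (invented for illustration): each one stands for
# a PubMed article annotated with the diseases, genes, and drugs it mentions.
records = [
    {"disease": {"Diabetic Nephropathies"}, "gene": {"SUMO4"}, "drug": {"Aleglitazar"}},
    {"disease": {"Diabetic Nephropathies"}, "gene": {"SUMO4", "IPF1"}, "drug": set()},
    {"disease": {"Diabetic Neuropathies"}, "gene": {"IPF1"}, "drug": {"Telmisartan"}},
    {"disease": {"Diabetic Nephropathies"}, "gene": set(), "drug": {"Aleglitazar"}},
]

DIMENSIONS = ("disease", "gene", "drug")


def build_cuboid(records, dims):
    """Count the records that mention each combination of entities over dims."""
    counts = Counter()
    for rec in records:
        # A record contributes nothing if any requested dimension is empty.
        for combo in product(*(sorted(rec[d]) for d in dims)):
            counts[combo] += 1
    return counts


def build_cube(records):
    """Return all seven non-empty cuboids, keyed by their dimension tuple."""
    cube = {}
    for k in range(1, len(DIMENSIONS) + 1):
        for dims in combinations(DIMENSIONS, k):
            cube[dims] = build_cuboid(records, dims)
    return cube


def pair_measures(cube, dim_a, a, dim_b, b, n_records):
    """Support and confidence of a -> b (assumed measures, not the paper's).

    dim_a must precede dim_b in DIMENSIONS so the 2-D cuboid key exists.
    """
    joint = cube[(dim_a, dim_b)][(a, b)]
    antecedent = cube[(dim_a,)][(a,)]
    support = joint / n_records
    confidence = joint / antecedent if antecedent else 0.0
    return support, confidence


cube = build_cube(records)
support, confidence = pair_measures(
    cube, "disease", "Diabetic Nephropathies", "gene", "SUMO4", len(records)
)
print(sorted(cube))          # the seven cuboid keys
print(support, confidence)   # association measures for one disease-gene pair
```

Keeping every cuboid in one dictionary means a pairwise measure can be read from the 1-D and 2-D counts without rescanning the records, which is the main convenience of a cube-style aggregation.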

Key words: Disease; Gene; Drug; Data Cube; Association Rules; Correlations Network
Received: 03 July 2017      Published: 08 November 2017
ZTFLH:  TP391 G202  

Cite this article:

Wei Xing, Hu Dehua, Yi Minhan, Zhu Qizhen, Zhu Wenjie. Extracting Disease-Gene-Drug Correlations Based on Data Cube. Data Analysis and Knowledge Discovery, 2017, 1(10): 94-104.

| Classification under nutritional system diseases | | Classification under endocrine system diseases | | Classification of diabetes complications | |
| English name | Chinese name | English name | Chinese name | English name | Chinese name |
| Diabetes Mellitus, Experimental | 实验性糖尿病 | Diabetes Complications | 糖尿病并发症 | Diabetic Angiopathies | 糖尿病性血管病 |
| Diabetes Mellitus, Type 1 | 1型糖尿病 | Diabetes, Gestational | 妊娠糖尿病 | Diabetic Cardiomyopathies | 糖尿病性心肌病 |
| Diabetes Mellitus, Type 2 | 2型糖尿病 | Diabetes Mellitus, Experimental | 实验性糖尿病 | Diabetic Coma | 糖尿病性昏迷 |
| Diabetes, Gestational | 妊娠糖尿病 | Diabetes Mellitus, Type 1 | 1型糖尿病 | Diabetic Ketoacidosis | 糖尿病性酮症酸中毒 |
| Diabetic Ketoacidosis | 糖尿病酮症酸中毒 | Diabetes Mellitus, Type 2 | 2型糖尿病 | Diabetic Nephropathies | 糖尿病性肾病 |
| Donohue Syndrome | 多诺霍综合症 | Donohue Syndrome | 多诺霍综合症 | Diabetic Neuropathies | 糖尿病性神经病 |
| Prediabetic State | 糖尿病前期 | Prediabetic State | 糖尿病前期 | Fetal Macrosomia | 巨大胎儿(症) |

| Relation | Entity 1 | Description 1 | Entity 2 | Description 2 |
| Disease-Gene | Diabetic Neuropathies | 糖尿病性神经病 | IPF1 | transcription factor 1 |
| Disease-Gene | Diabetic Neuropathies | 糖尿病性神经病 | SUMO4 | small ubiquitin-like modifier 4 |
| Disease-Gene | Diabetic Nephropathies | 糖尿病性肾病 | IPF1 | transcription factor 1 |
| Disease-Gene | Diabetic Nephropathies | 糖尿病性肾病 | SUMO4 | small ubiquitin-like modifier 4 |
| Disease-Drug | Iron Dextran | 右旋糖酐铁 | Diabetic Angiopathies | 糖尿病性血管病 |
| Disease-Drug | GFT505 | 治疗代谢综合征(MS)相关性血脂和血糖 (treats blood lipids and blood glucose associated with metabolic syndrome (MS)) | T2DM | 2型糖尿病 |
| Disease-Drug | Telmisartan | 替米沙坦 | Diabetic Neuropathies | 糖尿病性神经病 |
| Disease-Drug | Aleglitazar | 阿格列扎 | Diabetic Nephropathies | 糖尿病性肾病 |
| Gene-Drug | IRS2 | insulin receptor substrate 2 | Icosapent | 二十碳五烯酸 |
| Gene-Drug | PPARG | peroxisome proliferator-activated receptor gamma | Icosapent | 二十碳五烯酸 |
| Gene-Drug | IRS2 | insulin receptor substrate 2 | Levosimendan | 左西孟旦 |
| Gene-Drug | GCK | glucokinase (hexokinase 4) | Levosimendan | 左西孟旦 |
| Gene-Drug | ENPP1 | ectonucleotide pyrophosphatase/phosphodiesterase 1 | Myristic Acid | 肉豆蔻酸 |