Extracting Disease-Gene-Drug Correlations Based on Data Cube
Wei Xing1,2, Hu Dehua1, Yi Minhan1, Zhu Qizhen1, Zhu Wenjie2
1 Institute of Information Security and Big Data, Central South University, Changsha 410083, China
2 School of Basic Courses, Bengbu Medical College, Bengbu 233003, China
[Objective] This study aims to construct a disease-gene-drug correlation network for diabetes mellitus (DM). [Methods] We proposed a data cube-based approach to constructing the disease-gene-drug correlation network for DM, and then measured the associations among the biological entities. [Results] Using data retrieved from the PubMed database, we constructed three 1-D vertex cubes, three 2-D square cubes, and one 3-D disease-gene-drug network, which revealed 411 associations among 14 subclasses of DM, 23 genes, and 24 drugs. We also constructed 8 optimal disease-gene-drug subnetworks of DM. [Limitations] The data analysis involved some subjective judgments. Changes in user behavior may also influence the results. [Conclusions] The proposed algorithm outperforms existing ones and provides new directions for research on personalized medical treatment.
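The cube structure named in the abstract (three 1-D cubes, three 2-D cubes, and one 3-D cube over the disease, gene, and drug dimensions) can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes that an association is simply a count of documents in which entities co-occur, and the records, entity labels, and function names (`build_cuboid`, `build_cube`) are hypothetical placeholders.

```python
from collections import Counter
from itertools import combinations, product

# Hypothetical toy input: the entities tagged in each of three abstracts.
# The labels are placeholders, not real disease, gene, or drug names.
records = [
    {"disease": {"DM_A"}, "gene": {"GENE_1", "GENE_2"}, "drug": {"DRUG_X"}},
    {"disease": {"DM_A"}, "gene": {"GENE_1"}, "drug": {"DRUG_X", "DRUG_Y"}},
    {"disease": {"DM_B"}, "gene": {"GENE_1"}, "drug": set()},
]

DIMENSIONS = ("disease", "gene", "drug")


def build_cuboid(records, dims):
    """Count the documents in which each entity combination along `dims` co-occurs."""
    counts = Counter()
    for rec in records:
        # Cartesian product of the entity sets on the selected dimensions;
        # an empty set on any dimension contributes no combinations.
        for combo in product(*(sorted(rec[d]) for d in dims)):
            counts[combo] += 1
    return counts


def build_cube(records):
    """Build every non-empty cuboid: three 1-D, three 2-D, and one 3-D."""
    cube = {}
    for k in range(1, len(DIMENSIONS) + 1):
        for dims in combinations(DIMENSIONS, k):
            cube[dims] = build_cuboid(records, dims)
    return cube


cube = build_cube(records)
for dims, counts in cube.items():
    print(dims, dict(counts))
```

In this sketch each 1-D cuboid holds per-entity document counts, each 2-D cuboid holds pairwise co-occurrence counts, and the 3-D cuboid holds disease-gene-drug triples. A real pipeline would replace the raw counts with whatever association measure the study actually uses.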