New Technology of Library and Information Service  2015, Vol. 31 Issue (5): 73-79    DOI: 10.11925/infotech.1003-3513.2015.05.10
The Research Practices of DataBase Cloud Storage Using Hadoop/HBase for the Pharmacogenomics Data
Fan Yunman, Hong Na, Qian Qing, Fang An
Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100020, China
Abstract  

[Objective] To explore new ideas and methods, and to accumulate first-hand experience in importing, storing, retrieving, and bulk-exporting large-scale biomedical data. [Methods] We analyze the characteristics of large-scale biomedical data and compare, both theoretically and through tests, the technologies, advantages, and disadvantages of a traditional relational database (represented by Oracle) and a NoSQL database (represented by HBase) in solving big data problems. Taking a storage system for a pharmacogenomics database as an example, we test the performance of Oracle and HBase. [Results] In practical application, HBase has a large advantage over Oracle when processing large data. [Limitations] The study lacks deep mining and analysis of the pharmacogenomics data, and future research needs in-depth technical optimization of Hadoop/HBase. [Conclusions] In this experiment, HBase can meet the storage requirements for large-scale biomedical data.

Key words: Biomedicine; Big data; RDBMS; NoSQL; Hadoop; HBase
Received: 05 November 2014      Published: 11 June 2015
CLC number: G352

Cite this article:

Fan Yunman, Hong Na, Qian Qing, Fang An. The Research Practices of DataBase Cloud Storage Using Hadoop/HBase for the Pharmacogenomics Data. New Technology of Library and Information Service, 2015, 31(5): 73-79.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2015.05.10     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2015/V31/I5/73

