New Technology of Library and Information Service  2015, Vol. 31 Issue (5): 73-79    DOI: 10.11925/infotech.1003-3513.2015.05.10
The Research Practices of DataBase Cloud Storage Using Hadoop/HBase for the Pharmacogenomics Data
Fan Yunman, Hong Na, Qian Qing, Fang An
Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100020, China
Abstract  

[Objective] To explore new ideas and methods for importing, storing, retrieving, and bulk-exporting large-scale biomedical data, and to accumulate first-hand experience in doing so. [Methods] We analyze the characteristics of large-scale biomedical data and compare how a traditional relational database (represented by Oracle) and a NoSQL database (represented by HBase) address big data problems, covering their technologies, advantages, and disadvantages from both theoretical and experimental perspectives. Taking a pharmacogenomics data storage system as an example, we test the performance of Oracle and HBase. [Results] In practical application, HBase has a large advantage over Oracle when processing large data sets. [Limitations] The study lacks deep mining and analysis of the pharmacogenomics data, and future research needs in-depth technical optimization of Hadoop/HBase. [Conclusions] In this experiment, HBase can meet the storage requirements for large-scale biomedical data.

Key words: Biomedicine, Big data, RDBMS, NoSQL, Hadoop, HBase
Received: 05 November 2014      Published: 11 June 2015
CLC number: G352

Cite this article:

Fan Yunman, Hong Na, Qian Qing, Fang An. The Research Practices of DataBase Cloud Storage Using Hadoop/HBase for the Pharmacogenomics Data. New Technology of Library and Information Service, 2015, 31(5): 73-79.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2015.05.10     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2015/V31/I5/73

