New Technology of Library and Information Service  2015, Vol. 31 Issue (12): 80-88    DOI: 10.11925/infotech.1003-3513.2015.12.12
Cross-disciplinary Data Curation Workflow: A Case Study of Environmental Health Data
Yang Lin, Li Jiao, Hou Li, Qian Qing
Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100020, China
Abstract  

[Objective] This study designs a curation process for managing cross-disciplinary data in the environmental health field in a stable and sustainable manner. [Methods] Drawing on the Digital Curation Centre (DCC) Curation Lifecycle Model, the authors formalize the environmental health data processing procedure as a standardized workflow and rigorously define the contents of each module. [Results] The workflow is applied to curate climate data and hospital registry data, providing backend support for the environmental health component of the medical knowledge service system. The results show that it is practical for managing cross-disciplinary data. [Limitations] Because demands are diverse, the workflow needs further specification in areas such as the data model and data standardization. [Conclusions] The workflow can effectively bring together curators with different backgrounds, take into account both data quality and data size, and support the curation of cross-disciplinary data.

Received: 06 July 2015      Published: 06 April 2016
CLC number: TP311; X18

Cite this article:

Yang Lin, Li Jiao, Hou Li, Qian Qing. Cross-disciplinary Data Curation Workflow: A Case Study of Environmental Health Data. New Technology of Library and Information Service, 2015, 31(12): 80-88.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2015.12.12     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2015/V31/I12/80

