New Technology of Library and Information Service  2015, Vol. 31 Issue (12): 80-88    DOI: 10.11925/infotech.1003-3513.2015.12.12
Cross-disciplinary Data Curation Workflow: A Case Study of Environmental Health Data
Yang Lin, Li Jiao, Hou Li, Qian Qing
Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100020, China
Abstract  

[Objective] This study designs a curation process for managing cross-disciplinary data in the environmental health field in a stable and sustainable manner. [Methods] Drawing on the Digital Curation Center (DCC) Curation Lifecycle Model, the authors formalize the environmental health data processing procedure as a standardized workflow and rigorously define the contents of each module. [Results] The workflow was applied to curate climate data and hospital registry data, providing backend support for the environmental health part of the medical knowledge service system. The results show that it can help manage cross-disciplinary data in practice. [Limitations] Because demands are diverse, the workflow needs further specification in areas such as the data model and data standardization. [Conclusions] The workflow can effectively bring together curators with different backgrounds, account for both data quality and data size, and help curate cross-disciplinary data.

Received: 06 July 2015      Published: 06 April 2016
:  TP311  
  X18  

Cite this article:

Yang Lin, Li Jiao, Hou Li, Qian Qing. Cross-disciplinary Data Curation Workflow: A Case Study of Environmental Health Data. New Technology of Library and Information Service, 2015, 31(12): 80-88.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2015.12.12     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2015/V31/I12/80
