New Technology of Library and Information Service  2015, Vol. 31 Issue (12): 80-88    DOI: 10.11925/infotech.1003-3513.2015.12.12
Application Paper
Cross-disciplinary Data Curation Workflow: A Case Study of Environmental Health Data
Yang Lin, Li Jiao, Hou Li, Qian Qing
Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100020, China
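Abstract (translated from the Chinese)

[Objective] To meet the need for managing cross-disciplinary scientific data in environmental health, this study explores a curation workflow for cross-disciplinary data and offers a feasible approach for advancing data management in related fields. [Methods] Drawing on research in the environmental health field and guided by the DCC Curation Lifecycle Model, the authors construct a curation workflow for environmental health data and specify the content of each curation module as well as the boundary between manual and automated curation. [Results] The workflow is applied to curate meteorological and environmental data together with hospital visit data, and it can support curation of the environmental health data in the medical and health knowledge service system. The results indicate that the workflow is operable in practice. [Limitations] Because requirements are diverse, the workflow needs further refinement in practice in areas such as the data model and data normalization. [Conclusions] The workflow can effectively organize curators with different professional backgrounds, balance data quality against data scale, and is operable for cross-disciplinary data curation.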

Abstract

[Objective] This study designs a curation process for managing cross-disciplinary data in the environmental health field in a stable and sustainable manner. [Methods] Guided by the Digital Curation Centre (DCC) Curation Lifecycle Model, the authors formulate a standardized workflow for curating environmental health data and rigorously define the content of each module. [Results] The workflow is applied to curate climate data and hospital registry data, providing backend support for the environmental health part of the medical knowledge service system. The results show that the workflow is practical for managing cross-disciplinary data. [Limitations] Because demands are diverse, the workflow needs further specification in areas such as the data model and data standardization. [Conclusions] The workflow can effectively bring together curators with different backgrounds, take both data quality and data size into account, and help curate cross-disciplinary data.

Received: 2015-07-06
CLC number: TP311; X18
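Funding:

This work is one of the research outcomes supported by the Fundamental Research Funds for Central Public-Interest Research Institutes, project "Research on a Scientific Data Management System Framework for Biological Data Curation" (Grant No. 14R0105), and by the National Scientific Data Sharing Platform for Population and Health.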

Corresponding author: Qian Qing, ORCID: 0000-0002-9072-586X, E-mail: qian.qing@imicams.ac.cn
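Author contributions: Yang Lin: research framework design and implementation, manuscript writing; Li Jiao: curation workflow design and results analysis, manuscript revision; Hou Li: collection and organization of environmental health data; Qian Qing: overall study design and discussion of results.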
Cite this article:
Yang Lin, Li Jiao, Hou Li, Qian Qing. Cross-disciplinary Data Curation Workflow: A Case Study of Environmental Health Data [J]. New Technology of Library and Information Service, 2015, 31(12): 80-88. DOI: 10.11925/infotech.1003-3513.2015.12.12.
Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2015.12.12

