New Technology of Library and Information Service  2015, Vol. 31 Issue (12): 80-88     https://doi.org/10.11925/infotech.1003-3513.2015.12.12
Application Paper
Cross-disciplinary Data Curation Workflow: A Case Study of Environmental Health Data
Yang Lin, Li Jiao, Hou Li, Qian Qing
Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100020, China

Abstract

[Objective] This study designs a curation workflow for managing cross-disciplinary scientific data in the environmental health field in a stable and sustainable manner, and offers a feasible approach for advancing data management in related fields. [Methods] Guided by the Digital Curation Centre (DCC) Curation Lifecycle Model, the authors build a standardized curation workflow for environmental health data, define the content of each curation module, and delineate the boundary between manual and automated curation. [Results] The workflow was applied to curate meteorological data and hospital visit data, and it provides backend support for curating the environmental health portion of a medical knowledge service system. The results show that the workflow is practical for managing cross-disciplinary data. [Limitations] Because requirements vary, the workflow needs further refinement in practice, in areas such as the data model and data standardization. [Conclusions] The workflow can effectively organize curators with different professional backgrounds, balance data quality against data scale, and help curate cross-disciplinary data.

Received: 2015-07-06      Published: 2016-04-06
Chinese Library Classification (CLC): TP311; X18
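Funding:

This work is one of the research outcomes supported by the Fundamental Research Funds for Central Public Welfare Research Institutes project "Research on a Scientific Data Management System Framework for Biological Data Curation" (Grant No. 14R0105) and by the National Scientific Data Sharing Platform for Population and Health.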

Corresponding author: Qian Qing, ORCID: 0000-0002-9072-586X, E-mail: qian.qing@imicams.ac.cn
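Author contributions: Yang Lin: designed and implemented the research framework and wrote the paper; Li Jiao: designed the curation workflow, analyzed the results, and revised the paper; Hou Li: collected and organized the environmental health data; Qian Qing: led the overall study design and the discussion of the results.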
Cite this article:
Yang Lin, Li Jiao, Hou Li, Qian Qing. Cross-disciplinary Data Curation Workflow: A Case Study of Environmental Health Data. New Technology of Library and Information Service, 2015, 31(12): 80-88.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2015.12.12      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2015/V31/I12/80

