1 Institute of Medical Information/Medical Library, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100020, China
2 National Population Health Data Center, Beijing 100005, China
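[Objective] To provide essential support for the deposit and management of data from state-funded research projects in population health, this paper describes the functional design and implementation of the Population Health Data Archive (PHDA) of the National Population Health Data Center, with a focus on project data deposit. [Methods] We analyzed the characteristics of the data deposit workflow for research projects in population health, built a data archive that meets the urgent need to deposit and manage data from projects funded by the national budget, and designed a flexible, extensible overall framework with user-friendly functional modules. [Results] The PHDA provides project registration, project data deposit, high-speed transfer of large data, secure preservation, assignment of unique data identifiers, graded and classified storage, access control, and issuance of deposit certificates. It has effectively supported the deposit of 292 datasets from 14 projects of the National Special Program on Basic Works for Science and Technology. [Limitations] Techniques such as data semantics and deep learning are still needed to enhance data management, enable semantic data fusion, and provide intelligent data analysis services, and thereby optimize the archive's functions. [Conclusions] The PHDA enables the deposit, management, sharing, and use of research project data in population health, and is important for the aggregation, accumulation, and security of national population health scientific data.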
[Objective] This study describes the design and implementation of the Population Health Data Archive (PHDA), which supports data curation for government-funded research projects. [Methods] First, we analyzed the data curation characteristics of research projects in population health. Then, we constructed a data archive to meet their urgent needs. The system comprises a flexible and scalable framework and user-friendly functional modules. [Results] The PHDA supports project registration, data collection, high-speed transmission of big data, secure preservation, assignment of unique dataset identifiers, graded and classified storage, access control, and voucher issuance. In 2019, the system managed 292 datasets for 14 projects from the National Special Program on Basic Works for Science and Technology. [Limitations] The PHDA could be further optimized with data semantics and deep learning technologies, for example to provide intelligent data analysis services. [Conclusions] The PHDA can effectively curate and disseminate shared research data in the field of population health at the national level.
Wu Sizhu, Qian Qing, Zhou Wei, Zhong Ming, Wang Anran, Xiu Xiaolei, Gou Huan, Li Zanmei, Li Jiao, Fang An. Data Archive for Research Projects in Population Health[J]. Data Analysis and Knowledge Discovery, 2020, 4(12): 2-13.