现代图书情报技术 (New Technology of Library and Information Service), 2013, Issue (5): 1-20     https://doi.org/10.11925/infotech.1003-3513.2013.05.01
Section: Digital Library
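The Challenge of Sharing Research Data (科研数据共享的挑战)
Christine L. Borgman1 (author), Qing Xiuling (青秀玲)2 (translator)
1. UCLA Department of Information Studies, Los Angeles, CA 90095, USA;
2. National Science Library, Chinese Academy of Sciences, Beijing 100190, China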
The Conundrum of Sharing Research Data
Christine L. Borgman
UCLA Department of Information Studies, Los Angeles, CA 90095, USA
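Abstract (from the Chinese edition)

The emergence of new research instruments and methods has produced an unprecedented deluge of data. The volume of data, together with new methods for distributing and mining it, has raised expectations among funding agencies, policy makers, and the general public for new discoveries and innovation. Many stakeholders expect data to be openly accessible, yet to date data sharing has emerged in only a few fields, such as astronomy and genomics. In other fields, some researchers share data regularly, others never share, and most are willing to share only some data some of the time. Data sharing thus remains a conundrum: an intricate and difficult problem. Research data take many forms, are collected for many purposes by many methods, and are hard to interpret once removed from the context in which they were originally produced. This article draws on examples from the natural sciences, social sciences, and humanities to analyze and illustrate data types and data practices. The author examines four rationales for sharing data: to reproduce or verify research; to make the results of publicly funded research available to the public; to enable others to ask new scientific questions of existing data; and to advance the state of research and innovation. These rationales are understood differently depending on the reasons for sharing, the beneficiaries, and the motivations and incentives of the stakeholders involved. The challenge of data sharing is to understand which data should be shared, by whom, with whom, under what conditions, why, and with what effort. Answers to these questions will inform data policy and data practice throughout.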

Keywords: research data sharing; data reuse; information policy
Abstract

Researchers are producing an unprecedented deluge of data by using new methods and instrumentation. Others may wish to mine these data for new discoveries and innovations. However, research data are not readily available, as sharing is common in only a few fields such as astronomy and genomics. Data sharing practices in other fields vary widely. Moreover, research data take many forms, are handled in many ways, using many approaches, and often are difficult to interpret once removed from their initial context. Data sharing is thus a conundrum. Four rationales for sharing data are examined, drawing examples from the sciences, social sciences, and humanities: (1) to reproduce or to verify research, (2) to make results of publicly funded research available to the public, (3) to enable others to ask new questions of extant data, and (4) to advance the state of research and innovation. These rationales differ by the arguments for sharing, by beneficiaries, and by the motivations and incentives of the many stakeholders involved. The challenges are to understand which data might be shared, by whom, with whom, under what conditions, why, and to what effects. Answers will inform data policy and practice.

Received: 2013-04-24      Published: 2013-07-03
Classification number: G250
Corresponding author: Christine L. Borgman     E-mail: borgman@gseis.ucla.edu
Cite this article:
Christine L. Borgman (author), Qing Xiuling (青秀玲) (translator). The Challenge of Sharing Research Data (科研数据共享的挑战) [J]. 现代图书情报技术 (New Technology of Library and Information Service), 2013, (5): 1-20.
Christine L. Borgman. The Conundrum of Sharing Research Data. New Technology of Library and Information Service, 2013, (5): 1-20.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2013.05.01      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2013/V/I5/1

