Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (1): 43-54     https://doi.org/10.11925/infotech.2096-3467.2022.0017
Special Topic
Research Progress of Data Traceability from the Perspective of Data Element Circulation*
Wang Xiaoqing1,2,3, Sun Zhanwei1, Wu Junhong4, Du Ziran5, Qian Chengjiang6
1School of Public Administration, Nanjing University of Finance & Economics, Nanjing 210023, China
2College of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
3Hongshan College, Nanjing University of Finance & Economics, Nanjing 210003, China
4Business School, Nanjing Normal University, Nanjing 210023, China
5Department of Platform Research and Development, Greater Bay Area Big Data Research Institute, Shenzhen 518048, China
6Nanjing NJtech Safety Co., Ltd, Nanjing 210047, China
Abstract

[Objective] This paper reviews the literature on data traceability, covering research progress and application scenarios, to provide a reference for building data trading platforms, industry data governance, and digital government governance. [Methods] We summarize and analyze existing work along three lines (data traceability models, data traceability methods, and data traceability applications) and, on that basis, discuss the current state of research and its shortcomings. [Results] Data traceability research has produced substantial results in content description, model construction, and scenario application: the quality of data traceability has improved, its security is better safeguarded, and its efficiency has increased. [Limitations] Research on data traceability from the perspective of data element circulation started relatively late, has produced limited results, has not yet formed a systematic framework, and leans toward empirical work. [Conclusions] Further research could proceed in five directions: integrating with the data element market to make data delivery and use routine; accelerating work on data traceability standards to institutionalize data use; continuously improving the quality of data traceability information to raise the quality of data services; giving high priority to the security of data traceability information to standardize the use of data; and building high-standard data traceability platforms to promote the healthy development of the data element market.

Keywords: Data Circulation; Data Traceability; Management Model; Data Factor
Received: 2021-12-15      Published: 2022-02-22
Chinese Library Classification (ZTFLH):  TP391
Funding: *This paper is one of the research outputs of the Youth Project of the National Social Science Fund of China (18CSH018).
Corresponding author: Qian Chengjiang, ORCID: 0000-0002-0559-005X     E-mail: qiancj_njtech@163.com
Cite this article:
Wang Xiaoqing, Sun Zhanwei, Wu Junhong, Du Ziran, Qian Chengjiang. Research Progress of Data Traceability from the Perspective of Data Element Circulation. Data Analysis and Knowledge Discovery, 2022, 6(1): 43-54.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2022.0017      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2022/V6/I1/43
Fig.1  Data circulation diagram
Fig.2  Cluster analysis of literature search results
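Application domain | Application characteristics: role | Application characteristics: technologies/models used
Major emergencies | Public opinion management and control | Blockchain, artificial intelligence, big data, etc.
E-commerce | Product tracing, protection against information tampering | Blockchain, etc.
Enterprise operations | Data management, indicator management | W7, OPM, etc.
Scientific research | Data storage, data sharing | Blockchain, data identification technology, etc.
Table 1  Comparison of applications across domains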
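Copyright © 2015 Editorial Office of Data Analysis and Knowledge Discovery
Address: No. 33 Beisihuan Xilu, Zhongguancun, Haidian District, Beijing  Postal code: 100190
Tel/Fax: (010)82626611-6626, 82624938
E-mail: jishu@mail.las.ac.cn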