Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (1): 43-54    DOI: 10.11925/infotech.2096-3467.2022.0017
Research Progress of Data Traceability from the Perspective of Data Element Circulation
Wang Xiaoqing1,2,3, Sun Zhanwei1, Wu Junhong4, Du Ziran5, Qian Chengjiang6
1School of Public Administration, Nanjing University of Finance & Economics, Nanjing 210003, China
2College of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
3Hongshan College, Nanjing University of Finance & Economics, Nanjing 210003, China
4Department of Platform Research and Development, Business School, Nanjing Normal University, Nanjing 210023, China
5Department of Platform Research and Development, Greater Bay Area Big Data Research Institute, Shenzhen 518048, China
6Nanjing NJtech Safety Co., Ltd, Nanjing 210047, China
Abstract  

[Objective] This paper reviews the research progress and application scenarios of data traceability, to provide a reference for building data trading platforms, industrial data governance, and digital government governance. [Methods] We summarize and analyze data traceability models, methods, and applications, and on this basis discuss the current state and shortcomings of the research. [Results] In content description, model construction, and scenario application alike, data traceability research has produced rich results, including improvements in the quality, security, and efficiency of data traceability. [Limitations] Research on data traceability from the perspective of data element circulation started relatively late: its results are still limited, a research system has not yet formed, and its focus is biased toward empirical work. [Conclusions] In combination with the data element market, we recommend: (1) promoting the normalization of data delivery and use; (2) accelerating work on data traceability standards to institutionalize data use; (3) continuously improving the quality of data traceability information to raise the quality of data services; (4) attaching great importance to the security of data traceability information to standardize its use; and (5) building a high-standard data traceability platform to support the healthy development of the data element market.

Key words: Data Circulation; Data Traceability; Management Model; Data Factor
Received: 15 December 2021      Published: 22 February 2022
CLC Number (ZTFLH): TP391
Fund: National Social Science Fund of China (18CSH018)
Corresponding Author: Qian Chengjiang, ORCID: 0000-0002-0559-005X, E-mail: qiancj_njtech@163.com

Cite this article:

Wang Xiaoqing, Sun Zhanwei, Wu Junhong, Du Ziran, Qian Chengjiang. Research Progress of Data Traceability from the Perspective of Data Element Circulation. Data Analysis and Knowledge Discovery, 2022, 6(1): 43-54.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2022.0017     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2022/V6/I1/43

Data Flow Process Diagram
Cluster Analysis of Literature Retrieval Results
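Application field | Application characteristics: function | Application characteristics: technologies/models used
Major emergencies | Public opinion control | Blockchain, artificial intelligence, big data, etc.
E-commerce | Product traceability, prevention of information tampering | Blockchain, etc.
Enterprise operations | Data management, indicator management | W7, OPM, etc.
Scientific research | Data storage, data sharing | Blockchain, data identification technology, etc.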
Applications in Various Fields