Comprehensive Management System and Technical Framework of Data Quality in the Data Circulation Transaction Scenario
Huang Qianqian1,2, Zhao Zheng2, Liu Zhaoyin3
1 School of Information Resource Management, Renmin University of China, Beijing 100872, China
2 Department of Big Data Development, State Information Center, Beijing 100045, China
3 Department of Research and Consulting, Greater Bay Area Big Data Research Institute, Shenzhen 518048, China
Abstract [Objective] To strengthen the management of data circulation and improve the rules governing data circulation transactions, this paper constructs a comprehensive data quality management system and technical framework for the data circulation transaction scenario, centered on the key concerns of data product quality evaluation and management. [Methods] Through a literature review, we surveyed current research on data quality assessment and commonly used data quality inspection methods in China and abroad. Combining industry experience with specific data transaction scenarios, we propose a quality evaluation model covering raw data sets, desensitized data sets, modeled data, and AI-based data, along with a management system for improving data quality before, during, and after data transactions. [Results] This paper proposes a data quality evaluation model for the transaction context based on a “6543” structure: six types of main indicators, five types of subjects, four types of products, and three types of evaluation methods. It also provides testing and optimization solutions for data normativeness and completeness in the pre-transaction phase, data accuracy and consistency during the transaction phase, and data timeliness and accessibility in the post-transaction phase. [Limitations] The data quality model and management system have not yet been applied systematically in real transaction scenarios and lack practical testing. [Conclusions] The proposed quality evaluation model and quality management system support the evaluation and improvement of data product quality throughout the whole data transaction process.
Received: 18 December 2021
Published: 22 February 2022
Corresponding Authors:
Zhao Zheng
E-mail: pmlzzz0426@163.com