Data Analysis and Knowledge Discovery, 2018, Vol. 2, Issue 2: 58-63. DOI: 10.11925/infotech.2096-3467.2017.0809
Data Masking Analysis Based on Internet Big Data
Qianyi Zhou, Yamin Wang, Chuang Wang
School of Economics and Management, Xidian University, Xi’an 710126, China
[Objective] This paper aims to improve how records are divided into anonymous groups and thereby obtain a better data masking model and algorithm. [Methods] First, we modified the criteria for judging dimensions under k-anonymity. Second, we used a KD tree as the storage structure to construct a new algorithm. Third, we implemented the proposed algorithm in Python. Finally, we evaluated the feasibility and effectiveness of the new algorithm using the number of anonymous groups and the NCP (normalized certainty penalty) percentage. [Results] The new algorithm maximized the number of anonymous groups generated from the whole dataset, and its NCP percentage was lower than that of similar algorithms. [Limitations] For highly dispersed datasets, the loop computation over dimensions was cumbersome. [Conclusions] The proposed algorithm improves the usability of the anonymous groups and reduces data loss.
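
The abstract names the ingredients (k-anonymity, KD-tree partitioning, the number of anonymous groups, and the NCP percentage) but not the details of the modified dimension criterion. The following Python sketch is therefore not the authors' algorithm. It is a generic KD-tree-style median split, assuming numeric quasi-identifiers, a "widest normalized range" rule for choosing the split dimension, and one common normalization of NCP (group size times the sum of normalized ranges, divided by the number of records times the number of dimensions). The names `domain_of`, `kd_partition`, and `ncp_percentage` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Record = Tuple[float, ...]
Domain = List[Tuple[float, float]]  # (min, max) of each quasi-identifier


def domain_of(records: Sequence[Record]) -> Domain:
    """Global (min, max) of every quasi-identifier column."""
    return [(min(col), max(col)) for col in zip(*records)]


def normalized_range(group: Sequence[Record], d: int, domain: Domain) -> float:
    """Range of dimension d inside the group, as a fraction of the global range."""
    lo, hi = domain[d]
    if hi == lo:
        return 0.0
    vals = [r[d] for r in group]
    return (max(vals) - min(vals)) / (hi - lo)


def kd_partition(records: Sequence[Record], k: int, domain: Domain) -> List[List[Record]]:
    """Split at the median position of the widest dimension while both halves keep >= k records.

    Assumption: this is a relaxed (position-based) split, so records with
    equal values may land on both sides of the cut.
    """
    if len(records) < 2 * k:
        return [list(records)]  # cannot split without breaking k-anonymity
    d = max(range(len(domain)), key=lambda i: normalized_range(records, i, domain))
    ordered = sorted(records, key=lambda r: r[d])
    mid = len(ordered) // 2
    return kd_partition(ordered[:mid], k, domain) + kd_partition(ordered[mid:], k, domain)


def ncp_percentage(groups: Sequence[Sequence[Record]], domain: Domain) -> float:
    """Information loss in percent: 0 means no generalization, 100 means full suppression."""
    n = sum(len(g) for g in groups)
    total = sum(
        len(g) * sum(normalized_range(g, d, domain) for d in range(len(domain)))
        for g in groups
    )
    return 100.0 * total / (n * len(domain))


# Illustrative (made-up) records: (age, income)
records = [(25, 50000), (27, 52000), (31, 48000), (35, 61000),
           (42, 75000), (45, 72000), (51, 90000), (58, 88000)]
dom = domain_of(records)
groups = kd_partition(records, k=2, domain=dom)
print(len(groups), "anonymous groups; NCP % =", ncp_percentage(groups, dom))
```

Under these assumptions, more groups for a fixed k generally means tighter generalization ranges and a lower NCP percentage, which is the trade-off the two evaluation measures in the abstract capture.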

Key words: Data Masking; k-anonymity; Integer Division
Received: 15 August 2017      Published: 07 March 2018

Qianyi Zhou, Yamin Wang, Chuang Wang. Data Masking Analysis Based on Internet Big Data. Data Analysis and Knowledge Discovery, 2018, 2(2): 58-63.

