[Objective] This paper proposes a transfer-learning-based privacy classifier for social network images, which gives users reasonable hints so that they do not accidentally upload private information. [Methods] A new standard image dataset was created by collecting and annotating images from the Weibo platform. Deep transfer learning, with fine-tuning of several pre-trained image models, was then applied to automatically classify whether a Weibo image contains private information. [Results] With the same amount of data, transfer learning improved accuracy by at least 30 percent compared with non-transfer learning approaches. Most ResNet architectures exceeded 88% accuracy with transfer learning. Among them, ResNet50 achieved the highest recall (94.31%), accuracy (90.80%), and F1 score (91.11%), as well as the shortest testing time (148 s). Based on a comprehensive assessment of these metrics, ResNet50 is recommended as the most suitable model architecture for the current scenario. [Limitations] The amount of labeled data in this study is relatively small and may not cover all types of private information. [Conclusions] This study validates the feasibility and efficiency of deep transfer learning for classifying private Weibo images. The results can be applied to other social media platforms to warn users about the risk of privacy leakage, and the annotated image dataset can serve as both a foundation and a point of comparison for further research.
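The recall, accuracy, and F1 figures reported above follow the usual definitions for a two-class (private vs. public) classifier. The following is a minimal sketch of those definitions, not the authors' evaluation code: the function names, the label encoding (1 = private, 0 = public), and the toy labels are assumptions made for illustration. The last line solves F1 = 2PR / (P + R) for precision, which the abstract does not report, using the stated ResNet50 recall and F1.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a two-class task.

    `positive` marks the "private" class (an assumed label encoding).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return BinaryMetrics(accuracy, precision, recall, f1)


def implied_precision(recall, f1):
    """Solve F1 = 2PR / (P + R) for P, given recall R and F1."""
    return f1 * recall / (2 * recall - f1)


# Made-up toy labels (1 = private, 0 = public); not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
toy = binary_metrics(y_true, y_pred)
print(toy)

# Precision implied by the reported ResNet50 recall (94.31%) and F1 (91.11%).
resnet50_precision = implied_precision(0.9431, 0.9111)
print(f"{resnet50_precision:.4f}")
```

Under these definitions, the reported recall and F1 imply a precision of roughly 88%. A recall higher than precision means the classifier errs toward flagging images as private, which suits a warning tool where a missed private image costs more than an unnecessary hint.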
Wang Shuyi, Liu Sai, Ma Zheng. Microblog Image Privacy Classification with Deep Transfer Learning[J]. Data Analysis and Knowledge Discovery, 2020, 4(10): 80-92.