1 School of Information Management, Nanjing University, Nanjing 210023, China
2 Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing 210023, China
3 Nanjing Customs District, P.R. China, Nanjing 210001, China
[Objective] This study uses patterns in Harmonized System (HS) codes to provide effective knowledge services for China's customs taxation. [Methods] We proposed two machine learning-based automatic classification schemes. The first used the original HS codes directly as risk identifiers, while the second relied on the correctness of the HS codes. We also built an SVM prediction model and examined the two schemes in terms of target structure, features, and text length. [Results] The second scheme required less training effort and processing time, and achieved better accuracy. [Limitations] Only four months of data were used to train the new models. [Conclusions] This study finds an effective way to forecast customs risks and indicates directions for applicable products.