Data Analysis and Knowledge Discovery  2019, Vol. 3 Issue (1): 72-84    DOI: 10.11925/infotech.2096-3467.2018.0506
Identifying Risks of HS Codes by China Customs
Zixuan Zhang1,2, Hao Wang1,2, Liping Zhu1,2,3, Sanhong Deng1,2
1School of Information Management, Nanjing University, Nanjing 210023, China
2Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing 210023, China
3Nanjing Customs District, P.R.China, Nanjing 210001, China

[Objective] This study uses patterns in HS codes to provide effective knowledge services for China's customs taxation. [Methods] We proposed two machine learning-based automatic classification schemes. The first directly used the original HS codes as risk identifiers, while the second relied on the correctness of the HS codes. We also built an SVM prediction model and examined the two schemes in terms of target structure, features, and text length. [Results] We found that the second model required less training effort and processing time and achieved better accuracy. [Limitations] Only four months of data were used to train the new models. [Conclusions] This study finds an effective way to forecast customs risks and indicates directions for applicable products.
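
The difference between the two schemes lies mainly in what the classifier is asked to predict. The sketch below shows one possible reading of that difference and is not the authors' implementation: the record fields (`description`, `declared_hs`, `verified_hs`), the function names, and the way the declared code is appended to the text in the second scheme are all assumptions made for illustration, and the classifier itself (for example an SVM) is omitted.

```python
from typing import NamedTuple, Tuple


class Declaration(NamedTuple):
    """A hypothetical customs declaration record (field names are assumptions)."""
    description: str   # free-text goods description
    declared_hs: str   # HS code entered by the declarant
    verified_hs: str   # HS code confirmed after review


def scheme1_example(d: Declaration) -> Tuple[str, str]:
    """Scheme 1 (assumed): multi-class target, the label is the HS code itself."""
    return d.description, d.verified_hs


def scheme1_risk(predicted_hs: str, d: Declaration) -> bool:
    """Under scheme 1, a declaration is flagged when prediction and declaration disagree."""
    return predicted_hs != d.declared_hs


def scheme2_example(d: Declaration) -> Tuple[str, int]:
    """Scheme 2 (assumed): binary target, 1 if the declared code is correct, else 0."""
    # The declared code is appended to the text so a classifier can use it as a feature.
    features = f"{d.description} HS{d.declared_hs}"
    return features, int(d.declared_hs == d.verified_hs)


if __name__ == "__main__":
    # Made-up record; the codes are placeholders, not real tariff lines.
    record = Declaration("cotton t-shirt", "11110000", "22220000")
    print(scheme1_example(record))
    print(scheme2_example(record))
```

Under this reading, the first scheme has as many classes as there are HS codes, while the second has only two, which is consistent with the abstract's report that the second model needed less training effort.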

Key words: Risk Identification; HS Prediction; SVM; Text Classification; Machine Learning
Received: 07 May 2018      Published: 04 March 2019

Cite this article:

Zixuan Zhang, Hao Wang, Liping Zhu, Sanhong Deng. Identifying Risks of HS Codes by China Customs. Data Analysis and Knowledge Discovery, 2019, 3(1): 72-84.

