Data Analysis and Knowledge Discovery  2019, Vol. 3 Issue (1): 72-84    DOI: 10.11925/infotech.2096-3467.2018.0506
Identifying Risks of HS Codes by China Customs
Zixuan Zhang1,2, Hao Wang1,2, Liping Zhu1,2,3, Sanhong Deng1,2
1School of Information Management, Nanjing University, Nanjing 210023, China
2Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing 210023, China
3Nanjing Customs District, P.R. China, Nanjing 210001, China
[Objective] This study mines patterns in HS codes to provide effective knowledge services for China Customs taxation. [Methods] We proposed two machine learning-based automatic classification schemes. The first used the original HS codes directly as risk identifiers, while the second relied on the correctness of the HS codes. We also built an SVM prediction model and examined the two schemes in terms of target structure, features, and text length. [Results] The model under the second scheme required less training effort and processing time and achieved better accuracy. [Limitations] Only four months of data were used to train the new models. [Conclusions] This study identifies an effective way to forecast customs risks and indicates directions for applicable products.
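
The difference between the two schemes' target structures can be illustrated with a minimal sketch. It reflects one plausible reading of the abstract, not the authors' implementation: in the first scheme the classifier's target is the HS code itself, and a declaration is flagged when the predicted code differs from the declared one; in the second, the target is a binary label saying whether the declared code is correct. The `Declaration` record, its field names, the helper functions, and the sample codes are all hypothetical and for illustration only; no classifier is trained here.

```python
from dataclasses import dataclass


@dataclass
class Declaration:
    """Hypothetical record layout; field names are assumptions."""
    description: str   # free-text goods description
    declared_hs: str   # HS code declared by the importer
    verified_hs: str   # HS code confirmed after customs review


def scheme1_targets(records):
    """Scheme 1 (assumed): multi-class target, one class per HS code."""
    return [r.verified_hs for r in records]


def scheme2_targets(records):
    """Scheme 2 (assumed): binary target, 1 = declared code is correct."""
    return [int(r.declared_hs == r.verified_hs) for r in records]


def scheme1_is_risky(predicted_hs, declared_hs):
    """Flag a declaration when the predicted code disagrees with it."""
    return predicted_hs != declared_hs


def scheme2_is_risky(predicted_correct):
    """Flag a declaration when the model predicts 'incorrect' (0)."""
    return predicted_correct == 0


# Toy values only; not authoritative tariff classifications.
records = [
    Declaration("cotton t-shirt, knitted", "610910", "610910"),
    Declaration("steel knife, fixed blade", "821192", "821191"),
]

print(scheme1_targets(records))
print(scheme2_targets(records))
```

Under this reading, the first scheme has as many classes as there are distinct HS codes, while the second always has two, which is consistent with the abstract's finding that the second scheme needs less training effort and processing time.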

Key words: Risk Identification; HS Prediction; SVM; Text Classification; Machine Learning
Received: 07 May 2018      Published: 04 March 2019

Zixuan Zhang, Hao Wang, Liping Zhu, Sanhong Deng. Identifying Risks of HS Codes by China Customs. Data Analysis and Knowledge Discovery, 2019, 3(1): 72-84.

