Data Analysis and Knowledge Discovery  2019, Vol. 3 Issue (1): 72-84    DOI: 10.11925/infotech.2096-3467.2018.0506
Research Paper
Identifying Risks of HS Codes by China Customs*
Zixuan Zhang1,2, Hao Wang1,2, Liping Zhu1,2,3, Sanhong Deng1,2
1School of Information Management, Nanjing University, Nanjing 210023, China
2Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing 210023, China
3Nanjing Customs District, P.R. China, Nanjing 210001, China
Abstract

[Objective] This study exploits regularities in HS code data to provide an effective knowledge service for assessing customs tax revenue risk. [Methods] We propose two machine-learning-based automatic classification schemes for judging HS code risk: the first takes the HS code itself as the prediction target, and the second takes the correctness of the declared HS code as the target. For each scheme we build an SVM prediction model that accounts for the structure of the code target, the nature of the features, and the length of the text, and we run the corresponding experiments. [Results] Comparing the two schemes, we find that the second places lower demands on the training data, predicts faster, and identifies risk more effectively. [Limitations] Only four months of data were available, so the sample may not be fully representative. [Conclusions] Testing yielded a classifier with a relatively high risk prediction rate, which provides a sound knowledge base for building a practical classification model and risk identification system.

Keywords: Risk Identification; HS Code Prediction; SVM; Text Classification; Machine Learning
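The difference between the two schemes in the abstract lies mainly in how the training target is built. The following is a minimal sketch of that difference only, not the authors' implementation: the `Declaration` record, its field names, the way the declared code is appended to the text in scheme 2, and the sample codes are all assumptions made for illustration. No SVM is trained here; the sketch stops at the (text, label) pairs that a classifier would consume.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Declaration:
    """Hypothetical declaration record; field names are illustrative assumptions."""
    description: str    # free-text goods description
    declared_code: str  # HS code filed by the declarant
    verified_code: str  # HS code confirmed after customs review


def scheme1_example(d: Declaration) -> Tuple[str, str]:
    """Scheme 1: the HS code itself is the (multi-class) target."""
    return d.description, d.verified_code


def scheme2_example(d: Declaration) -> Tuple[str, int]:
    """Scheme 2: the target is whether the declared code is correct (1) or not (0).

    Appending the declared code to the text is an assumed feature design.
    """
    text = f"{d.description} {d.declared_code}"
    label = int(d.declared_code == d.verified_code)
    return text, label


def is_risky_under_scheme1(predicted_code: str, declared_code: str) -> bool:
    """With scheme 1, a declaration is flagged when prediction and filing disagree."""
    return predicted_code != declared_code


# Illustrative values only; the codes are not checked against any tariff schedule.
sample = Declaration("stainless steel kitchen knife", "82119200", "82119100")
print(scheme1_example(sample))
print(scheme2_example(sample))
```

Under this reading, scheme 1 needs enough examples of every HS code it may have to predict, while scheme 2 only ever separates two classes, which is consistent with the abstract's remark that the second scheme is less demanding of training data.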
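Received: 2018-05-07
Funding: *This work is one of the research outcomes of the Postgraduate Research & Practice Innovation Program of Jiangsu Province project "Research on Risk Analysis and Avoidance in Customs Commodity Classification in the Big Data Environment" (Project No. SJCX18_0009) and of the "Nanjing Customs Tax Revenue Big Data Analysis Consulting Project".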
Cite this article:
Zixuan Zhang, Hao Wang, Liping Zhu, Sanhong Deng. Identifying Risks of HS Codes by China Customs*[J]. Data Analysis and Knowledge Discovery, 2019, 3(1): 72-84. DOI: 10.11925/infotech.2096-3467.2018.0506.
Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2018.0506