Identifying Tax Audit Cases with Multi-task Learning
Li Guofeng1, Li Zuojuan1, Wang Zheji1, Wu Meng2
1 School of Statistics, Shandong University of Finance and Economics, Jinan 250014, China
2 School of Economics, Shandong University of Finance and Economics, Jinan 250014, China

Abstract [Objective] This paper integrates tax-related data from multiple sources and uses machine learning methods to identify corporate tax evasion. [Methods] First, we used web scraping, text mining, and other methods to collect the corporations' financial data, executive information, and media coverage. Second, we used the random forest method for feature selection and established indicators for the candidate companies. Third, we built a discriminant model using multi-task sparse structure learning with an improved focal loss function. Finally, we trained the model on different types of tax audits to identify the candidates. [Results] We examined the model on real-world datasets and found that it performed well across various applications. Its mean recall reached 0.8309, which was 0.1351 and 0.1033 higher than that of the logistic method and of traditional multi-task sparse structure learning, respectively. [Limitations] The model still needs to be examined with datasets from companies that are not listed. [Conclusions] The new model can identify target enterprises engaged in various types of tax evasion. This study provides new directions for smart tax audit by the government.
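
The abstract does not spell out how the focal loss was improved, so the following is only a minimal sketch of the standard binary focal loss of Lin et al., on which the improved version is presumably based. The function names `focal_loss` and `mean_focal_loss` are hypothetical, and the defaults `gamma=2.0` and `alpha=0.25` are the values commonly used with the original focal loss, not settings reported in this paper.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Standard binary focal loss for one sample (not the paper's improved variant).

    p     -- predicted probability of the positive class (e.g. "audit case")
    y     -- true label, 1 for positive and 0 for negative
    gamma -- focusing parameter; larger values down-weight easy samples more
    alpha -- weight of the positive class; negatives get 1 - alpha
    """
    # Probability that the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Clamp to avoid log(0).
    p_t = min(max(p_t, eps), 1.0 - eps)
    # The factor (1 - p_t)^gamma shrinks the loss of well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Average focal loss over a batch of predictions."""
    losses = [focal_loss(p, y, gamma, alpha) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)
```

With `gamma=0` and `alpha=0.5` the expression reduces to half the ordinary cross-entropy, which makes the role of the focusing term easy to check: raising `gamma` shifts the training signal toward hard, often minority-class, samples such as the relatively rare audited companies.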

Received: 30 September 2021
Published: 25 January 2022
Fund: National Social Science Fund of China (19BTJ023)
Corresponding Author: Li Zuojuan, E-mail: lzj901231@163.com