[Objective] This paper proposes a new model to address a weakness of traditional text classifiers, which tend to misclassify commodity titles that carry different labels but similar modifiers. [Methods] First, we designed a text discriminator as an auxiliary task, using the normalized Euclidean distance between text vectors of different labels as its loss function. Then, we applied the cross-entropy loss of traditional text classification to train the new text encoder. Finally, we generated text representations with sufficient discrimination among commodity categories and constructed the ITR-BiLSTM-Attention model. [Results] Compared with the BiLSTM-Attention model without the text discriminator, the proposed model improved accuracy, precision, recall and F1 by 1.84%, 2.31%, 2.88% and 2.82%, respectively. Compared with the Cos-BiLSTM-Attention model, it improved accuracy, precision, recall and F1 by 0.53%, 0.54%, 1.21% and 1.01%, respectively. [Limitations] The impact of different sampling methods on the model was not tested, and we did not conduct experiments on larger data sets. [Conclusions] The text discriminator auxiliary task designed in this paper improves the text representation generated by the text encoder, and the item categorization model based on the improved representation outperforms traditional ones.
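The abstract's joint objective — a discriminator term built from the normalized Euclidean distance between text vectors of different labels, combined with the conventional cross-entropy classification loss — can be sketched as follows. This is a minimal illustration only: the function names, the pairwise averaging scheme, and the weighting hyperparameter `alpha` are assumptions for exposition, not the paper's published formulation.

```python
import math

def normalized_euclidean(u, v):
    # L2-normalize each vector, then take the Euclidean distance;
    # after normalization the distance lies in [0, 2].
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return math.sqrt(sum((a / nu - b / nv) ** 2 for a, b in zip(u, v)))

def discriminator_loss(vectors, labels):
    # Auxiliary task: push encoder outputs of texts with DIFFERENT labels
    # apart by minimizing the negative mean pairwise normalized distance.
    total, pairs = 0.0, 0
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if labels[i] != labels[j]:
                total += normalized_euclidean(vectors[i], vectors[j])
                pairs += 1
    return -total / max(pairs, 1)

def cross_entropy(probs, targets):
    # Conventional classification loss on predicted class probabilities.
    return -sum(math.log(p[t]) for p, t in zip(probs, targets)) / len(targets)

def joint_loss(probs, targets, vectors, alpha=0.1):
    # Combined objective: classification loss plus the weighted
    # discriminator term (alpha is an illustrative hyperparameter).
    return cross_entropy(probs, targets) + alpha * discriminator_loss(vectors, targets)
```

Minimizing `joint_loss` trains the encoder to classify correctly while separating representations of differently labeled titles, which is the intuition behind the improved text representation.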
Tu Zhenchao, Ma Jing. Item Categorization Algorithm Based on Improved Text Representation[J]. Data Analysis and Knowledge Discovery, 2022, 6(5): 34-43.