Classifying Short Texts with Improved-Attention Based Bidirectional Long Memory Network
Zhiyong Tao1, Xiaobing Li1,2, Ying Liu1, Xiaofang Liu1
1 School of Electronic and Information Engineering, Liaoning Technical University, Huludao 125105, China
2 Fuxin Lixing Technology Co., Ltd., Fuxin 123000, China
Abstract [Objective] This paper proposes a new model based on a bidirectional long short-term memory (BLSTM) network with an improved attention mechanism, to address the challenges of short-text classification. [Methods] First, we used pre-trained word vectors to convert the original texts into numerical representations. Second, we extracted their semantic features with a bidirectional long short-term memory network. Third, in the improved attention layer, we fused the forward and backward features and used them to calculate global attention scores. Finally, we obtained vector representations of the short texts that carry deep semantic features. [Results] We used Softmax to predict the sample labels. Compared with the traditional CNN, LSTM and BLSTM networks, the proposed model improved classification accuracy by up to 19.1%. [Limitations] The model does not perform satisfactorily on long texts. [Conclusions] The proposed model can effectively classify short texts.
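The attention and classification steps outlined in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the formulas, so the element-wise sum used to fuse the forward and backward states, the tanh scoring function, and all names (`attention_pool`, `classify`, `w`, `W_out`, `b_out`) are assumptions chosen for illustration, and random arrays stand in for real BLSTM outputs.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def attention_pool(h_fwd, h_bwd, w):
    """Pool per-token states into one sentence vector.

    h_fwd, h_bwd: (T, d) forward and backward hidden states.
    w: (d,) scoring vector (learned in a real model).
    Returns the sentence vector (d,) and the attention weights (T,).
    """
    h = h_fwd + h_bwd            # ASSUMPTION: fuse directions by element-wise sum
    scores = np.tanh(h) @ w      # ASSUMPTION: tanh scoring, one score per token
    alpha = softmax(scores)      # attention weights over all T tokens, sum to 1
    sentence = alpha @ h         # weighted sum of the fused states
    return sentence, alpha

def classify(sentence, W_out, b_out):
    """Softmax output layer: class probabilities for one sentence vector."""
    return softmax(W_out @ sentence + b_out)

# Random stand-ins for BLSTM outputs: 6 tokens, 8 hidden units, 3 classes.
rng = np.random.default_rng(0)
T, d, n_classes = 6, 8, 3
h_fwd = rng.normal(size=(T, d))
h_bwd = rng.normal(size=(T, d))
w = rng.normal(size=d)
W_out = rng.normal(size=(n_classes, d))
b_out = np.zeros(n_classes)

sentence, alpha = attention_pool(h_fwd, h_bwd, w)
probs = classify(sentence, W_out, b_out)
predicted_label = int(np.argmax(probs))
```

In a trained model, `w`, `W_out` and `b_out` would be learned jointly with the BLSTM; here they only show the shapes involved and how the attention weights turn a sequence of token states into a single vector for the Softmax layer.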
Received: 07 March 2019
Published: 25 December 2019
Corresponding Authors:
Xiaobing Li
E-mail: lixiaobing_lgd@163.com