Data Analysis and Knowledge Discovery  2019, Vol. 3 Issue (12): 21-29    DOI: 10.11925/infotech.2096-3467.2019.0267
Classifying Short Texts with Improved-Attention Based Bidirectional Long Short-Term Memory Network
Zhiyong Tao1,Xiaobing Li1,2(),Ying Liu1,Xiaofang Liu1
1 School of Electronic and Information Engineering, Liaoning Technical University, Huludao 125105, China
2 Fuxin Lixing Technology Co., Ltd., Fuxin 123000, China
Abstract  

[Objective] This paper proposes a new model based on a bidirectional long short-term memory network with improved attention, aiming to address the issues facing short text classification. [Methods] First, we used pre-trained word vectors to digitize the original texts. Then, we extracted their semantic features with a bidirectional long short-term memory network. Third, we calculated global attention scores from the fused forward and reverse features in the improved attention layer. Finally, we obtained short-text vector representations with deep semantic features and used Softmax to predict the sample labels. [Results] Compared with the traditional CNN, LSTM and BLSTM networks, the proposed model improved classification accuracy by up to 19.1%. [Limitations] The performance of the new model on long texts is not satisfactory. [Conclusions] The proposed model can effectively classify short texts.
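The attention step described above (fusing the forward and reverse BLSTM features, scoring every time step globally, and collapsing the sequence into one vector) can be sketched as follows. This is a minimal NumPy illustration, not the paper's exact formulation: the fusion by concatenation and the dot-product scoring against a learned context vector `w` are common choices assumed here for clarity.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attentive_sentence_vector(h_fwd, h_bwd, w):
    """Fuse forward/reverse BLSTM hidden states, compute global
    attention scores for each time step, and return the
    attention-weighted sentence representation.

    h_fwd, h_bwd: (T, d) hidden states from the two directions.
    w: (2*d,) context/query vector (learned during training).
    """
    h = np.concatenate([h_fwd, h_bwd], axis=-1)  # (T, 2d) fused features
    alpha = softmax(h @ w)                       # (T,) attention weights, sum to 1
    return alpha @ h, alpha                      # (2d,) sentence vector

# Toy example: a 4-step sentence with hidden size 3 per direction.
rng = np.random.default_rng(0)
h_f = rng.standard_normal((4, 3))
h_b = rng.standard_normal((4, 3))
w = rng.standard_normal(6)
vec, alpha = attentive_sentence_vector(h_f, h_b, w)
```

The resulting `vec` would then be passed to the Softmax classification layer to predict the label.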

Keywords: Short Text Classification; Bidirectional Long Short-Term Memory Network; Attention Mechanism
Received: 07 March 2019      Published: 25 December 2019
CLC Number:  TP391.9
Corresponding Authors: Xiaobing Li     E-mail: lixiaobing_lgd@163.com

Cite this article:

Zhiyong Tao, Xiaobing Li, Ying Liu, Xiaofang Liu. Classifying Short Texts with Improved-Attention Based Bidirectional Long Short-Term Memory Network. Data Analysis and Knowledge Discovery, 2019, 3(12): 21-29.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2019.0267     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2019/V3/I12/21

Dataset | Classes | Samples | Training Set | Validation Set | Test Set | Avg. Words | Max Text Length | Total Words
Chinese_news (CNH) | 18 | 192,000 | 156,000 | 18,000 | 18,000 | 12 | 29 | 137,890
MR | 2 | 10,658 | 7,462 | 1,598 | 1,598 | 20 | 57 | 18,159
TREC | 6 | 5,949 | 5,357 | - | 592 | 10 | 35 | 9,337
IMDB | 2 | 50,000 | 25,000 | 12,500 | 12,500 | 239 | 2,525 | 141,902
IMDB_10 | 10 | 50,000 | 25,000 | 12,500 | 12,500 | 239 | 2,525 | 141,902
Yelp | 5 | 35,000 | 25,000 | 5,000 | 5,000 | 129 | 984 | 104,352
Attention | Model | CNH | MR | TREC | IMDB | IMDB_10 | Yelp
Without attention | CNN | 60.0% | 72.1% | 81.7% | 74.4% | 35.1% | 46.0%
 | LSTM | 75.3% | 73.7% | 85.4% | 88.5% | 40.3% | 55.3%
 | BLSTM_ave | 78.7% | 78.7% | 87.3% | 90.8% | 47.4% | 59.3%
 | BLSTM | 78.5% | 80.3% | 89.4% | 89.7% | 44.2% | 61.8%
With attention | ABLSTM | 78.7% | 80.7% | 89.0% | 91.5% | 46.8% | 62.3%
 | HAN | 79.0% | 80.3% | 89.0% | 90.2% | 49.4% | 62.1%
 | IABLSTM | 79.1% | 81.5% | 90.9% | 91.4% | 49.4% | 62.8%
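The abstract's "up to 19.1%" figure can be checked against the table above: the largest gap between IABLSTM and the CNN baseline occurs on the CNH dataset. A quick arithmetic check:

```python
# Accuracies from the CNH column of the results table above.
cnn = 0.600       # CNN baseline on CNH
iablstm = 0.791   # proposed IABLSTM on CNH

# Improvement in percentage points, rounded to one decimal.
gain = round((iablstm - cnn) * 100, 1)
print(gain)  # 19.1
```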