Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (7): 141-151    DOI: 10.11925/infotech.2096-3467.2021.1462
Original article
A Text-Aligned Cross-Language Sentiment Classification Method Based on Adversarial Networks
Yang Wenli, Li Nana
School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
Abstract  

[Objective] This paper aims to improve the accuracy of cross-language sentiment classification by narrowing the distribution gap between bilingual text pairs in the shared space. [Methods] During sentiment knowledge transfer, we aligned word pairs and text pairs simultaneously, weighting the two with a balance coefficient. Then, we optimized the conversion matrix adversarially against a language discriminator. Finally, we used a multi-feature fusion hierarchical neural network to represent the texts, their contexts, and the topic relevance of words and sentences, which addresses long-distance feature dependencies in the texts. [Results] On the NLP&CC 2013 standard datasets, the model achieved an average cross-language sentiment classification accuracy of 83.66%, which was 2.30% higher than the benchmark model. [Limitations] The method was tested only on Chinese and English datasets. More research is needed to evaluate its effectiveness with other languages. [Conclusions] Improving the similarity of bilingual texts can effectively increase the accuracy of cross-language sentiment classification.
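
The abstract does not give the alignment objective itself. As a rough illustration only, the sketch below assumes the balance coefficient is a convex weight `alpha` between a word-level and a text-level alignment loss, and that each loss is the mean squared distance between source vectors mapped by a linear conversion matrix `w` and their target-language counterparts. All names (`apply_map`, `mean_sq_dist`, `joint_alignment_loss`, `alpha`) and the loss form are assumptions made for this example, not the authors' formulation.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]
Pair = Tuple[Vector, Vector]  # (source-language vector, target-language vector)


def apply_map(w: Matrix, x: Vector) -> Vector:
    """Project a source-language vector into the shared space with matrix w."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


def mean_sq_dist(w: Matrix, pairs: Sequence[Pair]) -> float:
    """Mean squared Euclidean distance between mapped source and target vectors."""
    total = 0.0
    for src, tgt in pairs:
        mapped = apply_map(w, src)
        total += sum((m - t) ** 2 for m, t in zip(mapped, tgt))
    return total / len(pairs)


def joint_alignment_loss(
    w: Matrix,
    word_pairs: Sequence[Pair],
    text_pairs: Sequence[Pair],
    alpha: float,
) -> float:
    """Assumed objective: alpha * word-level loss + (1 - alpha) * text-level loss."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    word_loss = mean_sq_dist(w, word_pairs)
    text_loss = mean_sq_dist(w, text_pairs)
    return alpha * word_loss + (1.0 - alpha) * text_loss


# Toy usage: identity mapping, one word pair already aligned, one text pair not.
identity = [[1.0, 0.0], [0.0, 1.0]]
words = [([1.0, 0.0], [1.0, 0.0])]
texts = [([0.0, 1.0], [0.0, 3.0])]
print(joint_alignment_loss(identity, words, texts, alpha=0.25))
```

Under this assumed form, `alpha = 1` reduces to word alignment alone and `alpha = 0` to text alignment alone, so tuning `alpha` trades one alignment signal against the other.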

Key words: Word Alignment; Text Alignment; Generative Adversarial Network; Multi-Feature Fusion; Hierarchical Neural Network
Received: 28 December 2021      Published: 24 August 2022
CLC number (ZTFLH): TP391
Fund: National Natural Science Foundation of China (61806072)
Corresponding author: Li Nana, ORCID: 0000-0002-5517-6033, E-mail: linana@scse.hebut.edu.cn

Cite this article:

Yang Wenli, Li Nana. A Text-Aligned Cross-Language Sentiment Classification Method Based on Adversarial Networks. Data Analysis and Knowledge Discovery, 2022, 6(7): 141-151.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2021.1462     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2022/V6/I7/141

Cross-Language Sentiment Classification Model
Feature Extractor
Vector Space
Cross-lingual Vector Space
Adversarial Model
No.  Comment example
1    This is a hot movie recently.
2    The water in the hotel is too hot and burns people.
Comment Examples
Bilingual Sentiment Dictionary
Dataset       Language  DVD     Book    Music
Training set  English   35 000  41 000  41 000
Training set  Chinese   30 000  35 000  40 000
Test set      Chinese   500     500     500
Dataset
Accuracy Changes with Similarity
Classification Results as the Text Proportion Changes
Changes in Results Before and After Optimization
Dataset  Metric  SD     SWD    SWAB   SWC    SS-BiDocv
DVD      Acc/%   75.22  76.00  76.39  80.97  83.52
DVD      Pre/%   74.15  74.89  76.60  76.54  82.02
DVD      Rec/%   77.12  78.22  76.00  76.54  82.00
Book     Acc/%   76.47  78.25  80.93  78.26  84.26
Book     Pre/%   74.68  75.33  89.58  79.28  83.62
Book     Rec/%   80.10  84.00  70.00  76.52  85.18
Music    Acc/%   75.25  78.31  77.39  79.26  83.20
Music    Pre/%   76.34  77.25  75.00  80.96  81.50
Music    Rec/%   73.96  80.26  82.18  76.52  85.90
Feature Fusion Classification Results
Accuracy Under Different Thresholds
Classification accuracy/%
Method           DVD    Book   Music  Average
MT(En-Ch)        70.32  75.40  74.26  73.33
MT(Ch-En)        76.25  76.00  73.21  75.15
SCL-CLSC         82.60  82.90  78.95  81.48
CLWEs            82.92  83.00  81.13  82.35
BLSE             79.36  77.95  81.20  79.50
AttLSTM-CLSC     81.22  82.50  80.66  81.46
ACNN-AMT         80.58  81.85  81.22  81.22
BSWE             81.60  81.05  79.40  80.68
Proposed method  83.52  84.26  83.20  83.66
Comparison of Optimal Model Results
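
The Average column in the comparison table is the unweighted mean of the DVD, Book, and Music accuracies. The short check below recomputes it from the per-domain values copied from the table and reports the margin of the paper's own method (the last row) over each baseline. The dictionary and helper names (`ACCURACY`, `average`, `margins_over_baselines`) and the key `"Proposed method"` are labels chosen for this example.

```python
from typing import Dict, Tuple

# Per-domain accuracy (%) in the order (DVD, Book, Music), copied from the table.
ACCURACY: Dict[str, Tuple[float, float, float]] = {
    "MT(En-Ch)": (70.32, 75.40, 74.26),
    "MT(Ch-En)": (76.25, 76.00, 73.21),
    "SCL-CLSC": (82.60, 82.90, 78.95),
    "CLWEs": (82.92, 83.00, 81.13),
    "BLSE": (79.36, 77.95, 81.20),
    "AttLSTM-CLSC": (81.22, 82.50, 80.66),
    "ACNN-AMT": (80.58, 81.85, 81.22),
    "BSWE": (81.60, 81.05, 79.40),
    "Proposed method": (83.52, 84.26, 83.20),
}


def average(scores: Tuple[float, ...]) -> float:
    """Unweighted mean over the three domains."""
    return sum(scores) / len(scores)


def margins_over_baselines(
    table: Dict[str, Tuple[float, float, float]],
    ours: str = "Proposed method",
) -> Dict[str, float]:
    """Difference in average accuracy (percentage points) between ours and each baseline."""
    ours_avg = average(table[ours])
    return {
        name: round(ours_avg - average(scores), 2)
        for name, scores in table.items()
        if name != ours
    }


if __name__ == "__main__":
    for method, scores in ACCURACY.items():
        print(f"{method:16s} average = {average(scores):.2f}")
    for method, margin in margins_over_baselines(ACCURACY).items():
        print(f"margin over {method:14s} = {margin:.2f} points")
```

Recomputing the means this way reproduces every value in the Average column to two decimal places, including 83.66 for the last row.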