Data Analysis and Knowledge Discovery  2023, Vol. 7 Issue (1): 1-21    DOI: 10.11925/infotech.2096-3467.2022.0472
Cross-Lingual Sentiment Analysis: A Survey
Xu Yuemei, Cao Han, Wang Wenqing, Du Wanze, Xu Chengyang
School of Information Science and Technology, Beijing Foreign Studies University, Beijing 100089, China
Abstract  

[Objective] This paper reviews the research development of cross-lingual sentiment analysis (CLSA). [Coverage] We searched the Web of Science database with the query "TS=cross lingual sentiment OR cross lingual word embedding" and selected 90 representative papers for this review. [Methods] We elaborate the following CLSA methods in detail: (1) early CLSA methods, including those based on machine translation and its improved variants, parallel corpora, or bilingual sentiment lexicons; (2) CLSA based on cross-lingual word embedding; (3) CLSA based on Multi-BERT and other pre-trained models. [Results] We analyze their main ideas, methodologies, and shortcomings, and summarize the languages covered, the datasets used, and the reported performance. We find that although pre-trained models such as Multi-BERT achieve good performance on zero-shot cross-lingual sentiment analysis, challenges such as language sensitivity remain, and early CLSA methods still offer insights for current research. [Limitations] Some CLSA models are hybrids; they are classified here according to their dominant method. [Conclusions] We discuss the future development of CLSA and the challenges facing the field. As research deepens into the multilingual semantics captured by pre-trained models, CLSA models that cover more, and more diverse, languages will be the future direction.

Key words: Cross-Lingual; Multi-lingual; Sentiment Analysis; Bilingual Word Embedding
Received: 11 May 2022      Published: 16 February 2023
CLC number: TP391
Fund:Fundamental Research Funds for the Central Universities(2022JJ006)
Corresponding Author: Xu Yuemei, ORCID: 0000-0002-0223-7146, E-mail: xuyuemei@bfsu.edu.cn

Cite this article:

Xu Yuemei, Cao Han, Wang Wenqing, Du Wanze, Xu Chengyang. Cross-Lingual Sentiment Analysis: A Survey. Data Analysis and Knowledge Discovery, 2023, 7(1): 1-21.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2022.0472     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2023/V7/I1/1

Cross-Lingual Sentiment Analysis Based on Machine Translation
Author | Model | Characteristics | Data Source | Language Pair: Accuracy/%
He [16] | LSM | Translates a source-language sentiment lexicon to obtain prior knowledge of target-language sentiment words, which is incorporated into an LDA model for learning | Chinese product review data | EN-ZH 81.41
Zhang et al. [17] | ATTM | Based on training-set selection: puts labeled samples highly similar to the target language into the training set, building a target-language-centric cross-lingual sentiment classifier | Test set: COAE2014; training set: labeled Chinese dataset from the Institute of Computing Technology, Chinese Academy of Sciences | ZH-DE 84.3; ZH-EN 87.7; ZH-FR 80.1; ZH-ES 83.3
Al-Shabi et al. [18] | SVM, NB, KNN | Builds standard datasets to optimize machine translation, identifies the best baseline model, and establishes the relationship between noise in machine-translated data and sentiment classification accuracy | Amazon product reviews | EN-AR —
Hajmohammadi et al. [19] | MLMV | Uses labeled data from multiple source languages as the training set, overcoming the generalization problems caused by machine translation from a single source language into the target language | Amazon product reviews; Pan Reviews dataset | EN+DE-FR 79.85; EN+FR-DE 81.55; EN+FR-JA 73.73; EN+JA-ZH 76.65
Hajmohammadi et al. [20] | DBAST | After converting unlabeled target-language documents into the source language via machine translation, selects the most informative and reliable samples for labeling to enrich the training data | Amazon product reviews; Pan Reviews dataset | EN-FR 78.63; EN-ZH 71.36; EN-JA 70.04
Hajmohammadi et al. [21] | Graph-based semi-supervised learning model | Proposes a multi-view semi-supervised learning model that incorporates unlabeled target-language data, i.e., adds learning of the target language's intrinsic structure to document-level analysis | Amazon product reviews; Pan Reviews dataset | EN-ZH 73.81; EN-JA 72.72
Lu et al. [22] | Joint | Combines a sentiment-labeled bilingual parallel corpus with unlabeled parallel data to simultaneously learn better monolingual sentiment classifiers for each language | MPQA; NTCIR-EN; NTCIR-CH; ISI Chinese-English parallel corpus | EN-ZH 83.54; ZH-EN 79.29
Meng et al. [23] | CLMM | Labels target-language text without relying on machine translation; learns sentiment words from an unlabeled parallel corpus by parameter fitting, expanding vocabulary coverage | MPQA; NTCIR-EN; NTCIR-CH; ISI Chinese-English parallel corpus | EN-ZH 83.02
Gao et al. [24] | BLP | Builds a bilingual word graph from a parallel corpus and word alignments, learning a target-language sentiment lexicon from an existing source-language (English) lexicon | General Inquirer Lexicon; ISI Chinese-English parallel corpus; NTCIR sentiment corpus | EN-ZH 78.90
Zhou et al. [25] | NMF | Proposes a subspace learning framework that exploits a small amount of document-aligned parallel data plus bilingual non-parallel data to narrow the gap between the source and target languages | Amazon product reviews | EN-FR 81.83; EN-DE 80.45; EN-JA 75.78; FR-EN 79.47; DE-EN 79.56; JA-EN 78.79
Representative Researches on Early Cross-Lingual Sentiment Analysis
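The translate-then-classify pipeline shared by the machine-translation-based methods above can be sketched as follows. Everything here is a toy stand-in: the word-level Spanish-English dictionary replaces a real MT system, and the small sentiment lexicon replaces a classifier trained on labeled source-language data.

```python
# Machine-translation-based CLSA sketch (hypothetical data throughout):
# train/define a sentiment classifier only in the source language (English),
# then translate target-language (Spanish) text into the source language
# and reuse that classifier unchanged.

ES_TO_EN = {  # toy "machine translation" table (hypothetical)
    "excelente": "excellent", "producto": "product",
    "terrible": "terrible", "servicio": "service",
}

POSITIVE = {"excellent", "great", "good"}   # toy source-language lexicon
NEGATIVE = {"terrible", "bad", "awful"}

def translate_to_source(text: str) -> str:
    """Stand-in for MT: word-by-word lookup, unknown words kept as-is."""
    return " ".join(ES_TO_EN.get(w, w) for w in text.lower().split())

def classify_source(text: str) -> str:
    """Source-language (English) lexicon classifier."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

def cross_lingual_classify(target_text: str) -> str:
    """Translate the target-language text, then classify in the source language."""
    return classify_source(translate_to_source(target_text))
```

The noise such a pipeline inherits from translation errors is exactly what methods like Al-Shabi et al. [18] quantify, and what multi-source methods like MLMV [19] try to average out.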
Structure of CLSA Based on Parallel Corpora
Schematic of CLWE in English and Spanish
Method | Main Idea | Pros / Cons
Supervised | Relies on large amounts of bilingual parallel text | Pros: uses the embedding-space information contained in parallel text as a reference, effectively guaranteeing mapping quality. Cons: bilingual parallel corpora, especially large-scale ones, are hard to obtain.
Semi-supervised | Uses a small heuristic bilingual seed dictionary as mapping anchors to learn a transfer matrix | Pros: requires only a small seed dictionary, which is easy to obtain. Cons: essentially substitutes the mapping matrix that aligns the seed-dictionary word space for the mapping matrix of the whole space, which may not represent the full source-to-target mapping.
Fully unsupervised | Exploits large-scale non-parallel corpora, learning the bilingual transformation matrix with models such as generative adversarial networks and autoencoder-decoders | Pros: needs no parallel corpus or bilingual dictionary. Cons: initialization is not robust, so results are sensitive to the initial solution; without supervision signals, it easily falls into local optima.
Classification and Summarization of Cross-Lingual Word Embedding Generation
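The semi-supervised route in the table above can be illustrated with a toy least-squares mapping: given a seed dictionary of translation pairs, learn a linear map W that sends source embeddings onto their target counterparts. The 2-D embeddings and seed pairs below are invented for illustration; real systems use high-dimensional embeddings and typically an SVD-based orthogonal (Procrustes) solution rather than this plain least squares.

```python
# Learn W minimizing ||W X - Y|| over seed pairs, via W = Y Xt (X Xt)^-1,
# where columns of X are source seed vectors and columns of Y their
# target-language translations. Pure-Python linear algebra for 2-D toys.

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inv2x2(M):  # closed-form inverse; enough for this 2-D illustration
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def learn_mapping(src_vecs, tgt_vecs):
    """Least-squares W with W @ x ≈ y for each seed pair (x, y)."""
    X, Y = transpose(src_vecs), transpose(tgt_vecs)
    Xt = transpose(X)
    return matmul(matmul(Y, Xt), inv2x2(matmul(X, Xt)))

# Toy seed dictionary: the target space is the source space rotated 90°.
src = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
tgt = [[0.0, 1.0], [-1.0, 0.0], [-1.0, 1.0]]
W = learn_mapping(src, tgt)

def project(x):
    """Map a source-language vector into the target embedding space."""
    return [row[0] * x[0] + row[1] * x[1] for row in W]
```

For these seed pairs the learned W recovers the 90° rotation [[0, -1], [1, 0]], so any source vector outside the seed dictionary (e.g. `[2.0, 0.0]`) is also projected correctly; the "cons" row above corresponds to the case where seed pairs do not represent the whole space and this generalization fails.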
Structure of Cross-Lingual Word Embedding Based on Unsupervised Approach
Author | Model | Characteristics | Data Source | Language Pair: Accuracy/%
Chen et al. [46] | RBST | Models language discrepancy as fixed transfer vectors between the source and target languages under each polarity, and determines target-document sentiment from these vectors | Amazon product reviews; Weibo review data | EN-ZH 81.5
Abdalla et al. [47] | SVM; LR classifiers | Computes the transformation matrix from the source to the target vector space using word pairs obtained via machine translation | Google News dataset; Spanish Billion Words corpus; Wikipedia data; Google trillion-word corpus; Chinese hotel review dataset | EN-ZH F: 77.0; EN-ES F: 81.0
Dong et al. [48] | DC-CNN | Encodes latent sentiment information into cross-lingual word vectors based on an annotated bilingual parallel corpus | SST movie reviews; TA travel-site reviews; AC French TV-series reviews; SE16-T5 restaurant reviews; AFF Amazon Fine Food reviews | EN-ES 85.93; EN-NL 79.30; EN-RU 93.26; EN-DE 92.31; EN-CS 93.69; EN-IT 96.48; EN-FR 92.97; EN-JA 88.08
Akhtar et al. [49] | Bilingual-SGNS | Builds word embeddings for both languages with a bilingual skip-gram model with negative sampling, maps them into the same space, and applies them to fine-grained aspect-level sentiment analysis | Hindi ABSA dataset; English SemEval-2014 dataset | EN-HI multilingual setting: 76.29; cross-lingual setting: 60.39
Atrio et al. [50] | SVM; SNN; BiLSTM | Reorders words in the target language to improve sentiment analysis of short texts | OpeNER corpus; Catalan MultiBooked dataset | EN-ES binary F=65.1, 4-class F=35.8; EN-CA binary F=65.6, 4-class F=38.1
Peirsman et al. [51] | Cross-Lingual Selectional Preferences Model | Uses a small seed dictionary of bilingual cognates as the initial solution to construct a bilingual word-vector space and generate bilingual word vectors | TiGer corpus; AMT | ES-EN 47.0; DE-EN 48.0
Vulić et al. [52] | MuPTM | Uses a multilingual probabilistic topic model to turn one-to-many word mappings into a one-to-one seed dictionary, which serves as the initial solution for generating cross-lingual word vectors | Wikipedia articles | ES-EN 89.1; IT-EN 88.2
Artetxe et al. [38] | Self-Learning Framework | Constructs the seed dictionary from similarities between the two languages' monolingual word embeddings | EN-IT dataset; ukWaC+Wikipedia+BNC; itWaC; Europarl; OPUS; SdeWaC; 2.8-billion-word Common Crawl corpus; RG-65 & WordSim-353 cross-lingual datasets | EN-IT 37.27; EN-DE 39.60; EN-FI 28.16
Chen et al. [53] | Ermes | Uses emoji as complementary sentiment supervision to obtain sentence representations that fuse sentiment information across the source and target languages | Amazon product reviews; Twitter data | EN-JA 80.17; EN-FR 86.5; EN-DE 86.6
Barnes et al. [54] | BLSE | With a small bilingual dictionary and labeled source-language sentiment data, learns transformation matrices that project both languages into a shared vector space while carrying sentiment information | OpeNER; MultiBooked datasets | EN-ES binary F=80.3, 4-class F=50.3; EN-CA binary F=85.0, 4-class F=53.9; EN-EU binary F=73.5, 4-class F=50.5
Gouws et al. [55] | BilBOWA | Generates cross-lingual word vectors without supervision from coarse bilingual data, using an optimized word-similarity computation | Reuters RCV1/RCV2 multilingual corpora; EuroParl | EN-DE 86.5; DE-EN 75.0
Barone [44] | AAE | First to use adversarial autoencoders to map source-language word vectors into the target-language vector space | Wikipedia corpus; Reuters corpus; 2015 News Commentary corpus | EN-IT —; EN-DE —
Shen et al. [56] | TL-AAE-BiGRU | Learns from bilingual parallel text with an adversarial autoencoder, mapping both languages into the same vector space via a linear transformation matrix | Amazon product reviews | EN-ZH F: 78.57; EN-DE —
Artetxe et al. [57] | Vecmap | Constructs the initial solution with the unsupervised Vecmap model, removing the dependence on a small seed dictionary | EN-IT dataset; EuroParl; OPUS | EN-IT 48.13; EN-DE 48.19; EN-FI 32.63; EN-ES 37.33
Rasooli et al. [58] | NBLR+POSwemb; LSTM | Uses multiple source languages to narrow the source-target gap, and applies annotation projection and direct transfer to build robust sentiment analysis systems for low-resource languages | Twitter data; SentiPer; SemEval 2017 Task 4; BQ; EuroParl; LDC; GIZA++; Wikipedia articles | Single-source setting: EN-ZH F: 66.8; EN-DE F: 51.0; EN-SV F: 49.0; EN-HR, EN-HU, EN-FA, EN-PL, etc. reported in [58]. Multi-source setting: F: 54.7; Polish F: 54.6; F: 54.0; Arabic, Bulgarian, Chinese, Croatian, etc. reported in [58]
Cross-Lingual Sentiment Analysis Based on CLWE
Structure of Cross-Lingual Sentiment Analysis Based on GAN
Author | Model | Task | Advantages | Disadvantages | Datasets
Pires et al. [75] | Multilingual BERT | Zero-shot cross-lingual model transfer | Performs well on zero-shot cross-lingual tasks, especially when the source and target languages are similar | Shows systematic deficiencies in the multilingual representations of certain language pairs | Code-Switching Hindi-English, Universal Dependencies Corpus
Lample et al. [76] | XLM | Cross-lingual representation in pre-trained models | Uses parallel corpora to guide representation alignment, improving the cross-lingual representations of pre-trained models | Relatively small training data, especially for low-resource languages | MultiUN, IIT Bombay Corpus, EUbookshop Corpus
Conneau et al. [77] | XLM-RoBERTa | Cross-lingual classification, sequence labeling, and question answering | Large-scale multilingual pre-training yields strong performance on cross-lingual classification, sequence labeling, and question answering | The model contains many code-mixed synthetic tokens, which keeps it from capturing the intrinsic meaning of sentences | Common Crawl Corpus in 100 Languages, Wikipedia Corpus
Xia et al. [78] | MetaXL | Multilingual transfer for cross-lingual sentiment analysis | Brings the target and source languages closer in representation space, with good transfer performance | Placing multiple transformation networks at multiple layers of the pre-trained model remains unexplored | Amazon product reviews, SentiPers, Sentiraama
Bataa et al. [79] | ELMo, ULMFiT, BERT | Sentiment classification for Japanese | Applies knowledge-transfer techniques and pre-trained models to Japanese sentiment classification | Did not perform K-fold cross-validation | Japanese Rakuten Review Binary, Five Class, Yahoo Datasets
Gupta et al. [80] | BERT, Multi-BERT, etc. | Task-oriented pre-training and cross-lingual transfer for sentiment analysis | Well targeted and performs well; can serve as a baseline model for future sentiment analysis tasks | Cross-lingual transfer on certain datasets is unsatisfactory and does not significantly improve performance | Tamil-English, Malayalam-English, SentiMix Hinglish
Cross-Lingual Sentiment Analysis Based on Pre-Trained Model
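The zero-shot setting the pre-trained models above enable can be sketched in miniature: because a multilingual encoder maps all languages into one shared space, a classifier trained only on source-language embeddings applies unchanged to target-language inputs. The 2-D "embeddings" and German test phrases below are hand-made stand-ins for encoder outputs (hypothetical data), not real Multi-BERT vectors.

```python
# Zero-shot cross-lingual transfer sketch: train a tiny linear classifier
# on source-language (English) sentence embeddings, then apply it directly
# to target-language (German) embeddings from the same shared space.

# Toy shared space: the first coordinate loosely encodes sentiment polarity.
EN_TRAIN = [([2.0, 0.3], 1), ([1.5, -0.2], 1),   # positive English sentences
            ([-1.8, 0.1], 0), ([-2.2, 0.4], 0)]  # negative English sentences
DE_TEST = {"sehr gut": [1.7, 0.2], "ganz schlecht": [-1.9, 0.3]}

def train_perceptron(data, epochs=20, lr=0.1):
    """Plain perceptron over the shared embedding space."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            pred = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
            err = y - pred
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def predict(w, b, x):
    return "positive" if w[0] * x[0] + w[1] * x[1] + b > 0 else "negative"

# Train on English only; German sentences were never seen during training.
w, b = train_perceptron(EN_TRAIN)
```

The same decision boundary classifies the unseen German embeddings correctly only insofar as the shared space really is language-neutral, which is exactly the "language sensitivity" caveat raised for Multi-BERT in the table above.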
[1] Shanahan J G, Grefenstette G, Qu Y, et al. Mining Multilingual Opinions Through Classification and Translation[C]// Proceedings of the AAAI Spring Symposium. Menlo Park, CA: AAAI, 2004.
[2] Wan X J. Using Bilingual Knowledge and Ensemble Techniques for Unsupervised Chinese Sentiment Analysis[C]// Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing. USA: Association for Computational Linguistics, 2008: 553-561.
[3] Vulić I, Moens M F. Monolingual and Cross-Lingual Information Retrieval Models Based on (Bilingual) Word Embeddings[C]// Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval. USA: ACM, 2015: 363-372.
[4] Mikolov T, Chen K, Corrado G, et al. Efficient Estimation of Word Representations in Vector Space[OL]. arXiv Preprint, arXiv: 1301.3781.
[5] Balahur A, Mihalcea R, Montoyo A. Computational Approaches to Subjectivity and Sentiment Analysis: Present and Envisaged Methods and Applications[J]. Computer Speech & Language, 2014, 28(1): 1-6.
[6] Banea C, Mihalcea R, Wiebe J, et al. Multilingual Subjectivity Analysis Using Machine Translation[C]// Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing. USA: Association for Computational Linguistics, 2008: 127-135.
[7] Martín-Valdivia M T, Martínez-Cámara E, Perea-Ortega J M, et al. Sentiment Polarity Detection in Spanish Reviews Combining Supervised and Unsupervised Approaches[J]. Expert Systems with Applications, 2013, 40(10): 3934-3942.
doi: 10.1016/j.eswa.2012.12.084
[8] Prettenhofer P, Stein B. Cross-Language Text Classification Using Structural Correspondence Learning[C]// Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. 2010: 1118-1127.
[9] Wan X J. Co-Training for Cross-Lingual Sentiment Classification[C]// Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP. USA: Association for Computational Linguistics, 2009: 235-243.
[10] Balahur A, Turchi M. Comparative Experiments Using Supervised Learning and Machine Translation for Multilingual Sentiment Analysis[J]. Computer Speech & Language, 2014, 28(1): 56-75.
[11] Banea C, Mihalcea R, Wiebe J. Multilingual Subjectivity: Are More Languages Better?[C]// Proceedings of the 23rd International Conference on Computational Linguistics. 2010: 28-36.
[12] Hajmohammadi M S, Ibrahim R, Selamat A. Density Based Active Self-Training for Cross-Lingual Sentiment Classification[C]// Proceedings of the 2013 International Conference on Computer Science and Applications. 2014: 1053-1059.
[13] Hajmohammadi M S, Ibrahim R, Selamat A. Bi-View Semi-Supervised Active Learning for Cross-Lingual Sentiment Classification[J]. Information Processing & Management, 2014, 50(5): 718-732.
doi: 10.1016/j.ipm.2014.03.005
[14] Pan J F, Xue G R, Yu Y, et al. Cross-Lingual Sentiment Classification via Bi-View Non-Negative Matrix Tri-Factorization[C]// Proceedings of the 15th Pacific-Asia Conference on Knowledge Discovery and Data Mining. 2011: 289-300.
[15] Wan X J. Bilingual Co-Training for Sentiment Classification of Chinese Product Reviews[J]. Computational Linguistics, 2011, 37(3): 587-616.
doi: 10.1162/COLI_a_00061
[16] He Y L. Latent Sentiment Model for Weakly-Supervised Cross-Lingual Sentiment Classification[C]// Proceedings of the 2011 European Conference on Information Retrieval Lecture Notes in Computer Science. 2011: 214-225.
[17] Zhang P, Wang S G, Li D Y. Cross-Lingual Sentiment Classification: Similarity Discovery Plus Training Data Adjustment[J]. Knowledge-Based Systems, 2016, 107: 129-141.
doi: 10.1016/j.knosys.2016.06.004
[18] Al-Shabi A, Adel A, Omar N, et al. Cross-Lingual Sentiment Classification from English to Arabic Using Machine Translation[J]. International Journal of Advanced Computer Science and Applications, 2017, 8(12): 434-440.
[19] Hajmohammadi M S, Ibrahim R, Selamat A. Cross-Lingual Sentiment Classification Using Multiple Source Languages in Multi-View Semi-Supervised Learning[J]. Engineering Applications of Artificial Intelligence, 2014, 36: 195-203.
doi: 10.1016/j.engappai.2014.07.020
[20] Hajmohammadi M S, Ibrahim R, Selamat A, et al. Combination of Active Learning and Self-Training for Cross-Lingual Sentiment Classification with Density Analysis of Unlabelled Samples[J]. Information Sciences, 2015, 317: 67-77.
doi: 10.1016/j.ins.2015.04.003
[21] Hajmohammadi M S, Ibrahim R, Selamat A. Graph-Based Semi-Supervised Learning for Cross-Lingual Sentiment Classification[C]// Proceedings of the 2015 Asian Conference on Intelligent Information and Database Systems. 2015: 97-106.
[22] Lu B, Tan C, Cardie C, et al. Joint Bilingual Sentiment Classification with Unlabeled Parallel Corpora[C]// Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics:Human Language Technologies. 2011: 320-330.
[23] Meng X, Wei F, Liu X, et al. Cross-Lingual Mixture Model for Sentiment Classification[C]// Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics. 2012: 572-581.
[24] Gao D, Wei F R, Li W J, et al. Cross-Lingual Sentiment Lexicon Learning with Bilingual Word Graph Label Propagation[J]. Computational Linguistics, 2015, 41: 21-40.
doi: 10.1162/COLI_a_00207
[25] Zhou G, He T, Zhao J, Wu W. A Subspace Learning Framework For Cross-Lingual Sentiment Classification With Partial Parallel Data[C]// Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI). Palo Alto, California USA: AAAI Press / International Joint Conferences on Artificial Intelligence, 2015: 1426-1432.
[26] Gao Yingfan, Wang Huilin, Xu Hongjiao. Progress in Research on Cross-Language Text Categorization Technology[J]. Information Studies: Theory & Application, 2010, 33(11): 126-128. (in Chinese)
[27] Duh K, Fujino A, Nagata M. Is Machine Translation Ripe for Cross-Lingual Sentiment Classification?[C]// Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics:Human Language Technologies. 2011: 429-433.
[28] Mihalcea R, Banea C, Wiebe J. Learning Multilingual Subjective Language via Cross-Lingual Projections[C]// Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics. 2007: 976-983.
[29] Darwich M, Noah S A M, Omar N. Automatically Generating a Sentiment Lexicon for the Malay Language[J]. Asia-Pacific Journal of Information Technology and Multimedia, 2016, 5(1): 49-59.
doi: 10.17576/apjitm-2016-0501-05
[30] Nasharuddin N A, Abdullah M T, Azman A, et al. English and Malay Cross-Lingual Sentiment Lexicon Acquisition and Analysis[C]// Proceedings of the 2017 International Conference on Information Science and Applications. 2017: 467-475.
[31] Sazzed S. Development of Sentiment Lexicon in Bengali Utilizing Corpus and Cross-Lingual Resources[C]// Proceedings of the 21st International Conference on Information Reuse and Integration for Data Science. IEEE, 2020: 237-244.
[32] Vania C M, Ibrahim A M. Sentiment Lexicon Generation for an Under-Resourced Language[J]. International Journal of Computational Linguistics and Applications, 2014, 5(1): 59-72.
[33] Chang C H, Wu M L, Hwang S Y. An Approach to Cross-Lingual Sentiment Lexicon Construction[C]// Proceedings of the 2019 IEEE International Congress on Big Data. IEEE, 2019: 129-131.
[34] He X X, Gao S X, Yu Z T, et al. Sentiment Classification Method for Chinese and Vietnamese Bilingual News Sentence Based on Convolution Neural Network[C]// Proceedings of the 2018 International Conference on Mechatronics and Intelligent Robotics. 2018: 1230-1239.
[35] Zabha N I, Ayop Z, Anawar S, et al. Developing Cross-Lingual Sentiment Analysis of Malay Twitter Data Using Lexicon-Based Approach[J]. International Journal of Advanced Computer Science and Applications, 2019, 10(1): 346-351.
[36] Pennington J, Socher R, Manning C D. GloVe: Global Vectors for Word Representation[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. 2014: 1532-1543.
[37] Peters M, Neumann M, Iyyer M, et al. Deep Contextualized Word Representations[C]// Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies. 2018: 2227-2237.
[38] Artetxe M, Labaka G, Agirre E. Learning Bilingual Word Embeddings with (Almost) No Bilingual Data[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. USA: Association for Computational Linguistics, 2017: 451-462.
[39] Faruqui M, Dyer C. Improving Vector Space Word Representations Using Multilingual Correlation[C]// Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics. USA: Association for Computational Linguistics, 2014: 462-471.
[40] Zou W Y, Socher R, Cer D, et al. Bilingual Word Embeddings for Phrase-Based Machine Translation[C]// Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. USA: Association for Computational Linguistics, 2013: 1393-1398.
[41] Vulić I, Moens M F. Bilingual Word Embeddings from Non-Parallel Document-Aligned Data Applied to Bilingual Lexicon Induction[C]// Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing. USA: Association for Computational Linguistics, 2015: 719-725.
[42] Ruder S, Vulić I, Søgaard A. A Survey of Cross-Lingual Word Embedding Models[J]. Journal of Artificial Intelligence Research, 2019, 65: 569-631.
doi: 10.1613/jair.1.11640
[43] Vulić I, Korhonen A. On the Role of Seed Lexicons in Learning Bilingual Word Embeddings[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. USA: Association for Computational Linguistics, 2016: 247-257.
[44] Barone A V M. Towards Cross-Lingual Distributed Representations Without Parallel Text Trained with Adversarial Autoencoders[C]// Proceedings of the 1st Workshop on Representation Learning for NLP. USA: Association for Computational Linguistics, 2016: 121-126.
[45] Peng Xiaoya, Zhou Dong. Survey of Cross-Lingual Word Embedding[J]. Journal of Chinese Information Processing, 2020, 34(2): 1-15. (in Chinese)
[46] Chen Q, Li C L, Li W J. Modeling Language Discrepancy for Cross-Lingual Sentiment Analysis[C]// Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. USA: ACM, 2017: 117-126.
[47] Abdalla M, Hirst G. Cross-Lingual Sentiment Analysis Without (Good) Translation[C]// Proceedings of the 8th International Joint Conference on Natural Language Processing. Sweden: Association for Computational Linguistics, 2017: 462-471.
[48] Dong X,de Melo G. Cross-Lingual Propagation for Deep Sentiment Analysis[C]// Proceedings of the 32nd Conference on Artificial Intelligence. 2018: 5771-5778.
[49] Akhtar M S, Sawant P, Sen S, et al. Improving Word Embedding Coverage in Less-Resourced Languages Through Multi-Linguality and Cross-Linguality[J]. ACM Transactions on Asian and Low-Resource Language Information Processing, 2019, 18(2): 1-22.
[50] Atrio À R, Badia T, Barnes J. On the Effect of Word Order on Cross-Lingual Sentiment Analysis[OL]. arXiv Preprint, arXiv: 1906.05889.
[51] Peirsman Y, Padó S. Cross-Lingual Induction of Selectional Preferences with Bilingual Vector Spaces[C]// Proceedings of the 11th Annual Conference of the North American Chapter of the Association for Computational Linguistics. 2010: 921-929.
[52] Vulić I, Moens M F. A Study on Bootstrapping Bilingual Vector Spaces from Non-Parallel Data (and Nothing Else)[C]// Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. USA: Association for Computational Linguistics, 2013: 1613-1624.
[53] Chen Z P, Shen S, Hu Z N, et al. Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification[C]// Proceedings of the 2019 World Wide Web Conference. New York: ACM Press, 2019: 251-262.
[54] Barnes J, Klinger R, im Walde S S. Bilingual Sentiment Embeddings: Joint Projection of Sentiment across Languages[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. USA: Association for Computational Linguistics, 2018: 2483-2493.
[55] Gouws S, Bengio Y, Corrado G. BilBOWA: Fast Bilingual Distributed Representations Without Word Alignments[C]// Proceedings of the 2015 International Conference on Machine Learning. PMLR, 2015: 748-756.
[56] Shen J H, Liao X D, Lei S. Cross-Lingual Sentiment Analysis via AAE and BiGRU[C]// Proceedings of the 2020 Asia-Pacific Conference on Image Processing, Electronics and Computers. IEEE, 2020: 237-241.
[57] Artetxe M, Labaka G, Agirre E. A Robust Self-Learning Method for Fully Unsupervised Cross-Lingual Mappings of Word Embeddings[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. USA: Association for Computational Linguistics, 2018: 789-798.
[58] Rasooli M S, Farra N, Radeva A, et al. Cross-Lingual Sentiment Transfer with Limited Resources[J]. Machine Translation, 2018, 32(1-2): 143-165.
doi: 10.1007/s10590-017-9202-6
[59] Hermann K M, Blunsom P. Multilingual Distributed Representations Without Word Alignment[OL]. arXiv Preprint, arXiv: 1312.6173.
[60] Chandar A P S, Lauly S, Larochelle H, et al. An Autoencoder Approach to Learning Bilingual Word Representations[C]// Proceedings of the 2014 Conference and Workshop on Neural Information Processing Systems. 2014.
[61] Gouws S, Søgaard A. Simple Task-Specific Bilingual Word Embeddings[C]// Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies. USA: Association for Computational Linguistics, 2015: 1386-1390.
[62] Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative Adversarial Networks[J]. Communications of the ACM, 2020, 63(11): 139-144.
doi: 10.1145/3422622
[63] Chen X L, Sun Y, Athiwaratkun B, et al. Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification[J]. Transactions of the Association for Computational Linguistics, 2018, 6: 557-570.
doi: 10.1162/tacl_a_00039
[64] Feng Y L, Wan X J. Towards a Unified End-to-End Approach for Fully Unsupervised Cross-Lingual Sentiment Analysis[C]// Proceedings of the 23rd Conference on Computational Natural Language Learning. USA: Association for Computational Linguistics, 2019: 1035-1044.
[65] Antony A, Bhattacharya A, Goud J, et al. Leveraging Multilingual Resources for Language Invariant Sentiment Analysis[C]// Proceedings of the 22nd Annual Conference of the European Association for Machine Translation. 2020: 71-79.
[66] Conneau A, Lample G, Ranzato M A, et al. Word Translation Without Parallel Data[OL]. arXiv Preprint, arXiv: 1710.04087.
[67] Wang W C, Feng S, Gao W, et al. Personalized Microblog Sentiment Classification via Adversarial Cross-Lingual Multi-Task Learning[C]// Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. USA: Association for Computational Linguistics, 2018: 338-348.
[68] Kandula H, Min B N. Improving Cross-Lingual Sentiment Analysis via Conditional Language Adversarial Nets[C]// Proceedings of the 3rd Workshop on Computational Typology and Multilingual NLP. USA: Association for Computational Linguistics, 2021: 32-37.
[69] Ganin Y, Ustinova E, Ajakan H, et al. Domain-Adversarial Training of Neural Networks[J]. The Journal of Machine Learning Research, 2016, 17(1): 2096-2030.
[70] Long M, Cao Z, Wang J, et al. Conditional Adversarial Domain Adaptation[C]// Proceedings of the 2018 Conference on Neural Information Processing Systems. 2018.
[71] Pelicon A, Pranjić M, Miljković D, et al. Zero-Shot Learning for Cross-Lingual News Sentiment Classification[J]. Applied Sciences, 2020, 10(17): 5993.
doi: 10.3390/app10175993
[72] Devlin J, Chang M W, Lee K, et al. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding[OL]. arXiv Preprint, arXiv: 1810.04805.
[73] Brown T, Mann B, Ryder N, et al. Language Models are Few-Shot Learners[C]// Proceedings of the 2020 Conference on Neural Information Processing Systems. 2020, 33: 1877-1901.
[74] Qiu X P, Sun T X, Xu Y G, et al. Pre-Trained Models for Natural Language Processing: A Survey[J]. Science China Technological Sciences, 2020, 63(10): 1872-1897.
doi: 10.1007/s11431-020-1647-3
[75] Pires T, Schlinger E, Garrette D. How Multilingual is Multilingual BERT?[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. USA: Association for Computational Linguistics, 2019: 4996-5001.
[76] Lample G, Conneau A. Cross-Lingual Language Model Pretraining[OL]. arXiv Preprint, arXiv: 1901.07291.
[77] Conneau A, Khandelwal K, Goyal N, et al. Unsupervised Cross-Lingual Representation Learning at Scale[OL]. arXiv Preprint, arXiv: 1911.02116.
[78] Xia M, Zheng G, Mukherjee S, et al. MetaXL: Meta Representation Transformation for Low-Resource Cross-Lingual Learning[OL]. arXiv Preprint, arXiv: 2104.07908.
[79] Bataa E, Wu J. An Investigation of Transfer Learning-Based Sentiment Analysis in Japanese[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. USA: Association for Computational Linguistics, 2019: 4652-4657.
[80] Gupta A, Rallabandi S K, Black A W. Task-Specific Pre-Training and Cross Lingual Transfer for Sentiment Analysis in Dravidian Code-Switched Languages[C]// Proceedings of the 1st Workshop on Speech and Language Technologies for Dravidian Languages. 2021: 73-79.
[81] Hossain E, Sharif O, Hoque M M. NLP-CUET@LT-EDI-EACL2021: Multilingual Code-Mixed Hope Speech Detection Using Cross-Lingual Representation Learner[C]// Proceedings of the 1st Workshop on Language Technology for Equality, Diversity and Inclusion. 2021: 168-174.
[82] Howard J, Ruder S. Universal Language Model Fine-Tuning for Text Classification[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. USA: Association for Computational Linguistics, 2018: 328-339.
[83] Barbieri F, Camacho-Collados J, Espinosa Anke L, et al. TweetEval: Unified Benchmark and Comparative Evaluation for Tweet Classification[OL]. arXiv Preprint, arXiv: 2010.12421.
[84] Pikuliak M, Šimko M, Bieliková M. Cross-Lingual Learning for Text Processing: A Survey[J]. Expert Systems with Applications, 2021, 165: 113765.
doi: 10.1016/j.eswa.2020.113765
[85] Schwenk H, Li X. A Corpus for Multilingual Document Classification in Eight Languages[OL]. arXiv Preprint, arXiv: 1805.09821.
[86] Dong X, de Melo G. A Robust Self-Learning Framework for Cross-Lingual Text Classification[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. USA: Association for Computational Linguistics, 2019: 6306-6310.
[87] Houlsby N, Giurgiu A, Jastrzebski S, et al. Parameter-Efficient Transfer Learning for NLP[C]// Proceedings of the 2019 International Conference on Machine Learning. PMLR, 2019: 2790-2799.
[88] Pfeiffer J, Vulić I, Gurevych I, et al. MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer[C]// Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. USA: Association for Computational Linguistics, 2020: 7654-7673.
[89] Lachraf R, Nagoudi E M B, Ayachi Y, et al. ArbEngVec: Arabic-English Cross-Lingual Word Embedding Model[C]// Proceedings of the 4th Arabic Natural Language Processing Workshop. USA: Association for Computational Linguistics, 2019: 40-48.
[90] Khalid U, Beg M O, Arshad M U. RUBERT: A Bilingual Roman Urdu BERT Using Cross Lingual Transfer Learning[OL]. arXiv Preprint, arXiv: 2102.11278.