Abstract
[Objective] This study addresses the performance degradation that multilingual models suffer on new language tasks due to catastrophic forgetting. [Methods] We propose mLMs-EWC, a continual learning-based multilingual sentiment analysis model. By integrating continual learning into multilingual language models, the models can acquire new language features while retaining the linguistic characteristics of previously learned languages. [Results] The mLMs-EWC model outperforms Multi-BERT by approximately 5.2% and 4.5% on French and English tasks, respectively. We also evaluated our approach on a lightweight distillation model, where it achieved a 24.7% improvement on the English task. [Limitations] This study focuses on three widely used languages; its generalization to other languages requires further validation. [Conclusions] The proposed model alleviates catastrophic forgetting in multilingual sentiment analysis tasks and achieves continual learning across multilingual datasets. The code is available at https://github.com/flutter85/mLMs-EWC/tree/master.
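The model name indicates that the continual-learning idea used here is Elastic Weight Consolidation (EWC), which penalizes changes to parameters that were important for previously learned tasks. A minimal sketch of the standard EWC penalty follows; the function name, the plain-list parameter representation, and the example values are illustrative assumptions, not the paper's actual implementation:

```python
# Minimal sketch of the Elastic Weight Consolidation (EWC) penalty
# (hypothetical helper, not the paper's code). Parameters that were
# important for a previous language task (high Fisher value) are pulled
# back toward their earlier values; unimportant ones can move freely.

def ewc_penalty(params, old_params, fisher, lam=1.0):
    """Quadratic EWC regularizer: (lam / 2) * sum_i F_i * (theta_i - theta_i*)^2.

    params     -- current model parameters (list of floats)
    old_params -- parameters learned on the previous language task
    fisher     -- diagonal Fisher information (per-parameter importance)
    lam        -- strength of the consolidation term
    """
    return 0.5 * lam * sum(
        f * (p - p_old) ** 2
        for p, p_old, f in zip(params, old_params, fisher)
    )

# Toy example: only the first parameter has moved, and it is the
# important one, so it dominates the penalty.
penalty = ewc_penalty([1.0, 2.0], [0.5, 2.0], [4.0, 0.1], lam=2.0)
print(penalty)  # 1.0
```

During training on a new language, this penalty would be added to the task loss, so the model trades off fitting the new data against preserving parameters critical to earlier languages.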