Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (2/3): 167-183    DOI: 10.11925/infotech.2096-3467.2021.1020
Identifying Metaphors and Association of Chinese Idioms with Transfer Learning and Text Augmentation
Zhang Wei, Wang Hao, Chen Yuetong, Fan Tao, Deng Sanhong
School of Information Management, Nanjing University, Nanjing 210023, China
Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing 210023, China
[Objective] This paper identifies sentiment metaphors in Chinese idioms and builds an idiom knowledge graph that links external things (the source domain) with users’ internal attitudes or sentiments (the target domain). [Methods] We propose a metaphor recognition scheme for Chinese idioms based on transfer learning and text augmentation. First, we retrieved idioms and their external categories to obtain the external knowledge, and used a sentiment dictionary to build the learning corpus. Second, we matched the idioms against the dictionary; the matched idioms were used for the first round of transfer learning, and all other sentiment words in the dictionary formed the training set for the second round. Third, because metaphorical idioms carry only weak sentiment semantics on the surface, we augmented their texts with Chinese language knowledge. Fourth, we compared the [CLS] vector of the BERT text embedding with average pooling across mainstream deep learning models. Finally, we hierarchically classified the unmatched idioms with the best model and merged them with the matched idioms to obtain the internal knowledge. [Results] Average pooling was 4.69% more accurate than [CLS], and adding idiom interpretations improved accuracy by a further 13%. Sentiment classification accuracy at all levels of the second transfer reached 80%, with gains of up to 6.25% on small corpora. [Limitations] A larger corpus could further improve the classification accuracy for sentiment categories. [Conclusions] The scheme effectively identifies the sentiment metaphor knowledge of Chinese idioms, and associating internal with external knowledge lays a foundation for better knowledge services.
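The two embedding schemes compared in the fourth step can be illustrated with a minimal sketch. It assumes the encoder returns one vector per token with the [CLS] token at position 0, and that padded positions are flagged by an attention mask; the tiny hand-written vectors stand in for real BERT outputs, and the function names are illustrative rather than taken from the paper.

```python
def cls_pooling(token_vectors):
    """Use the vector at position 0 (the [CLS] token) as the text embedding."""
    return list(token_vectors[0])


def average_pooling(token_vectors, attention_mask):
    """Average the vectors of all unmasked tokens, ignoring padding."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy stand-in for an encoder output: 4 positions x 3 dimensions.
# A real BERT-base model would emit 768-dimensional vectors.
token_vectors = [
    [0.2, 0.1, 0.0],  # [CLS]
    [1.0, 0.0, 2.0],  # first character
    [3.0, 2.0, 0.0],  # second character
    [0.0, 0.0, 0.0],  # [PAD]
]
attention_mask = [1, 1, 1, 0]

cls_vec = cls_pooling(token_vectors)
avg_vec = average_pooling(token_vectors, attention_mask)
```

Whether the average should include the [CLS] and [SEP] positions is a design choice the abstract does not specify; this sketch keeps every unmasked position.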

Key words: Idiom Knowledge Graph; Metaphor Knowledge; Transfer Learning; Text Augmentation; Multi-Layer Sentiment Classification
Received: 11 September 2021      Published: 14 April 2022
ZTFLH:  G202  
Fund: National Natural Science Foundation of China (72074108); Graduate Research and Innovation Projects of Jiangsu Province (KYCX21_0026); Fundamental Research Funds for the Central Universities (010814370113)
Corresponding Author: Wang Hao, ORCID: 0000-0002-0131-0823

Zhang Wei, Wang Hao, Chen Yuetong, Fan Tao, Deng Sanhong. Identifying Metaphors and Association of Chinese Idioms with Transfer Learning and Text Augmentation. Data Analysis and Knowledge Discovery, 2022, 6(2/3): 167-183.

A Framework for Metaphor Knowledge Recognition and Association of Chinese Idioms
Chinese Idiom Ontology Modelling and Application Patterns of Internal and External Associations
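Level-1 class | Level-2 class | Level-3 class | Example idioms
--- | --- | --- | ---
 | | Happiness (快乐, PA) | 春风得意, 逍遥自得, 无拘无束, 逍遥自在
 | | Peace of mind (安心, PE) | 安闲自得, 乐天知命, 独善其身, 清风朗月
 | | Respect (尊敬, PD) | 永垂不朽, 王公大人, 非同寻常, 至高无上
 | | Praise (赞扬, PH) | 百折不挠, 义无反顾, 建功立业, 文质彬彬
 | | Trust (相信, PG) | 季布一诺, 肝胆相照, 山盟海誓, 心领神会
 | | Fondness (喜爱, PB) | 如痴如醉, 手足之情, 依依不舍, 一往情深
 | | Wishing (祝愿, PK) | 万寿无疆, 千秋万岁, 马到成功, 鹏程万里
 | Surprise (惊)* | Surprise (惊奇, PC) | 惊天动地, 光怪陆离, 不期而遇, 匪夷所思
 | | Sadness (悲伤, NB) | 百业萧条, 切肤之痛, 逝者如斯, 肝肠寸断
 | | Disappointment (失望, NJ) | 心灰意懒, 付之东流, 一蹶不振, 心若死灰
 | | Guilt (内疚, NH) | 一念之差, 引咎自责, 后悔莫及, 负荆请罪
 | | Missing (思念, PF) | 睹物思人, 牵肠挂肚, 白云亲舍, 望穿秋水
 | | Panic (慌乱, NI) | 不知所措, 失魂落魄, 心神不定, 燃眉之急
 | | Fear (恐惧, NC) | 战战兢兢, 惶恐不安, 危在旦夕, 不寒而栗
 | | Shame (羞愧, NG) | 无地自处, 面红耳赤, 羞面见人, 狼狈万状
 | Anger (怒)* | Anger (愤怒, NA) | 怒发冲冠, 愤愤不平, 气势汹汹, 怒不可遏
 | | Vexation (烦闷, NE) | 百无聊赖, 忧心忡忡, 辗转反侧, 怅然若失
 | | Hatred (憎恶, ND) | 阿谀谄媚, 乌合之众, 欺世盗名, 不屑一顾
 | | Reproach (贬责, NN) | 目光短浅, 争名夺利, 不可一世, 穷奢极欲
 | | Jealousy (妒忌, NK) | 妒火中烧, 爱毛反裘, 拈酸吃醋, 避面尹邢
 | | Doubt (怀疑, NL) | 迟疑不决, 莫测高深, 众口纷纭, 弓影杯蛇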
Transfer Learning Corpus of Idiom Sentiment Metaphor Recognition
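The corpus split described in the abstract (idioms matched in the sentiment dictionary for the first transfer round, the remaining dictionary words for the second round, unmatched idioms as prediction targets) can be sketched as follows. The dictionary entries, the unmatched idiom, and the function name are toy assumptions for illustration, not the authors' data or code.

```python
def split_by_dictionary(idioms, sentiment_dictionary):
    """Split idioms into those found in the sentiment dictionary and the rest."""
    matched = {w: sentiment_dictionary[w] for w in idioms if w in sentiment_dictionary}
    unmatched = [w for w in idioms if w not in sentiment_dictionary]
    return matched, unmatched


# Toy sentiment dictionary: word -> sentiment category code (hypothetical entries).
sentiment_dictionary = {"春风得意": "PA", "肝肠寸断": "NB", "高兴": "PA"}
# Toy idiom list; the last idiom is assumed to be absent from the dictionary.
idioms = ["春风得意", "肝肠寸断", "画蛇添足"]

matched, unmatched = split_by_dictionary(idioms, sentiment_dictionary)

# Dictionary words that are not matched idioms: extra training data for round two.
other_words = {w: c for w, c in sentiment_dictionary.items() if w not in matched}
```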
Text Data Augmentation of Idioms/Sentiment Words Based on Chinese Language Knowledge
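One simple way to read this augmentation step is that an idiom's dictionary interpretation is appended to the idiom before it is embedded, so the classifier sees explicit sentiment wording alongside the metaphor. The sketch below assumes plain string concatenation; the separator, the function name, and the example gloss are illustrative assumptions, not details confirmed by the paper.

```python
def augment(text, interpretation=None, separator=" "):
    """Append an interpretation to the idiom text when one is available."""
    if not interpretation:
        return text
    return text + separator + interpretation


# Hypothetical gloss written for this example only.
augmented = augment("肝肠寸断", "形容悲痛到了极点")
unchanged = augment("肝肠寸断")
```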
A Deep Learning-Based Sentiment Metaphor Recognition Model for Chinese Idioms
Model | Acc/% | Macro_P/% | Macro_R/% | Macro_F1/% | Positive (4 513) P/% | Positive R/% | Positive F1/% | Negative (3 568) P/% | Negative R/% | Negative F1/%
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
CNN_[CLS] | 73.07 | 73.07 | 72.93 | 72.96 | 73.01 | 69.47 | 71.20 | 73.12 | 76.38 | 74.72
CNN_[AVG] | 77.76 | 78.10 | 77.49 | 77.55 | 80.33 | 70.95 | 75.35 | 75.87 | 84.03 | 79.74
CNN_AS | 80.84 | 81.00 | 80.65 | 80.72 | 82.46 | 76.21 | 79.21 | 79.55 | 85.09 | 82.23
CNN_AE | 90.02 | 90.08 | 90.13 | 90.01 | 87.15 | 92.84 | 89.91 | 93.00 | 87.42 | 90.12
CNN_AES | 90.42 | 90.79 | 90.23 | 90.35 | 93.78 | 85.68 | 89.55 | 87.80 | 94.77 | 91.15
RNN_AES | 89.56 | 89.75 | 89.42 | 89.51 | 91.60 | 86.11 | 88.77 | 87.89 | 92.74 | 90.25
LSTM_AES | 90.67 | 90.78 | 90.57 | 90.63 | 92.08 | 88.11 | 90.05 | 89.48 | 93.03 | 91.22
LSTM_AES | 90.87 | 90.85 | 90.87 | 90.86 | 90.18 | 90.84 | 90.51 | 91.52 | 90.90 | 91.21
BiLSTM_AES | 91.43 | 91.41 | 91.42 | 91.41 | 90.97 | 91.16 | 91.06 | 91.85 | 91.67 | 91.76
BiLSTM_Att_AES | 91.73 | 91.70 | 91.74 | 91.72 | 90.77 | 92.11 | 91.43 | 92.64 | 91.38 | 92.01
Results of Idiom Sentiment Metaphor Recognition Based on Text Data Augmentation
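The macro-averaged columns in the results table are consistent with an unweighted mean of the per-class scores. As a quick check, the sketch below recomputes them for the CNN [CLS] row from its positive and negative precision, recall and F1; the function name is illustrative.

```python
def macro_average(per_class_scores):
    """Unweighted mean of per-class scores (each class counts equally)."""
    return sum(per_class_scores) / len(per_class_scores)


# Per-class scores (%) for the CNN [CLS] row, ordered (positive, negative).
precision = (73.01, 73.12)
recall = (69.47, 76.38)
f1 = (71.20, 74.72)

macro_p = macro_average(precision)
macro_r = macro_average(recall)
macro_f1 = macro_average(f1)
```

The recomputed values agree with the reported 73.07, 72.93 and 72.96 to within rounding.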
Performance of Transfer Learning-Based Hierarchical Sentiment Classification of Idioms
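Level | Parent class | Child class | ST1 P/% | ST1 R/% | ST1 F1/% | ST2 P/% | ST2 R/% | ST2 F1/%
--- | --- | --- | --- | --- | --- | --- | --- | ---
1 | (9 923/27 377) | | 90.77 | 92.11 | 91.43 | 91.82 | 90.95 | 91.38
 | | | 92.64 | 91.38 | 92.01 | 91.75 | 92.55 | 92.14
2 | (4 755/13 257) | | 92.23 | 95.72 | 93.94 | 91.49 | 97.48 | 94.39
 | | | 82.35 | 77.78 | 80.00 | 78.95 | 83.33 | 81.08
 | | | 70.37 | 55.47 | 62.04 | 79.76 | 48.91 | 60.63
 | (5 168/14 120) | | 72.19 | 63.87 | 67.78 | 77.02 | 64.92 | 70.45
 | | | 67.44 | 60.42 | 63.74 | 73.68 | 58.33 | 65.12
 | | | 77.78 | 29.17 | 42.42 | 84.62 | 45.83 | 59.46
 | | | 85.92 | 91.53 | 88.63 | 86.30 | 93.61 | 89.81
3 | (690/1 955) | Peace of mind (安心) | 75.68 | 57.14 | 65.12 | 82.50 | 67.35 | 74.16
 | | Happiness (快乐) | 79.00 | 89.77 | 84.04 | 83.51 | 92.05 | 87.57
 | (3 972/11 074) | Fondness (喜爱) | 69.23 | 31.03 | 42.86 | 63.16 | 41.38 | 50.00
 | | Trust (相信) | 75.00 | 9.68 | 17.14 | 40.00 | 6.45 | 11.11
 | | Praise (赞扬) | 86.34 | 98.94 | 92.21 | 87.03 | 97.87 | 92.13
 | | Wishing (祝愿) | 25.00 | 14.29 | 18.18 | 100.00 | 14.29 | 25.00
 | | Respect (尊敬) | 100.00 | 8.11 | 15.00 | 71.43 | 13.51 | 22.73
 | Surprise (惊) (93/228) | Surprise (惊奇) | 82.35 | 77.78 | 80.00 | 78.95 | 83.33 | 81.08
 | (960/2 307) | Sadness (悲伤) | 77.86 | 90.83 | 83.85 | 80.45 | 89.17 | 84.58
 | | Guilt (内疚) | 75.00 | 37.50 | 50.00 | 80.00 | 50.00 | 61.54
 | | Disappointment (失望) | 64.29 | 42.86 | 51.43 | 63.89 | 54.76 | 58.97
 | | Missing (思念) | 89.47 | 80.95 | 85.00 | 94.12 | 76.19 | 84.21
 | (485/1 177) | Panic (慌乱) | 71.88 | 69.70 | 70.77 | 80.65 | 75.76 | 78.12
 | | Fear (恐惧) | 81.67 | 84.48 | 83.05 | 86.67 | 89.66 | 88.14
 | | Shame (羞愧) | 100.00 | 80.00 | 88.89 | 100.00 | 100.00 | 100.00
 | Anger (怒) (121/387) | Anger (愤怒) | 77.78 | 29.17 | 42.42 | 84.62 | 45.83 | 59.46
 | (3 602/10 249) | Reproach (贬责) | 81.34 | 96.23 | 88.16 | 83.60 | 94.25 | 88.61
 | | Vexation (烦闷) | 66.67 | 53.57 | 59.41 | 75.76 | 44.64 | 56.18
 | | Doubt (怀疑) | 50.00 | 14.29 | 22.22 | 40.00 | 28.57 | 33.33
 | | Jealousy (妒忌) | 0.00 | 0.00 | 0.00 | 100.00 | 40.00 | 57.14
 | | Hatred (憎恶) | 62.50 | 10.31 | 17.70 | 50.00 | 27.84 | 35.76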
Specific Performance of Transfer Learning-Based Hierarchical Sentiment Classification of Idioms
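Hierarchical classification of this kind can be sketched as top-down routing: a classifier at each node picks one child, and the text is passed to that child's classifier until a leaf is reached. The sketch below is a minimal illustration under stated assumptions; the keyword-based stub classifiers, the node names, and the small two-branch taxonomy are hypothetical stand-ins for the trained models and the full category tree.

```python
def classify_hierarchically(text, classifiers, root="ROOT"):
    """Route a text from the root down to a leaf, returning the label path."""
    path = []
    node = root
    while node in classifiers:  # a node without a classifier is a leaf
        node = classifiers[node](text)
        path.append(node)
    return path


# Hypothetical stub classifiers keyed by node name. A real system would call
# a trained model at each node instead of these keyword rules.
classifiers = {
    "ROOT": lambda text: "negative" if "怒" in text else "positive",
    "negative": lambda text: "anger",
    "positive": lambda text: "surprise",
    "anger": lambda text: "NA",
    "surprise": lambda text: "PC",
}

anger_path = classify_hierarchically("怒发冲冠", classifiers)
surprise_path = classify_hierarchically("惊天动地", classifiers)
```

With this routing, an error at an upper level cannot be corrected further down, so lower-level scores depend on the quality of the levels above them.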
Prediction Results of Sentiment Metaphor of Unlabeled Idioms
A Knowledge Graph of Chinese Idioms with Internal and External Features
Detecting External Things by Internal Sentiments (Query Keyword “Miss”)
Detecting Internal Sentiments by External Things (Query Keyword “Cloud”)
Humanistic Knowledge Service by Internal and External Knowledge (Keyword “Cloud” and “Miss”)
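The three query patterns named in these captions can be sketched over a small set of triples. The sentiment labels below follow the category examples listed earlier, while the external-thing annotations, relation names, and function names are hypothetical and chosen only to make the example runnable.

```python
from collections import defaultdict

# Toy triples: (idiom, relation, value). External things are hypothetical.
triples = [
    ("白云亲舍", "has_external_thing", "cloud"),
    ("白云亲舍", "has_sentiment", "miss"),
    ("望穿秋水", "has_external_thing", "water"),
    ("望穿秋水", "has_sentiment", "miss"),
    ("清风朗月", "has_external_thing", "wind"),
    ("清风朗月", "has_sentiment", "peace of mind"),
]

things = defaultdict(set)      # idiom -> external things
sentiments = defaultdict(set)  # idiom -> internal sentiments
for idiom, relation, value in triples:
    if relation == "has_external_thing":
        things[idiom].add(value)
    elif relation == "has_sentiment":
        sentiments[idiom].add(value)


def things_by_sentiment(sentiment):
    """External things used by idioms that express the given sentiment."""
    return {t for idiom, s in sentiments.items() if sentiment in s
            for t in things.get(idiom, set())}


def sentiments_by_thing(thing):
    """Internal sentiments expressed by idioms that mention the given thing."""
    return {s for idiom, t in things.items() if thing in t
            for s in sentiments.get(idiom, set())}


def idioms_by_both(thing, sentiment):
    """Idioms that link the given external thing to the given sentiment."""
    return sorted(idiom for idiom in things
                  if thing in things[idiom] and sentiment in sentiments.get(idiom, set()))
```

For example, `things_by_sentiment("miss")` collects the external things attached to idioms of missing, and `idioms_by_both("cloud", "miss")` returns the idioms that connect the two.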
