Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (10): 77-83     https://doi.org/10.11925/infotech.2096-3467.2018.0114
Research Paper
Recognizing Metaphor with Convolution Neural Network and SVM*
Huang Xiaoxi, Li Hanyu, Wang Rongbo, Wang Xiaohua, Chen Zhiqun
Institute of Cognitive and Intelligent Computing, Hangzhou Dianzi University, Hangzhou 310018, China

Abstract

[Objective] This paper proposes a metaphor recognition method based on a convolutional neural network (CNN) and an SVM classifier, evaluated on Chinese and English metaphor datasets. [Methods] We vectorized the experimental data and fed it, together with part-of-speech and keyword features, into a CNN. Features were extracted by the convolutional and pooling layers and then classified with an SVM. To address the incomplete feature sampling of the pooling layer, we combined Max-Pooling with Mean-Pooling. [Results] Compared with using the CNN directly, the proposed method improved metaphor recognition accuracy on the English verb-object corpus, the English adjective-noun phrase corpus, and the Chinese metaphor corpus by 4.12%, 0.84%, and 4.50%, respectively. [Limitations] Inaccurate Chinese word segmentation affects the training of the word vector model, and the CNN has too few layers, which limits the completeness of the extracted features. [Conclusions] The proposed method achieved good results on both the Chinese and English datasets.

Keywords: Metaphor Recognition; Convolution Neural Network; Support Vector Machines; Feature Extraction
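
The combined pooling step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the max-pooled and mean-pooled values of each filter are simply concatenated; the abstract does not state how the two are combined, and the names `max_pool`, `mean_pool` and `combined_pool` as well as the toy feature maps are hypothetical.

```python
from typing import List

# Activations of one convolution filter over all window positions.
FeatureMap = List[float]


def max_pool(feature_map: FeatureMap) -> float:
    """Keep only the strongest activation of a filter."""
    return max(feature_map)


def mean_pool(feature_map: FeatureMap) -> float:
    """Average all activations of a filter."""
    return sum(feature_map) / len(feature_map)


def combined_pool(feature_maps: List[FeatureMap]) -> List[float]:
    """Concatenate the max-pooled and mean-pooled value of every filter.

    Assumption: concatenation is used to combine the two pooling results;
    the source only says that Max-Pooling and Mean-Pooling are combined.
    """
    maxima = [max_pool(fm) for fm in feature_maps]
    means = [mean_pool(fm) for fm in feature_maps]
    return maxima + means


# Two hypothetical filters, each evaluated at four window positions.
feature_maps = [[0.5, 2.0, -1.0, 0.5], [0.0, 0.25, 0.75, 1.0]]
pooled = combined_pool(feature_maps)
print(pooled)  # [2.0, 1.0, 0.5, 0.5]
```

In a pipeline like the one the abstract describes, a vector such as `pooled` would be the feature representation handed to the SVM classifier. Max pooling alone would discard everything except the peak of each filter, which is one way to read the "incomplete feature sampling" the authors mention.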
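Received: 2018-01-29      Published: 2018-11-12
CLC number: TP391
Funding: *This work is one of the outcomes of the Ministry of Education Humanities and Social Sciences Research Planning Fund project "Research on Chinese Metaphor Computation Integrating Deep Neural Network Models" (Grant No. 18YJA740016) and the Ministry of Education Humanities and Social Sciences Research Youth Fund project "Research on a Chinese Chunk Segmentation Model Based on Semantic Relatedness" (Grant No. 12YJCZH201).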
Cite this article:
Huang Xiaoxi, Li Hanyu, Wang Rongbo, Wang Xiaohua, Chen Zhiqun. Recognizing Metaphor with Convolution Neural Network and SVM[J]. Data Analysis and Knowledge Discovery, 2018, 2(10): 77-83.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2018.0114      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2018/V2/I10/77
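  Figure: Overall framework of metaphor recognition based on SVM and CNN
  Figure: Structure of the convolutional neural network
  Figure: Comparison of Tanh and ReLU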
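
The figure comparing the two activation functions is not reproduced on this page. As a general reminder of how Tanh and ReLU differ, and not as a reconstruction of the authors' figure, the sketch below evaluates both functions and their derivatives at a few arbitrarily chosen points. The helper names are illustrative.

```python
import math


def relu(x: float) -> float:
    """ReLU: passes positive inputs through, clips negatives to zero."""
    return max(0.0, x)


def relu_grad(x: float) -> float:
    """Derivative of ReLU (taken as 0 at x == 0 by convention here)."""
    return 1.0 if x > 0 else 0.0


def tanh_grad(x: float) -> float:
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2


# Arbitrary sample points chosen for illustration.
for x in (-4.0, -1.0, 0.5, 4.0):
    print(
        f"x={x:+.1f}  tanh={math.tanh(x):+.4f}  tanh'={tanh_grad(x):.4f}  "
        f"relu={relu(x):.1f}  relu'={relu_grad(x):.1f}"
    )
```

For large positive inputs the Tanh derivative approaches zero while the ReLU derivative stays at 1, which is the usual argument for ReLU in deeper networks.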
Verb-noun pair     Class          Relation
See development    Metaphorical   VO
Live dream         Metaphorical   VO
Envy eat           Metaphorical   SV
Break window       Literal        VO
Boy cry            Literal        SV
Paint dry          Literal        SV
  Table: Subject-verb (SV) or verb-object (VO) relations of verb metaphors in TSV
Metaphorical        Literal
bright smile        blue fence
bushy eyebrows      blinding light
cautious smile      biting dog
dark history        bright sun
deep faith          bright light
desolate beauty     burning tree
economic battle     burning arm
fading memory       dark face
faint impression    dirty hands
  Table: Adjective-noun phrases in TSV-TRAIN
Experiment               Accuracy
CNN-sentence-eng         81.80%
CNN_SVM-sentence-eng     87.23%
CNN-Word/Pos-eng         86.00%
CNN_SVM-Word/Pos-eng     90.12%
CNN-AN-eng               86.36%
CNN_SVM-AN-eng           87.20%
Rei et al.[13]           83.00%
  Table: Metaphor recognition on the English corpora
Experiment               Accuracy
CNN-sentence-ch          72.5%
CNN_SVM-sentence-ch      77.00%
  Table: Metaphor recognition on the Chinese corpus
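
The improvements quoted in the abstract can be checked against the two result tables by simple subtraction. The sketch below copies the accuracies from the tables; the dictionary keys and the `gain` helper are illustrative names, not identifiers from the paper.

```python
# Accuracies in percent, copied from the English and Chinese result tables.
# Each entry is (plain CNN, CNN features + SVM).
accuracy = {
    "sentence-eng": (81.80, 87.23),
    "Word/Pos-eng": (86.00, 90.12),
    "AN-eng": (86.36, 87.20),
    "sentence-ch": (72.50, 77.00),
}


def gain(cnn: float, cnn_svm: float) -> float:
    """Absolute accuracy difference in percentage points, rounded to 2 places."""
    return round(cnn_svm - cnn, 2)


gains = {name: gain(*pair) for name, pair in accuracy.items()}
for name, points in gains.items():
    print(f"{name}: +{points:.2f} percentage points")
```

The differences for Word/Pos-eng, AN-eng and sentence-ch come out to 4.12, 0.84 and 4.50, matching the figures in the abstract, so those figures are absolute differences in percentage points rather than relative changes. The sentence-eng setting shows a gain of 5.43 points.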
References
[1] Wilks Y. A Preferential, Pattern-seeking, Semantics for Natural Language Inference[A]// Words and Intelligence I. Text, Speech and Language Technology[M]. Springer, 2007.
[2] Fass D. Met*: A Method for Discriminating Metonymy and Metaphor by Computer[J]. Computational Linguistics, 1991, 17(1): 49-90.
[3] Neuman Y, Assaf D, Cohen Y, et al. Metaphor Identification in Large Texts Corpora[J]. PLoS One, 2013, 8(4): e62343.
doi: 10.1371/journal.pone.0062343 pmid: 3639214
[4] Shutova E, Sun L, Korhonen A. Metaphor Identification Using Verb and Noun Clustering[C]// Proceedings of the 23rd International Conference on Computational Linguistics. 2010.
[5] Hovy D, Srivastava S, Kumar S, et al. Identifying Metaphorical Word Use with Tree Kernels[C]// Proceedings of the 1st Workshop on Metaphor in NLP. 2013.
[6] Rai S, Chakraverty S, Tayal D K. Supervised Metaphor Detection Using Conditional Random Fields[C]// Proceedings of the 4th Workshop on Metaphor in NLP. 2016.
[7] Tsvetkov Y, Boytsov L, Gershman A, et al. Metaphor Detection with Cross-Lingual Model Transfer[C]// Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. 2014.
[8] Kalchbrenner N, Grefenstette E, Blunsom P. A Convolutional Neural Network for Modelling Sentences[OL]. arXiv Preprint, arXiv: 1404.2188.
[9] Graves A, Fernández S, Schmidhuber J. Bidirectional LSTM Networks for Improved Phoneme Classification and Recognition[A]// Artificial Neural Networks: Formal Models and Their Applications[M]. Springer, 2005.
[10] Graves A, Schmidhuber J. Framewise Phoneme Classification with Bidirectional LSTM and Other Neural Network Architectures[J]. Neural Networks, 2005, 18: 602-610.
doi: 10.1016/j.neunet.2005.06.042 pmid: 16112549
[11] Dinh E L D, Gurevych I. Token-Level Metaphor Detection Using Neural Networks[C]// Proceedings of the 4th Workshop on Metaphor in NLP. 2016.
[12] Bizzoni Y, Chatzikyriakidis S, Ghanimifard M. “Deep” Learning: Detecting Metaphoricity in Adjective-Noun Pairs[C]// Proceedings of the Workshop on Stylistic Variation. 2017: 43-52.
[13] Rei M, Bulat L, Kiela D, et al. Grasping the Finer Point: A Supervised Similarity Network for Metaphor Detection[C]// Proceedings of EMNLP. 2017: 1537-1546.
[14] Wang Zhimin, Wang Houfeng, Yu Shiwen. Chinese Nominal Metaphor Recognition Based on Machine Learning[J]. Chinese High Technology Letters, 2006, 17(6): 575-580. (in Chinese)
doi: 10.3321/j.issn:1002-0470.2007.06.005
[15] Xu Yang. Recognition of the Chinese Metaphor Phenomena Based on the Maximum Entropy Model[J]. Computer Engineering and Science, 2007, 29(4): 95-103. (in Chinese)
[16] Li Bin, Yu Lili, Shi Min, et al. Computation of Chinese Simile with “Xiang”[J]. Journal of Chinese Information Processing, 2008, 22(6): 27-32. (in Chinese)
[17] Huang Xiaoxi. Research on Some Key Issues of Metaphor Computation[D]. Hangzhou: Zhejiang University, 2009. (in Chinese)
[18] Kim Y. Convolutional Neural Networks for Sentence Classification[OL]. arXiv Preprint, arXiv: 1408.5882.
[19] Mikolov T, Chen K, Corrado G, et al. Efficient Estimation of Word Representations in Vector Space[OL]. arXiv Preprint, arXiv: 1301.3781.
[20] Mikolov T, Sutskever I, Chen K, et al. Distributed Representations of Words and Phrases and Their Compositionality[A]// Advances in Neural Information Processing Systems[M]. Springer, 2013.
[21] LeCun Y, Bottou L, Bengio Y, et al. Gradient-based Learning Applied to Document Recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
doi: 10.1109/5.726791
[22] Li Hang. Statistical Learning Methods[M]. 1st Edition. Beijing: Tsinghua University Press, 2016: 95-123. (in Chinese)