Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (5): 44-53     https://doi.org/10.11925/infotech.2096-3467.2021.0857
Research Article
Question Generation Based on Sememe Knowledge and Bidirectional Attention Flow
Duan Jianyong1,2(),Xu Lishan1,Liu Jie1,2,Li Xin1,2,Zhang Jiaming1,Wang Hao1,2
1School of Information, North China University of Technology, Beijing 100144, China
2CNONIX National Standard Application and Promotion Laboratory, Beijing 100144, China

Abstract

[Objective] This paper proposes a question generation model based on sememe knowledge and bidirectional attention flow, addressing the tendency of existing models to generate questions that semantically deviate from the given context and answer. [Methods] We developed two semantic enhancement strategies. (I) By integrating external sememe knowledge into the embedding layer, we captured semantic knowledge at a finer granularity than word vectors, thereby enhancing the semantic features of the text itself. In addition, we used a cosine similarity algorithm to build an expanded sememe knowledge base that better fits the semantics of the context: it filters out sememes in the original knowledge base that could introduce semantic noise, and it recommends semantically appropriate sememe sets for words in the vocabulary that lack sememe annotations. (II) We enhanced the semantic representation between the text and the answer by incorporating a bidirectional attention flow after the encoding layer. [Results] On the SQuAD1.1 dataset, the model reached Bleu_1, Bleu_2, Bleu_3, and Bleu_4 scores of 46.70%, 31.07%, 22.90%, and 17.48%, respectively, outperforming the baseline models. [Limitations] With the bidirectional attention flow, the model must extract features from the paragraph text and the question separately, which multiplies the memory and time required for training. [Conclusions] The two semantic enhancement strategies, sememe knowledge and bidirectional attention flow, improve the question generation model and let it generate higher-quality questions that better match human language habits.
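The cosine-similarity step in Methods (I) can be illustrated with a minimal sketch (not the authors' released code): for a word lacking a sememe annotation, rank candidate sememes by the cosine similarity between the word's embedding and each sememe's embedding, and keep the top-k as its recommended set. Setting k = 2 follows Table 4, where two sememes per set scored best; all other names below are hypothetical.

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two dense vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

def recommend_sememes(word_vec: np.ndarray, sememe_vecs: dict, k: int = 2) -> list:
    """Return the k sememes whose embeddings are most similar to word_vec.

    sememe_vecs maps a sememe name (e.g. from HowNet) to its embedding;
    k=2 mirrors the best-performing set size in Table 4.
    """
    scored = sorted(sememe_vecs.items(),
                    key=lambda kv: cosine(word_vec, kv[1]),
                    reverse=True)
    return [name for name, _ in scored[:k]]

# Illustrative usage with random vectors; real inputs would be trained
# word/sememe embeddings (e.g. GloVe vectors[9] and sememe embeddings[10]).
rng = np.random.default_rng(0)
word_vec = rng.normal(size=50)
sememe_vecs = {name: rng.normal(size=50)
               for name in ("human|人", "occupation|职位", "religion|宗教")}
print(recommend_sememes(word_vec, sememe_vecs, k=2))
```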

Key words: Question Generation; Sememe Knowledge; Cosine Similarity; Bidirectional Attention Flow
Received: 2021-08-19      Published: 2022-03-01
CLC Number: TP391
Funding: *This work was supported by the National Natural Science Foundation of China (61972003) and the Humanities and Social Sciences Foundation of the Ministry of Education (21YJA740052).
Corresponding author: Duan Jianyong, ORCID: 0000-0002-2244-3764, E-mail: duanjy@ncut.edu.cn
Cite this article:
Duan Jianyong, Xu Lishan, Liu Jie, Li Xin, Zhang Jiaming, Wang Hao. Question Generation Based on Sememe Knowledge and Bidirectional Attention Flow. Data Analysis and Knowledge Discovery, 2022, 6(5): 44-53.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2021.0857      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2022/V6/I5/44
Fig.1  The Encoder-Decoder model[2]
Fig.2  The question generation model based on sememe knowledge and bidirectional attention flow
Fig.3  Example of the sememe tree for the word "cardinal" in HowNet
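The model in Fig.2 incorporates the bidirectional attention flow of Seo et al.[12] after the encoding layer. The sketch below shows the standard BiDAF computation under that paper's formulation; the tensor names and the trainable weight follow the BiDAF paper, not the authors' implementation. A similarity matrix links the two sequences, and attention flows in both directions to produce a query-aware context representation.

```python
import torch
import torch.nn.functional as F

def bidaf(H: torch.Tensor, U: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    """H: context encodings (T, d); U: query/answer encodings (J, d);
    w: trainable weight (3d,) scoring the fused vector [h; u; h * u]."""
    T, d = H.shape
    J, _ = U.shape
    # Similarity matrix S[t, j] = w^T [h_t; u_j; h_t * u_j]
    Hx = H.unsqueeze(1).expand(T, J, d)
    Ux = U.unsqueeze(0).expand(T, J, d)
    S = torch.cat([Hx, Ux, Hx * Ux], dim=-1) @ w             # (T, J)
    # Context-to-query attention: which query words matter to each context word
    U_tilde = F.softmax(S, dim=1) @ U                         # (T, d)
    # Query-to-context attention: which context words matter to the query
    b = F.softmax(S.max(dim=1).values, dim=0)                 # (T,)
    h_tilde = (b.unsqueeze(1) * H).sum(dim=0, keepdim=True).expand(T, d)
    # Query-aware representation G = [H; U~; H * U~; H * h~]
    return torch.cat([H, U_tilde, H * U_tilde, H * h_tilde], dim=-1)  # (T, 4d)

# Example shapes: a 30-token paragraph attended against an 8-token answer
H, U = torch.randn(30, 128), torch.randn(8, 128)
w = torch.randn(3 * 128, requires_grad=True)
print(bidaf(H, U, w).shape)  # torch.Size([30, 512])
```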
Hardware	Specification
Processor	Intel(R)
Operating system	Linux
CPU cores	8
GPUs	4
GPU model	GeForce GTX 1080 Ti
GPU memory	12 GB
Table 1  Hardware configuration for the experiments
Parameter	Value
Epoch	50
Max_len	400
Learning rate	0.2
Batch_size	18
Dropout	0.3
Beam_size	10
Table 2  Optimal hyperparameter settings for the experiments
Model	Bleu_1	Bleu_2	Bleu_3	Bleu_4
Seq2Seq 31.34 13.79 7.36 4.26
NQG++ 43.09 25.96 17.50 12.28
Ass2s - - - 16.20
s2s-map-gsa 45.69 30.25 22.16 16.85
Our model 46.70 31.07 22.90 17.48
Table 3  Experimental results of the question generation model based on sememe knowledge and bidirectional attention flow (%)
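The Bleu_1 through Bleu_4 columns in Tables 3-6 are cumulative n-gram BLEU scores. A minimal sketch of how such scores can be computed with NLTK follows; this assumes standard corpus-level BLEU with uniform n-gram weights, which may differ in detail from the evaluation script the authors actually used.

```python
from nltk.translate.bleu_score import corpus_bleu

def bleu_scores(references, hypotheses):
    """references: one list of reference token lists per example;
    hypotheses: one generated-question token list per example."""
    scores = {}
    for n in range(1, 5):
        # Cumulative BLEU-n: uniform weights over 1..n-grams
        weights = tuple(1.0 / n for _ in range(n)) + (0.0,) * (4 - n)
        scores[f"Bleu_{n}"] = 100 * corpus_bleu(references, hypotheses,
                                                weights=weights)
    return scores

# Toy example: one gold question vs. one generated question.
# (Real scores average over the whole test set, as in Table 3; tiny inputs
# like this can trigger NLTK's zero-count warning for Bleu_4.)
refs = [[["what", "is", "sememe", "knowledge", "?"]]]
hyps = [["what", "is", "a", "sememe", "?"]]
print(bleu_scores(refs, hyps))
```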
Number of sememes per set	Bleu_1	Bleu_2	Bleu_3	Bleu_4
1 46.21 30.60 22.40 17.08
2 46.70 31.07 22.90 17.48
3 46.64 30.89 22.71 17.38
4 46.37 30.65 22.47 17.05
Table 4  Effect of the number of sememes per set on model performance (%)
Sememe scope	Bleu_1	Bleu_2	Bleu_3	Bleu_4
Text only	46.62	30.96	22.75	17.33
Answer only	46.42	30.77	22.53	17.31
Text and answer	46.70	31.07	22.90	17.48
Table 5  Effect of sememe scope on the model (%)
Model	Bleu_1	Bleu_2	Bleu_3	Bleu_4
s2s-map-gsa 45.69 30.25 22.16 16.85
+ prediction_sememes 46.17 30.64 22.53 17.20
+ FClayer 46.11 30.41 22.26 16.97
Our Model 46.70 31.07 22.90 17.48
Table 6  Individual effects of sememe knowledge and the bidirectional attention flow mechanism on the model (%)
Fig.4  Sample analysis of questions generated by each baseline model
[1] Wu Yunfang, Zhang Yangsen. A Survey of Question Generation[J]. Journal of Chinese Information Processing, 2021, 35(7): 1-9.
[2] Sutskever I, Vinyals O, Le Q V. Sequence to Sequence Learning with Neural Networks[OL]. arXiv Preprint, arXiv: 1409.3215.
[3] Heilman M, Smith N A. Good Question! Statistical Ranking for Question Generation[C]// Proceedings of the 2010 North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference. 2010: 1-9.
[4] Mannem P, Prasad R, Joshi A. Question Generation from Paragraphs at UPenn: QGSTEC System Description[C]// Proceedings of the 3rd Workshop on Question Generation. 2010: 84-91.
[5] Du X Y, Shao J R, Cardie C. Learning to Ask: Neural Question Generation for Reading Comprehension[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA, USA: Association for Computational Linguistics, 2017: 1342-1352.
[6] Zhou Q, Yang N, Wei F, et al. Neural Question Generation from Text: A Preliminary Study[C]// Proceedings of the 2017 CCF International Conference on Natural Language Processing and Chinese Computing. 2017: 662-671.
[7] Ma X Y, Zhu Q L, Zhou Y L, et al. Improving Question Generation with Sentence-Level Semantic Matching and Answer Position Inferring[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(5): 8464-8471.
[8] Tan Hongye, Sun Xiuqin, Yan Zhen. Question Generation Model Based on the Answer and Its Contexts[J]. Journal of Chinese Information Processing, 2020, 34(5): 74-81.
[9] Pennington J, Socher R, Manning C. GloVe: Global Vectors for Word Representation[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA, USA: Association for Computational Linguistics, 2014: 1532-1543.
[10] Niu Y L, Xie R B, Liu Z Y, et al. Improved Word Representation Learning with Sememes[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA, USA: Association for Computational Linguistics, 2017: 2049-2058.
[11] Yan Qiang, Zhang Xiaoyan, Zhou Simin. Extracting Keywords Based on Sememe Similarity[J]. Data Analysis and Knowledge Discovery, 2021, 5(4): 80-89.
[12] Seo M, Kembhavi A, Farhadi A, et al. Bidirectional Attention Flow for Machine Comprehension[OL]. arXiv Preprint, arXiv: 1611.01603.
[13] Vaswani A, Shazeer N, Parmar N, et al. Attention is All You Need[OL]. arXiv Preprint, arXiv: 1706.03762.
[14] Luong T, Pham H, Manning C D. Effective Approaches to Attention-Based Neural Machine Translation[C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA, USA: Association for Computational Linguistics, 2015: 1412-1421.
[15] Zhao Y, Ni X C, Ding Y Y, et al. Paragraph-Level Neural Question Generation with Maxout Pointer and Gated Self-Attention Networks[C]// Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA, USA: Association for Computational Linguistics, 2018: 3901-3910.
[16] Gu J T, Lu Z D, Li H, et al. Incorporating Copying Mechanism in Sequence-to-Sequence Learning[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA, USA: Association for Computational Linguistics, 2016: 1631-1640.
[17] See A, Liu P J, Manning C D. Get to the Point: Summarization with Pointer-Generator Networks[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA, USA: Association for Computational Linguistics, 2017: 1073-1087.
[18] Gulcehre C, Ahn S, Nallapati R, et al. Pointing the Unknown Words[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA, USA: Association for Computational Linguistics, 2016: 140-149.
[19] Kim Y, Lee H, Shin J, et al. Improving Neural Question Generation Using Answer Separation[OL]. arXiv Preprint, arXiv: 1809.02393.
[20] Wang W H, Yang N, Wei F R, et al. Gated Self-Matching Networks for Reading Comprehension and Question Answering[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA, USA: Association for Computational Linguistics, 2017: 189-198.