Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (12): 23-32     https://doi.org/10.11925/infotech.2096-3467.2018.0583
Research Paper
Comprehending Texts and Answering Questions Based on Hierarchical Interactive Network*
Cheng Yong¹, Xu Dekuan¹, Lv Xueqiang²
¹ School of Chinese Language and Literature, Ludong University, Yantai 264025, China
² School of Computer Science, Beijing University of Information Technology, Beijing 100192, China
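Abstract (translated from the Chinese original)

[Objective] To achieve accurate question answering based on text reading comprehension. [Methods] We propose a neural network model based on a hierarchical interaction mechanism. Drawing on how humans think during reading comprehension, the model builds characteristics of human reading, including hierarchical processing, content filtering, and multi-dimensional attention, into the network design to improve the machine's ability to analyze and understand text. [Results] We validated the model on the data released for the Chinese Machine Reading Comprehension evaluation CMRC 2017. Accuracy on the test set reached 0.78, outperforming current mainstream models and the best result published in the evaluation. [Limitations] Candidate answers were not further optimized or ranked, and performance still falls short of human reading comprehension. [Conclusions] The hierarchical interactive network markedly improves automatic text analysis and understanding, enabling machines to answer questions on the basis of understanding the text.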

Keywords: Hierarchical Interactive Network; Machine Reading Comprehension; Automatic Question Answering
Abstract

[Objective] This paper aims to help computers answer questions accurately based on text comprehension. [Methods] First, we proposed a neural network model based on a hierarchical interaction mechanism. We built several mechanisms of human thinking into this model: hierarchical processing, content filtering, and multi-dimensional attention. Then, we ran the proposed model on the dataset from Chinese Machine Reading Comprehension (CMRC) 2017. [Results] The accuracy of the proposed method on the test set was 0.78, which was better than the best result of other published models. [Limitations] The candidate answers were not further optimized. [Conclusions] The proposed hierarchical interactive network improves the machine's ability to answer questions based on text comprehension.

Key words: Hierarchical Interactive Network; Machine Comprehension; Automatic Question Answering
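Received: 2018-05-24      Published: 2019-01-16
Chinese Library Classification (ZTFLH):  G353
Funding: *This work is supported by the General Program of the National Natural Science Foundation of China, "Research on Automatic Detection of Chinese Patent Infringement" (Grant No. 61671070), and by the Key Project of the State Language Commission, "Research and Application of Key Technologies for Intelligent Chinese Writing" (Grant No. ZDI135-53).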
Cite this article:
Cheng Yong, Xu Dekuan, Lv Xueqiang. Comprehending Texts and Answering Questions Based on Hierarchical Interactive Network[J]. Data Analysis and Knowledge Discovery, 2018, 2(12): 23-32.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2018.0583      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2018/V2/I12/23
Figure: Core framework of the proposed network
Figure: Attention-based hierarchical interactive network
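As background for the attention mechanism named in this caption, the following is a minimal sketch of attention-sum answer selection in the style of the AS Reader baseline [6]. It is a generic illustration, not the authors' hierarchical model: the function names, the dot-product scoring, and the toy vectors are all assumptions made for the example.

```python
import math
from collections import defaultdict


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_sum(doc_tokens, doc_vectors, query_vector):
    """Score each distinct document word by summing attention over its occurrences."""
    # Dot-product relevance of every document position to the query.
    scores = [sum(d * q for d, q in zip(vec, query_vector)) for vec in doc_vectors]
    weights = softmax(scores)
    totals = defaultdict(float)
    for token, weight in zip(doc_tokens, weights):
        totals[token] += weight
    return dict(totals)


# Toy example with hand-written 2-d vectors (illustrative values only).
tokens = ["cat", "sat", "mat", "cat"]
vectors = [[1.0, 0.0], [0.0, 1.0], [0.2, 0.2], [0.9, 0.1]]
query = [1.0, 0.0]
word_scores = attention_sum(tokens, vectors, query)
answer = max(word_scores, key=word_scores.get)
print(answer)
```

A word that occurs several times in the document accumulates the attention of all its positions, which is why this scheme suits cloze-style tasks where the answer is a word taken from the document.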
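| Item | Value |
|---|---|
| CPU | Intel Xeon, 8 cores |
| Memory | DDR4, 64 GB |
| GPU | Titan Xp, 12 GB |
| Average time per training epoch | 2 h 2 min |
| Total training time | 30 h 30 min |

Table: Model training configuration and time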
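The two timing figures are mutually consistent if the total training time is simply the number of epochs multiplied by the average epoch time. The short check below makes that arithmetic explicit. The epoch count it derives is an inference from the table, not a number stated in the source.

```python
from datetime import timedelta

# Values copied from the training configuration table.
avg_epoch_time = timedelta(hours=2, minutes=2)
total_training_time = timedelta(hours=30, minutes=30)

# Assumption: total time = number of epochs * average epoch time.
implied_epochs = total_training_time / avg_epoch_time
print(f"Implied number of epochs: {implied_epochs:g}")
```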
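| Hyperparameter | Performance comparison | | | | |
|---|---|---|---|---|---|
| Word embedding dimension | Dimension | 64 | 128 | 192 | 256 |
| | Accuracy | 0.759 | 0.762 | 0.757 | 0.756 |
| State vector dimension | Dimension | 64 | 128 | 192 | 256 |
| | Accuracy | 0.741 | 0.762 | 0.761 | 0.762 |
| Neuron keep rate (dropout keep probability) | Keep rate | 0.5 | 0.6 | 0.7 | 0.8 |
| | Accuracy | 0.751 | 0.762 | 0.757 | 0.755 |

Table: Effect of different hyperparameters on performance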
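The best setting per hyperparameter can be read off the table mechanically, as in the sketch below. The dictionary keys are hypothetical labels introduced for this example, and breaking ties in favour of the smallest value is an assumption (a smaller dimension is cheaper to train), not a rule given in the source.

```python
# Accuracies copied from the hyperparameter table.
results = {
    "word_embedding_dim": {64: 0.759, 128: 0.762, 192: 0.757, 256: 0.756},
    "state_vector_dim": {64: 0.741, 128: 0.762, 192: 0.761, 256: 0.762},
    "keep_rate": {0.5: 0.751, 0.6: 0.762, 0.7: 0.757, 0.8: 0.755},
}


def best_settings(table):
    """Return, per hyperparameter, the smallest value that reaches the top accuracy."""
    best = {}
    for name, scores in table.items():
        top = max(scores.values())
        best[name] = min(value for value, acc in scores.items() if acc == top)
    return best


selected = best_settings(results)
print(selected)
```

Every selected setting reaches the same accuracy of 0.762, and the state vector dimension shows a tie between 128 and 256.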
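| Network structure | Accuracy (validation set) | Accuracy (test set) |
|---|---|---|
| All information | 0.762 | 0.775 |
| -query_attention | 0.749 | 0.759 |
| -doc_attention | 0.754 | 0.766 |
| -add | 0.759 | 0.770 |
| -multi | 0.759 | 0.769 |

Table: Effect of different network structures on performance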
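The contribution of each component is easier to compare as a drop relative to the full model. The sketch below computes those differences from the table values. The variable and function names are illustrative only.

```python
# Accuracies copied from the ablation table: (validation, test).
full = (0.762, 0.775)
ablations = {
    "-query_attention": (0.749, 0.759),
    "-doc_attention": (0.754, 0.766),
    "-add": (0.759, 0.770),
    "-multi": (0.759, 0.769),
}


def accuracy_drops(full_scores, ablated):
    """Drop in accuracy (full minus ablated), rounded to three decimals."""
    return {
        name: tuple(round(f - a, 3) for f, a in zip(full_scores, scores))
        for name, scores in ablated.items()
    }


drops = accuracy_drops(full, ablations)
# List the variants from largest to smallest test-set drop.
for name, (val_drop, test_drop) in sorted(drops.items(), key=lambda kv: -kv[1][1]):
    print(f"{name:18s} validation -{val_drop:.3f}  test -{test_drop:.3f}")
```

On these numbers, removing the query attention costs the most on both sets (0.013 on validation, 0.016 on test), followed by removing the document attention.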
Figure: Training process and performance change of the network
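| Group | Method | Accuracy (validation set) | Accuracy (test set) |
|---|---|---|---|
| Baseline models | Random Guess | 0.017 | 0.017 |
| | Top Frequency | 0.107 | 0.087 |
| | AS Reader[6] | 0.698 | 0.713 |
| | GA Reader[8] | 0.748 | 0.751 |
| CMRC evaluation entries | Top 1 | 0.761 | 0.777 |
| | Top 2 | 0.772 | 0.775 |
| | Top 3 | 0.779 | 0.774 |
| Our Model | | 0.763 | 0.780 |

Table: Comparison with existing methods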
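To see how large the reported advantage is, the sketch below computes the test-set margin of the proposed model over each compared method, using only the numbers in the table. The dictionary labels are shortened names chosen for this example.

```python
# Test-set accuracies copied from the comparison table.
test_accuracy = {
    "Random Guess": 0.017,
    "Top Frequency": 0.087,
    "AS Reader": 0.713,
    "GA Reader": 0.751,
    "CMRC Top 1": 0.777,
    "CMRC Top 2": 0.775,
    "CMRC Top 3": 0.774,
}
ours = 0.780

# Margin of the proposed model over each method, rounded to three decimals.
margins = {name: round(ours - acc, 3) for name, acc in test_accuracy.items()}
closest = min(margins, key=margins.get)
print(f"Smallest margin: {margins[closest]:.3f} over {closest}")
```

The test-set lead over the best evaluation entry is 0.003. On the validation set the ordering differs: the Top 2 and Top 3 entries score 0.772 and 0.779, above the proposed model's 0.763.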
References
[1] Guo Limin. Study of Automatic Classification of Literature Based on Convolution Neural Network[J]. Library and Information, 2017(6): 96-103.
doi: 10.11968/tsyqb.1003-6938.2017119
[2] Li Huizong, Hu Xuegang, Yang Hengyu, et al. A Comprehensive Clustering Method of Social Tags Based on LDA[J]. Journal of the China Society for Scientific and Technical Information, 2015, 34(2): 146-155.
doi: 10.3772/j.issn.1000-0135.2015.002.004
[3] Xu Tongyang, Yin Kai. Semantic Retrieval of Microblogging in the Background of Large Data[J]. Journal of Intelligence, 2017, 36(12): 173-179.
[4] Zhang Zhichang. Key Technologies of Reading Comprehension for Open-Domain[D]. Harbin: Harbin Institute of Technology, 2010.
[5] Hermann K M, Kočiský T, Grefenstette E, et al. Teaching Machines to Read and Comprehend[C]// Proceedings of the 28th International Conference on Neural Information Processing Systems. 2015: 1693-1701.
[6] Kadlec R, Schmid M, Bajgar O, et al. Text Understanding with the Attention Sum Reader Network[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. 2016: 908-918.
[7] Chen D, Bolton J, Manning C D. A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. 2016: 2358-2367.
[8] Dhingra B, Liu H, Yang Z, et al. Gated-Attention Readers for Text Comprehension[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. 2017: 1832-1846.
[9] Sordoni A, Bachman P, Trischler A, et al. Iterative Alternating Neural Attention for Machine Reading[OL]. arXiv Preprint, arXiv:1606.02245.
[10] Cui Y, Chen Z, Wei S, et al. Attention-over-Attention Neural Networks for Reading Comprehension[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. 2017: 593-602.
[11] Rajpurkar P, Zhang J, Lopyrev K, et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text[C]// Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016: 2383-2392.
[12] Wang W, Yang N, Wei F, et al. Gated Self-Matching Networks for Reading Comprehension and Question Answering[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. 2017: 189-198.
[13] Seo M, Kembhavi A, Farhadi A, et al. Bidirectional Attention Flow for Machine Comprehension[OL]. arXiv Preprint, arXiv:1611.01603.
[14] Gong Y, Bowman S R. Ruminating Reader: Reasoning with Gated Multi-Hop Attention[OL]. arXiv Preprint, arXiv:1704.07415.
[15] Zhang J, Zhu X, Chen Q, et al. Exploring Question Understanding and Adaptation in Neural-network Based Question Answering[OL]. arXiv Preprint, arXiv:1703.04617.
[16] Shen Y, Huang P, Gao J, et al. ReasoNet: Learning to Stop Reading in Machine Comprehension[C]// Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2017: 1047-1055.
[17] Hu M, Peng Y, Huang Z, et al. Mnemonic Reader for Machine Comprehension[OL]. arXiv Preprint, arXiv:1705.02798.
[18] Cui Y, Liu T, Chen Z, et al. Dataset for First Evaluation on Chinese Machine Reading Comprehension[OL]. arXiv Preprint, arXiv:1709.08299.
[19] Gu Mingyuan. Dictionary of Education[M]. Shanghai: Shanghai Educational Publishing House, 1998.
[20] Zhang Jie, Wei Wei. Saliency Extraction Based on Visual Attention Model[J]. Computer Technology and Development, 2010, 20(11): 109-113.
doi: 10.3969/j.issn.1673-629X.2010.11.027
[21] Zhang Jiajun, Zong Chengqing. Application of Neural Network Language Model in Statistical Machine Translation[J]. Technology Intelligence Engineering, 2017, 3(3): 21-28.
doi: 10.3772/j.issn.2095-915x.2017.03.004
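Copyright © 2015 Editorial Office of Data Analysis and Knowledge Discovery
Address: No. 33 Beisihuan Xilu, Zhongguancun, Haidian District, Beijing    Postal code: 100190
Tel/Fax: (010)82626611-6626, 82624938
E-mail: jishu@mail.las.ac.cn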