Data Analysis and Knowledge Discovery (数据分析与知识发现), 2019, Vol. 3, Issue 5: 86-92. DOI: 10.11925/infotech.2096-3467.2018.0818
Research Paper
Extracting Relationship of Agricultural Financial Texts with Attention Mechanism*
Yuemin Wu (吴粤敏), Ganggui Ding (丁港归), Bin Hu (胡滨)
College of Information Science and Technology, Nanjing Agricultural University, Nanjing 210095, China
Abstract

[Objective] This paper studies methods for automatically extracting relations from Chinese texts. [Methods] Using 678 annual reports published from 2015 to 2017 by 224 listed agricultural companies as the data source, we applied a Gated Recurrent Unit (GRU) algorithm based on a double attention mechanism to extract relations from Chinese text automatically. [Results] The final model reached an average accuracy of 78% on the agricultural financial text dataset. Compared with the Recurrent Neural Network (RNN) algorithm, it improved average accuracy by about 12%. [Limitations] The study covers only 224 listed agricultural companies; the range of agriculture-related enterprises studied needs to be expanded. [Conclusions] The model achieves good results on relation extraction from agriculture-related financial texts.

Keywords: Attention Mechanism; Relationship Extraction; Agricultural Finance
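
The abstract names the technique but does not describe its internals. As a rough illustration only, the sketch below assumes that "double attention" means word-level attention over GRU hidden states followed by sentence-level attention over several sentences mentioning the same entity pair; that reading is an assumption and is not confirmed by the abstract. All function names, dimensions, and random weights are hypothetical, biases are omitted, and the relation classifier and training loop are left out.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gru_step(x, h, p):
    """One GRU update in one common convention (biases omitted)."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    gated = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b)
            for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], gated))]
    # Interpolate between the previous state and the candidate state.
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def attend(vectors, query):
    """Softmax-weighted sum of vectors; scores are query . tanh(vector)."""
    scores = [sum(q * math.tanh(v) for q, v in zip(query, vec))
              for vec in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * vec[i] for w, vec in zip(weights, vectors))
              for i in range(dim)]
    return pooled, weights

def encode_sentence(word_vectors, params, word_query):
    """Run the GRU over the words, then pool states with word-level attention."""
    h = [0.0] * len(params["Uz"])
    states = []
    for x in word_vectors:
        h = gru_step(x, h, params)
        states.append(h)
    return attend(states, word_query)

def encode_bag(sentences, params, word_query, sent_query):
    """Pool sentence vectors with sentence-level attention."""
    sent_vecs = [encode_sentence(s, params, word_query)[0] for s in sentences]
    return attend(sent_vecs, sent_query)

# Toy data: hypothetical sizes and random (untrained) weights.
rng = random.Random(0)
EMB, HID = 4, 3
params = {
    name: [[rng.uniform(-0.1, 0.1)
            for _ in range(EMB if name.startswith("W") else HID)]
           for _ in range(HID)]
    for name in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")
}
word_query = [rng.uniform(-1, 1) for _ in range(HID)]
sent_query = [rng.uniform(-1, 1) for _ in range(HID)]
# A "bag" of two sentences (5 and 3 words) with random word embeddings.
bag = [[[rng.uniform(-1, 1) for _ in range(EMB)] for _ in range(n)]
       for n in (5, 3)]

bag_vector, sentence_weights = encode_bag(bag, params, word_query, sent_query)
print(len(bag_vector), sentence_weights)
```

In a full model, `bag_vector` would feed a softmax classifier over relation types, and the weights and attention queries would be learned rather than sampled at random.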
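Received: 2018-07-24
Funding: *This work is one of the research outcomes of the Jiangsu Province College Students' Innovation Training Program project "Research on Knowledge Visualization Applications in the Agricultural Enterprise Investment Domain" (No. 201810307075X) and the Nanjing Agricultural University Fundamental Research Funds for the Central Universities project "Research on Automatic Information Extraction Technology for Agricultural Knowledge Base Construction in the Big Data Environment" (No. SK2016016).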
Cite this article:
Yuemin Wu, Ganggui Ding, Bin Hu. Extracting Relationship of Agricultural Financial Texts with Attention Mechanism[J]. Data Analysis and Knowledge Discovery, 2019, 3(5): 86-92. DOI: 10.11925/infotech.2096-3467.2018.0818.
Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2018.0818