Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (7): 36-47     https://doi.org/10.11925/infotech.2096-3467.2020.1296
Research Paper
Extracting Financial Events with ELECTRA and Part-of-Speech
Chen Xingyue, Ni Liping, Ni Zhiwei
School of Management, Hefei University of Technology, Hefei 230009, China
Key Laboratory of Process Optimization & Intelligent Decision-making, Ministry of Education, Hefei University of Technology, Hefei 230009, China
Abstract

[Objective] To address blurred entity boundaries and inaccurate extraction in financial event extraction, this paper proposes a method based on the pre-trained ELECTRA model and part-of-speech (POS) features. [Methods] To strengthen the model's sensitivity to key financial entities, the method uses both the original semantic information of the corpus and its POS features: the corpus is passed through two pre-trained ELECTRA models and the outputs are fused to enrich the semantic representation. The fused representation is fed to a BiGRU, which captures long-distance contextual dependencies and outputs the raw sequence labels. A CRF layer then counters the label bias problem and produces the final financial event extraction. [Results] On the financial event dataset, the method reaches an F1 score of 70.96%, which is 20.74 percentage points higher than the classic BiLSTM-CRF extraction model. [Limitations] The dataset contains relatively few events, and the pre-trained model is large, so it is constrained by GPU/TPU memory. [Conclusions] The proposed model captures the relationships among financial event elements more comprehensively and improves financial event extraction.

Keywords: ELECTRA; Part-of-Speech; Financial Event Extraction; Pre-trained Model
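The role of the final CRF step can be illustrated with a minimal Viterbi decoder, which scores whole tag sequences using transition scores rather than picking the best tag per token. This is a sketch under stated assumptions, not the authors' implementation: the tag set, the emission scores (which in the described model would come from the BiGRU), and the transition scores are hypothetical toy values.

```python
from typing import Dict, List

def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[str, Dict[str, float]],
            tags: List[str]) -> List[str]:
    """Return the highest-scoring tag path given emission and transition scores."""
    # best[t][tag] = best score of any path that ends in `tag` at position t
    best = [dict(emissions[0])]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Hypothetical toy setup: three tags, three tokens.
TAGS = ["O", "B-ORG", "I-ORG"]
# All transitions are neutral except O -> I-ORG, which is invalid in BIO tagging.
TRANSITIONS = {p: {c: 0.0 for c in TAGS} for p in TAGS}
TRANSITIONS["O"]["I-ORG"] = -10.0
EMISSIONS = [
    {"O": 1.0, "B-ORG": 0.8, "I-ORG": 0.0},
    {"O": 0.1, "B-ORG": 0.2, "I-ORG": 1.0},
    {"O": 1.0, "B-ORG": 0.0, "I-ORG": 0.2},
]

greedy_path = [max(e, key=e.get) for e in EMISSIONS]  # per-token argmax
crf_path = viterbi(EMISSIONS, TRANSITIONS, TAGS)      # sequence-level decoding
```

In this toy case the per-token argmax yields the inconsistent sequence O, I-ORG, O, while sequence-level decoding recovers B-ORG, I-ORG, O because the invalid transition is penalized.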
Received: 2020-12-26      Published: 2021-04-19
Chinese Library Classification (ZTFLH): TP183
Funding: National Natural Science Foundation of China, Young Scientists Fund (71301041); National Natural Science Foundation of China, Major Research Plan Cultivation Project (91546108); National Natural Science Foundation of China, Young Scientists Fund (71701061)
Corresponding author: Ni Liping, ORCID: 0000-0002-7067-302X     E-mail: niliping@hfut.edu.cn
Cite this article:
Chen Xingyue, Ni Liping, Ni Zhiwei. Extracting Financial Events with ELECTRA and Part-of-Speech. Data Analysis and Knowledge Discovery, 2021, 5(7): 36-47.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2020.1296      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2021/V5/I7/36
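Fig.1  Overall model architecture
Fig.2  Example of the text preprocessing procedure
Fig.3  BERT model architecture
Fig.4  Improved pre-training task in ELECTRA
Fig.5  Example of POS tag preprocessing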
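| Extraction framework | Extraction result |
|---|---|
| Original text | 美锦集团持有的上市公司28.37亿股股份中,已有27.8亿股处于质押状态,占其持股的97.98%。 (Of the 2.837 billion shares of the listed company held by Meijin Group, 2.78 billion are pledged, accounting for 97.98% of its holdings.) |
| Event type | 质押 (pledge) |
| Event elements | "美锦集团" (Meijin Group): pledging company; "股份" (shares): collateral; "27.8亿" (2.78 billion): pledged quantity; "97.98%": pledge ratio |

Table 1  Event extraction output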
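The mapping from a tagged sentence to the element dictionary in Table 1 can be sketched as follows. This is a minimal illustration, assuming a character-level BIO tagging scheme; the excerpt does not state the authors' tagging scheme, the role names (`sub-org`, `collateral`, `number`, `proportion`) are borrowed from Table 10, and both helper functions are hypothetical.

```python
from typing import Dict, List, Tuple

def spans_to_bio(text: str, spans: List[Tuple[str, str]]) -> List[str]:
    """Build character-level BIO tags from (surface string, role) pairs."""
    tags = ["O"] * len(text)
    for surface, role in spans:
        begin = text.find(surface)  # first occurrence only; enough for this toy example
        tags[begin] = "B-" + role
        for i in range(begin + 1, begin + len(surface)):
            tags[i] = "I-" + role
    return tags

def bio_to_elements(chars: str, tags: List[str], event_type: str) -> Dict[str, Tuple[str, str]]:
    """Collect tagged spans into {span text: (event type, role)}."""
    elements: Dict[str, Tuple[str, str]] = {}
    start, role = None, None
    # A trailing "O" sentinel closes a span that runs to the end of the text.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and role == tag[2:]
        if role is not None and not continues:
            elements[chars[start:i]] = (event_type, role)
            start, role = None, None
        if tag.startswith("B-"):
            start, role = i, tag[2:]
    return elements

TEXT = "美锦集团持有的上市公司28.37亿股股份中,已有27.8亿股处于质押状态,占其持股的97.98%。"
SPANS = [("美锦集团", "sub-org"), ("股份", "collateral"),
         ("27.8亿", "number"), ("97.98%", "proportion")]

tags = spans_to_bio(TEXT, SPANS)
elements = bio_to_elements(TEXT, tags, event_type="质押")
```

Here `elements` maps each of the four spans from Table 1 to the pair of event type and role, in the same shape as the per-method outputs listed in Table 10.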
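| Event type | Event frame |
|---|---|
| Pledge (质押) | Trigger, pledging company, pledgor, pledgee company, pledgee, collateral, pledge date, pledge amount, pledged quantity, pledge ratio |
| Share/equity transfer (股份股权转让) | Trigger, transferring company, transferor, receiving company, transferee, transferred asset, transfer date, transaction amount, transferred quantity, transfer ratio, target company |
| Investment (投资) | Trigger, investing organization or entity, invested organization or entity, investment amount, date |
| Lawsuit (起诉) | Trigger, plaintiff (individual), plaintiff (company), defendant (individual), defendant (company), filing date |
| Share reduction (减持) | Trigger, reducing party, position of the reducing party, date, reduced shares as a percentage of personal holdings, reduced shares as a percentage of company shares, organization or entity of the reducing party |

Table 2  Event frames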
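| Event type | Pledge | Share/equity transfer | Investment | Lawsuit | Share reduction | Total |
|---|---|---|---|---|---|---|
| Number of events | 851 | 1,572 | 1,081 | 533 | 739 | 4,776 |

Table 3  Event counts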
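| Environment | Configuration |
|---|---|
| Operating system | Ubuntu |
| CPU | Intel Xeon E5-2620 2.10 GHz |
| GPU | TITAN X (12 GB) |
| Python | 3.6 |
| Memory | 128 GB |
| Deep learning framework | TensorFlow (1.14.0) + Keras (2.2.4) |

Table 4  Experimental environment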
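| Parameter | Value |
|---|---|
| Learning rate | 0.00002 |
| Maximum sequence length | 256 |
| Batch size | 8 |
| GRU units | 128 |
| Epochs | 30 |
| Dropout | 0.5 |
| Optimizer | Adam |

Table 5  Model parameters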
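For reuse in an experiment script, the values in Table 5 could be gathered into a single configuration object. The following is only a sketch: the class name, field names, and the `steps_per_epoch` helper are hypothetical, and the example count of 1,000 is an arbitrary placeholder, since the number of training sentences is not given here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as listed in Table 5 (field names are illustrative)."""
    learning_rate: float = 2e-5
    max_seq_length: int = 256
    batch_size: int = 8
    gru_units: int = 128
    epochs: int = 30
    dropout: float = 0.5
    optimizer: str = "adam"

def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Number of batches needed to cover all examples once (ceiling division)."""
    return -(-num_examples // batch_size)

config = TrainConfig()
# Placeholder dataset size, used only to show the arithmetic.
steps = steps_per_epoch(1000, config.batch_size)
total_steps = steps * config.epochs
```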
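| Model | P (%) | R (%) | F1 (%) |
|---|---|---|---|
| IDCNN-CRF | 72.10 | 33.14 | 45.41 |
| BiLSTM-CRF | 79.19 | 36.77 | 50.22 |
| BERT-IDCNN-CRF | 86.10 | 56.13 | 67.96 |
| BERT-BiLSTM-CRF | 87.03 | 55.81 | 68.01 |
| ELECTRA-POS-BiGRU-CRF | 87.56 | 59.64 | 70.96 |

Table 6  Comparison with baseline algorithms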
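The F1 column in Table 6 is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short check below recomputes it from the reported P and R values and derives the gain over BiLSTM-CRF; the variable names are illustrative, and the numbers are copied from the table.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)

# (P, R, reported F1), all in percent, copied from Table 6.
TABLE6 = {
    "IDCNN-CRF": (72.10, 33.14, 45.41),
    "BiLSTM-CRF": (79.19, 36.77, 50.22),
    "BERT-IDCNN-CRF": (86.10, 56.13, 67.96),
    "BERT-BiLSTM-CRF": (87.03, 55.81, 68.01),
    "ELECTRA-POS-BiGRU-CRF": (87.56, 59.64, 70.96),
}

recomputed = {name: round(f1(p, r), 2) for name, (p, r, _) in TABLE6.items()}
max_gap = max(abs(recomputed[name] - row[2]) for name, row in TABLE6.items())

# Gain of the proposed model over BiLSTM-CRF, in percentage points.
gain_over_bilstm_crf = round(
    TABLE6["ELECTRA-POS-BiGRU-CRF"][2] - TABLE6["BiLSTM-CRF"][2], 2
)
```

The recomputed values agree with the reported F1 column to within 0.01; the only difference is in the last row, presumably because the published P and R are themselves rounded. The gain evaluates to 20.74 percentage points, matching the abstract.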
| Model | P (%) | R (%) | F1 (%) |
|---|---|---|---|
| BiGRU-CRF | 75.33 | 35.34 | 48.11 |
| ELECTRA-CRF | 84.04 | 58.65 | 69.09 |
| ELECTRA-BiGRU-CRF | 85.59 | 58.93 | 69.80 |
| ELECTRA-POS-BiGRU-CRF | 87.56 | 59.64 | 70.96 |

Table 7  Ablation experiments
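One way to read the ablation results in Table 7 is as pairwise F1 differences between a configuration with a component and the matching configuration without it. The snippet below is a small illustrative calculation on the reported F1 values; the pairing of rows is an interpretation, not something stated in this excerpt.

```python
# Reported F1 (%) from Table 7.
ABLATION_F1 = {
    "BiGRU-CRF": 48.11,
    "ELECTRA-CRF": 69.09,
    "ELECTRA-BiGRU-CRF": 69.80,
    "ELECTRA-POS-BiGRU-CRF": 70.96,
}

def gain(with_component: str, without_component: str) -> float:
    """F1 difference in percentage points between two configurations."""
    return round(ABLATION_F1[with_component] - ABLATION_F1[without_component], 2)

electra_gain = gain("ELECTRA-BiGRU-CRF", "BiGRU-CRF")          # adding ELECTRA
bigru_gain = gain("ELECTRA-BiGRU-CRF", "ELECTRA-CRF")          # adding BiGRU
pos_gain = gain("ELECTRA-POS-BiGRU-CRF", "ELECTRA-BiGRU-CRF")  # adding POS features
```

On these numbers the pre-trained encoder accounts for by far the largest difference (21.69 points), with the POS branch adding 1.16 points and the BiGRU layer 0.71 points.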
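| Pre-trained model | Layers | Hidden units | Attention heads | Model size |
|---|---|---|---|---|
| BERT-base | 12 | 768 | 12 | 392 MB |
| NEZHA-base | 12 | 768 | 12 | 1,173 MB |
| ELECTRA-base | 12 | 768 | 12 | 102 MB |
| ALBERT-large | 24 | 1,024 | 16 | 64 MB |
| RoBERTa-base | 12 | 768 | 12 | 392 MB |

Table 8  Parameters of the pre-trained models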
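| Pre-trained model | P (%) | R (%) | F1 (%) |
|---|---|---|---|
| BERT-POS-BiGRU-CRF | 85.19 | 58.91 | 69.65 |
| NEZHA-POS-BiGRU-CRF | 86.44 | 58.99 | 70.13 |
| RoBERTa-POS-BiGRU-CRF | 84.62 | 60.11 | 70.29 |
| ALBERT-POS-BiGRU-CRF | 87.63 | 55.67 | 68.08 |
| ELECTRA-POS-BiGRU-CRF | 87.56 | 59.64 | 70.96 |

Table 9  Comparison with other pre-trained models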
Input sentence: 公告显示,中南建设本次质押股数2400万股,占其所持股份比例为1.19%,占公司总股本比例0.64%,质押日期自2019年11月15日至2021年4月18日,质权人为华夏银行股份有限公司南通分行。 (The announcement shows that Zhongnan Construction pledged 24 million shares this time, accounting for 1.19% of the shares it holds and 0.64% of the company's total share capital; the pledge runs from November 15, 2019 to April 18, 2021, and the pledgee is Hua Xia Bank Co., Ltd. Nantong Branch.)

Event type labels: 质押 = pledge; 股份股权转让 = share/equity transfer; 投资 = investment.

| Source | Extracted elements |
|---|---|
| Event elements (annotation) | '华夏银行股份有限公司南通分行': ('质押', 'obj-org'), '1.19%': ('质押', 'proportion'), '股份': ('质押', 'collateral'), '2400万': ('质押', 'number'), '质押': ('质押', 'trigger'), '中南建设': ('质押', 'sub-org') |
| IDCNN-CRF | '股份比例为1.19%,占公司': ('股份股权转让', 'obj-org'), '64%': ('投资', 'trigger') |
| BiLSTM-CRF | '股份比例为1.19%,占公司': ('投资', 'sub'), '64%': ('股份股权转让', 'trigger'), ',质押日': ('股份股权转让', 'target-company'), '期自': ('股份股权转让', 'collateral') |
| BERT-IDCNN-CRF | '中南建设': ('质押', 'sub-org'), '质押': ('质押', 'trigger'), '2400万': ('质押', 'number'), '1.19%': ('质押', 'proportion'), '0.64%': ('质押', 'proportion'), '华夏银行股份有限公司南通分行': ('质押', 'obj-org') |
| BERT-BiLSTM-CRF | '中南建设': ('质押', 'sub-org'), '质押': ('质押', 'trigger'), '股': ('质押', 'collateral'), '2400万': ('质押', 'number'), '股份': ('质押', 'collateral'), '1.19%': ('质押', 'proportion'), '2019年11月15日至2021年4月18日': ('质押', 'date'), '华夏银行股份有限公司南通分行': ('质押', 'obj-org') |
| ELECTRA-POS-BiGRU-CRF | '中南建设': ('质押', 'sub-org'), '质押': ('质押', 'trigger'), '2400万': ('质押', 'number'), '股份': ('质押', 'collateral'), '1.19%': ('质押', 'proportion'), '华夏银行股份有限公司南通分行': ('质押', 'obj-org') |

Table 10  Example comparison of financial event element extraction results
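The differences among the outputs in Table 10 can be quantified by scoring each method against the annotated elements for this one sentence. The sketch below assumes exact matching on the triple of text span, event type, and role; the scoring protocol used in the paper is not given in this excerpt, and the helper names are hypothetical.

```python
from typing import Set, Tuple

Element = Tuple[str, str, str]  # (text span, event type, role)

def pledge(*pairs: Tuple[str, str]) -> Set[Element]:
    """Build elements that all belong to the 质押 (pledge) event type."""
    return {(span, "质押", role) for span, role in pairs}

def prf(predicted: Set[Element], gold: Set[Element]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over element triples."""
    tp = len(predicted & gold)
    p = tp / len(predicted) if predicted else 0.0
    r = tp / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

GOLD = pledge(("华夏银行股份有限公司南通分行", "obj-org"), ("1.19%", "proportion"),
              ("股份", "collateral"), ("2400万", "number"),
              ("质押", "trigger"), ("中南建设", "sub-org"))

PREDICTIONS = {
    "IDCNN-CRF": {("股份比例为1.19%,占公司", "股份股权转让", "obj-org"),
                  ("64%", "投资", "trigger")},
    "BiLSTM-CRF": {("股份比例为1.19%,占公司", "投资", "sub"),
                   ("64%", "股份股权转让", "trigger"),
                   (",质押日", "股份股权转让", "target-company"),
                   ("期自", "股份股权转让", "collateral")},
    # Misses the collateral and adds a second proportion.
    "BERT-IDCNN-CRF": (GOLD - pledge(("股份", "collateral"))) | pledge(("0.64%", "proportion")),
    # Finds every annotated element plus two extra spans.
    "BERT-BiLSTM-CRF": GOLD | pledge(("股", "collateral"),
                                     ("2019年11月15日至2021年4月18日", "date")),
    "ELECTRA-POS-BiGRU-CRF": set(GOLD),
}

scores = {name: prf(pred, GOLD) for name, pred in PREDICTIONS.items()}
```

Under this exact-match assumption, the two models without a pre-trained encoder score zero on this sentence, BERT-IDCNN-CRF gets five of six elements right, BERT-BiLSTM-CRF recovers all six but with two extra predictions, and ELECTRA-POS-BiGRU-CRF matches the annotation exactly.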