Data Analysis and Knowledge Discovery, 2021, Vol. 5, Issue 8: 1-12. DOI: 10.11925/infotech.2096-3467.2021.0181
Extracting Relationship Among Military Domains with BERT and Relation Position Features
Ma Jiangwei1, Lv Xueqiang1, You Xindong1, Xiao Gang2, Han Junmei2
1Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science and Technology University, Beijing 100101, China
2National Key Laboratory of Complex System Simulation, Beijing 100101, China
Abstract  

[Objective] This article addresses the difficulty of extracting relations from military texts, where entity relations frequently overlap. [Methods] We used the BERT model to encode the input text and a hierarchical reinforcement learning approach to decode relations and their corresponding entities. We then incorporated relation position features into the entity decoding process to build a relation extraction model for the military domain. [Results] On the military weapon and equipment dataset, the model reached an F1 score of 82.2%, about 8 percentage points higher than the other methods. On the publicly available NYT10 and NYT10-sub datasets, F1 reached 71.8% and 69.0%, about 7 and 9 percentage points higher than the other methods, respectively. [Limitations] The method performs better on manually annotated datasets; more research is needed to improve its performance on noisy, distantly supervised datasets. [Conclusions] The HBP method effectively extracts relations in the military domain and shows some potential to generalize.

Key words: Relation Extraction; BERT; Relation Position Feature; Reinforcement Learning
Received: 24 February 2021      Published: 15 September 2021
ZTFLH:  TP391  
Fund: Natural Science Foundation of Beijing (4212020); Defense-related Science and Technology Key Lab Fund Project (6412006200404); Qin Xin Talents Cultivation Program, Beijing Information Science & Technology University (QXTCP B201908)
Corresponding Author: You Xindong, ORCID: 0000-0002-3351-4599, E-mail: youxindong@bistu.edu.cn

Cite this article:

Ma Jiangwei, Lv Xueqiang, You Xindong, Xiao Gang, Han Junmei. Extracting Relationship Among Military Domains with BERT and Relation Position Features. Data Analysis and Knowledge Discovery, 2021, 5(8): 1-12.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2021.0181     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2021/V5/I8/1

HBP Model Structure
Hierarchical Reinforcement Learning Process of HBP
High-Level Relation Detection Structure
Low-level Entity Extraction Structure
Item             Military weaponry dataset    NYT10
Relation types   16                           29
Training set     2,532                        66,816
Validation set   133                          3,516
Test set         297                          4,006
Statistics of the Datasets
Data sample: “贝劳伍德”号航空母舰1942年12月6日在纽约海军造船厂下水,1943年3月31日服役。
(English gloss: The aircraft carrier "Belleau Wood" was launched at the New York Naval Shipyard on December 6, 1942, and entered service on March 31, 1943.)
Launch date (下水时间): “/H_B 贝/H_I 劳/H_I 伍/H_I 德/H_I ”/H_I 号/H_I 航/H_I 空/H_I 母/H_I 舰/H_I
1/T_B 9/T_I 4/T_I 2/T_I 年/T_I 1/T_I 2/T_I 月/T_I 6/T_I 日/T_I 在/N
纽/N 约/N 海/N 军/N 造/N 船/N 厂/N 下/N 水/N ,/N 1/N
9/N 4/N 3/N 年/N 3/N 月/N 3/N 1/N 日/N 服/N 役/N
Commission date (服役时间): “/H_B 贝/H_I 劳/H_I 伍/H_I 德/H_I ”/H_I 号/H_I 航/H_I 空/H_I 母/H_I 舰/H_I
1/N 9/N 4/N 2/N 年/N 1/N 2/N 月/N 6/N 日/N 在/N
纽/N 约/N 海/N 军/N 造/N 船/N 厂/N 下/N 水/N ,/N 1/T_B
9/T_I 4/T_I 3/T_I 年/T_I 3/T_I 月/T_I 3/T_I 1/T_I 日/T_I 服/N 役/N
Example of Data Annotation
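The annotation example can be read back into entity spans with a few lines of code. The sketch below assumes, based only on the example itself, that H_B/H_I mark the beginning and inside of the head entity, T_B/T_I mark the tail entity, and N marks characters outside any entity. The function names parse_tagged and decode_entities are illustrative and do not come from the paper.

```python
from typing import Dict, List, Tuple


def parse_tagged(line: str) -> List[Tuple[str, str]]:
    """Split whitespace-separated 'char/TAG' tokens into (char, tag) pairs."""
    pairs = []
    for token in line.split():
        char, tag = token.rsplit("/", 1)  # rsplit keeps a literal '/' character intact
        pairs.append((char, tag))
    return pairs


def decode_entities(pairs: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Collect head (H) and tail (T) spans; assumes N means 'outside any entity'."""
    spans: Dict[str, List[str]] = {"H": [], "T": []}
    role, chars = None, []
    for char, tag in pairs:
        new_role = None if tag == "N" else tag.split("_")[0]
        starts_new = tag.endswith("_B") or new_role != role
        if starts_new and role is not None:
            spans[role].append("".join(chars))  # close the span that just ended
        if starts_new:
            role, chars = new_role, []
        if new_role is not None:
            chars.append(char)
    if role is not None:
        spans[role].append("".join(chars))  # close a span that runs to the end
    return spans


# The launch-date row of the annotation example, joined into one string.
tagged = (
    "“/H_B 贝/H_I 劳/H_I 伍/H_I 德/H_I ”/H_I 号/H_I 航/H_I 空/H_I 母/H_I 舰/H_I "
    "1/T_B 9/T_I 4/T_I 2/T_I 年/T_I 1/T_I 2/T_I 月/T_I 6/T_I 日/T_I 在/N "
    "纽/N 约/N 海/N 军/N 造/N 船/N 厂/N 下/N 水/N ,/N 1/N "
    "9/N 4/N 3/N 年/N 3/N 月/N 3/N 1/N 日/N 服/N 役/N"
)
entities = decode_entities(parse_tagged(tagged))
print(entities)
```

Under the launch-date relation the same sentence yields the ship name as head and the first date as tail; under the commission-date relation only the tail tags move to the second date. This shows how one sentence can carry overlapping relations that share a head entity.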
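Relation type                          Relation type
Ship-equipped with-Weapon              Ship-full-load displacement-Weight
Ship-carries-Aircraft                  Ship-beam-Width
Ship-belongs to-Country/Fleet          Ship-length-Length
Ship-succeeds-Ship                     Ship-cost-Price
Ship-class member-Ship                 Ship-commission date-Time
Ship-code name-String                  Ship-launch date-Time
Ship-hull number-String                Ship-speed-Speed
Ship-standard displacement-Weight      Ship-draft-Depth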
Types of Relationships for Military Weaponry Datasets
Parameter                   Value
State vector size           768
Hidden layer vector size    768
Position vector size        30
Batch size                  16
Learning rate               4e-5
β                           0.90
γ                           0.95
Experiment Parameters
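The parameter table gives a 768-dimensional state vector and a 30-dimensional position vector, but this page does not spell out how the two are combined. The sketch below shows one common way to do it: a relative-position embedding concatenated onto each token state. It is an assumption for illustration, not the authors' implementation. The anchor index (the token where a relation is detected), the clipping range MAX_DISTANCE, and the random initialization are all hypothetical.

```python
import random
from typing import List

STATE_DIM = 768      # state vector size, from the parameter table
POSITION_DIM = 30    # position vector size, from the parameter table
MAX_DISTANCE = 50    # hypothetical clipping range, not given in the source


def relative_positions(seq_len: int, anchor: int,
                       max_distance: int = MAX_DISTANCE) -> List[int]:
    """Signed offset of every token from the anchor, clipped to +/- max_distance."""
    return [max(-max_distance, min(max_distance, i - anchor))
            for i in range(seq_len)]


def build_position_table(max_distance: int = MAX_DISTANCE,
                         dim: int = POSITION_DIM,
                         seed: int = 0) -> List[List[float]]:
    """One small random vector per possible offset (a stand-in for learned embeddings)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for _ in range(2 * max_distance + 1)]


def add_position_features(states: List[List[float]], anchor: int,
                          table: List[List[float]],
                          max_distance: int = MAX_DISTANCE) -> List[List[float]]:
    """Concatenate each token state with the embedding of its offset from the anchor."""
    offsets = relative_positions(len(states), anchor, max_distance)
    return [state + table[offset + max_distance]
            for state, offset in zip(states, offsets)]


# Five dummy token states standing in for encoder outputs.
states = [[0.0] * STATE_DIM for _ in range(5)]
table = build_position_table()
features = add_position_features(states, anchor=2, table=table)
print(len(features), len(features[0]))
```

With these sizes each token would be represented by 768 + 30 = 798 values before entity tagging. Whether the paper concatenates, adds, or otherwise merges the two vectors is not stated on this page.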
Examples of Military Weaponry Datasets
Method          Precision    Recall    F1
NovelTagging    0.468        0.158     0.237
CopyR           0.336        0.206     0.256
GraphRel        0.438        0.305     0.360
HRL             0.538        0.336     0.414
BL              0.960        0.605     0.742
HBP             0.838        0.807     0.822
Experimental Results of Military Weaponry Datasets
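The F1 column can be checked against the precision and recall columns, assuming the standard definition F1 = 2PR / (P + R), the harmonic mean of precision and recall. The short sketch below recomputes it for each row of the military weaponry results; the variable names are illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the military weaponry results.
rows = {
    "NovelTagging": (0.468, 0.158, 0.237),
    "CopyR": (0.336, 0.206, 0.256),
    "GraphRel": (0.438, 0.305, 0.360),
    "HRL": (0.538, 0.336, 0.414),
    "BL": (0.960, 0.605, 0.742),
    "HBP": (0.838, 0.807, 0.822),
}

recomputed = {name: round(f1_score(p, r), 3) for name, (p, r, _) in rows.items()}
for name, (_, _, reported) in rows.items():
    print(f"{name:<13} reported={reported:.3f} recomputed={recomputed[name]:.3f}")
```

The recomputed values agree with the reported ones to within 0.001; the small differences come from the three-decimal rounding of precision and recall. The BL row is a useful illustration of why F1 is reported: its precision of 0.960 is the highest in the table, but its recall of 0.605 pulls F1 down to 0.742, below the 0.822 of HBP.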
Method          NYT10                            NYT10-sub
                Precision    Recall    F1        Precision    Recall    F1
SPTree          0.492        0.557     0.522     0.272        0.315     0.292
NovelTagging    0.593        0.381     0.464     0.256        0.237     0.246
CopyR           0.569        0.452     0.504     0.392        0.263     0.315
GraphRel        0.639        0.600     0.619     -            -         -
HRL             0.714        0.586     0.644     0.815        0.475     0.600
HBP             0.749        0.689     0.718     0.842        0.585     0.690
Experimental Results of NYT10 Datasets
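The improvements quoted in the abstract (about 8, 7, and 9 points) can be reproduced from the result tables if they are read as the F1 margin of HBP over the strongest competing method on each dataset. That reading is an assumption, since the page does not say which baseline the margins refer to. The sketch below computes the margins from the reported F1 values; the function name is illustrative.

```python
from typing import Dict, Tuple

# Reported F1 values copied from the two result tables.
f1 = {
    "Military weaponry": {"NovelTagging": 0.237, "CopyR": 0.256, "GraphRel": 0.360,
                          "HRL": 0.414, "BL": 0.742, "HBP": 0.822},
    "NYT10": {"SPTree": 0.522, "NovelTagging": 0.464, "CopyR": 0.504,
              "GraphRel": 0.619, "HRL": 0.644, "HBP": 0.718},
    "NYT10-sub": {"SPTree": 0.292, "NovelTagging": 0.246, "CopyR": 0.315,
                  "HRL": 0.600, "HBP": 0.690},
}


def margin_over_best_baseline(scores: Dict[str, float],
                              ours: str = "HBP") -> Tuple[str, float]:
    """Return the strongest other method and the F1 gap between it and `ours`."""
    baselines = {method: s for method, s in scores.items() if method != ours}
    best = max(baselines, key=baselines.get)
    return best, round(scores[ours] - baselines[best], 3)


margins = {dataset: margin_over_best_baseline(scores) for dataset, scores in f1.items()}
for dataset, (best, gap) in margins.items():
    print(f"{dataset}: HBP leads {best} by {gap * 100:.1f} F1 points")
```

On this reading the gaps are 8.0 points over BL on the military weaponry dataset, 7.4 points over HRL on NYT10, and 9.0 points over HRL on NYT10-sub, which matches the rounded figures in the abstract.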
Dataset              Method             Precision    Recall    F1
NYT10                HBP(+LSTM+POS)     0.707        0.589     0.643
                     HBP(+BERT)         0.710        0.661     0.685
                     HBP(+BERT+POS)     0.749        0.689     0.718
NYT10-sub            HBP(+LSTM+POS)     0.803        0.485     0.605
                     HBP(+BERT)         0.816        0.580     0.678
                     HBP(+BERT+POS)     0.842        0.585     0.690
Military weaponry    HBP(+LSTM+POS)     0.540        0.537     0.538
                     HBP(+BERT)         0.804        0.761     0.782
                     HBP(+BERT+POS)     0.838        0.807     0.822
Results of Ablation Experiment