Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (2/3): 289-297    DOI: 10.11925/infotech.2096-3467.2021.0969
Extracting Weapon Attributes Based on Word Completion
Ding Shengchun, You Weijing, Wang Xiaoying
School of Economics and Management, Nanjing University of Science and Technology, Nanjing 210094, China
Abstract  

[Objective] This paper addresses a limitation of extraction based on dependency syntactic relations, which can extract only single-noun attribute words for military equipment. [Methods] First, we analyzed the features of texts describing the technology and performance of weapons and equipment. Second, we wrote regular expressions to obtain the attribute values. Third, we extracted the attribute words based on dependency parsing. Finally, we completed the attribute words using part-of-speech information. [Results] We evaluated the method on military news data sets. The accuracy and recall for extracting open-source attribute words reached 91.53% and 72.78%. The accuracy of attribute word completion reached 96.95%, and the accuracy for each category of attribute words was higher than 90%. [Limitations] This paper did not try to extract weapon attributes such as the country of origin and the state of service. [Conclusions] The proposed method can effectively extract explicit attribute words.

Key words: Weapon Equipment; Attribute Extraction; Attribute Word Completion; Dependency Syntax
Received: 31 August 2021      Published: 14 April 2022
ZTFLH:  E91  
Fund: Social Science Fund of Jiangsu Province (20TQB004)
Corresponding Author: Ding Shengchun, ORCID: 0000-0002-4269-021X, E-mail: todingding@163.com

Cite this article:

Ding Shengchun, You Weijing, Wang Xiaoying. Extracting Weapon Attributes Based on Word Completion. Data Analysis and Knowledge Discovery, 2022, 6(2/3): 289-297.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2021.0969     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2022/V6/I2/3/289

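Method | Advantages | Disadvantages
Trigger words or rules | High accuracy when the trigger-word list and the rules are well designed | Results depend heavily on the quality of the trigger-word list and rule templates; tedious to build; low recall
Named entity recognition combined with classification | Little manual effort; reasonably portable | Results depend on both the named entity recognition and the relation classification
Attribute extraction recast as relation extraction | Extracts attributes in a single pass; results depend only on the annotated training data and the choice of extraction model | Extraction performance is poor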
Attribute Extraction Method Comparison
Research Framework of Weapon and Equipment Attribute Extraction
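No. | Pattern | Type | Example
1 | [number]+[unit] | Exact | 该系统探测范围达到150 km,追踪范围达到120 km,射高达到27 km。 (The system's detection range reaches 150 km, its tracking range 120 km, and its firing altitude 27 km.)
2 | [number]+[至/到/-]+[number]+[unit] | Range | 11月将试射射程3 000至4 000公里的弹道导弹。 (A ballistic missile with a range of 3,000 to 4,000 km will be test-fired in November.)
3 | [number]+[unit]+[至/到/-]+[number]+[unit] | Range | 生产或试验射程在500公里至5 500公里的陆基巡航导弹。 (Producing or testing land-based cruise missiles with a range of 500 km to 5,500 km.)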
Attribute Value Category
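The three value patterns in the table above can be sketched as regular expressions. This is a minimal illustration, not the authors' actual expressions, which are not shown on this page. The unit list `UNITS`, the helper name `extract_values`, and the decision to try the longer range patterns before the exact pattern are all assumptions.

```python
import re

# Assumed unit vocabulary; the paper's real unit list is not given on this page.
UNITS = r"(?:km|公里|千米|米|吨)"
# A number, optionally with space/comma thousands separators and decimals ("3 000", "38.2").
NUM = r"(?:\d{1,3}(?:[ ,]\d{3})+|\d+)(?:\.\d+)?"
# Separators from the table: 至 / 到 (both "to") or a hyphen.
SEP = r"(?:至|到|-)"

# Longer patterns come first so a range is not split into two exact values.
PATTERN = re.compile(
    rf"(?P<range_units>{NUM}\s*{UNITS}\s*{SEP}\s*{NUM}\s*{UNITS})"  # pattern 3
    rf"|(?P<range>{NUM}\s*{SEP}\s*{NUM}\s*{UNITS})"                 # pattern 2
    rf"|(?P<exact>{NUM}\s*{UNITS})"                                 # pattern 1
)


def extract_values(text):
    """Return (matched text, type) pairs, where type is 'exact' or 'range'."""
    results = []
    for match in PATTERN.finditer(text):
        kind = "exact" if match.lastgroup == "exact" else "range"
        results.append((match.group(0), kind))
    return results


if __name__ == "__main__":
    for sentence in (
        "该系统探测范围达到150 km,追踪范围达到120 km,射高达到27 km。",
        "11月将试射射程3 000至4 000公里的弹道导弹。",
        "生产或试验射程在500公里至5 500公里的陆基巡航导弹。",
    ):
        print(extract_values(sentence))
```

Because "月" is not in the assumed unit list, the "11" in "11月" (November) is not picked up as a value; a real unit vocabulary would need the same kind of curation.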
Examples of Dependency Syntactic Relations
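No. | Attribute indicator | No. | Attribute indicator
1 | 不超过 (not exceeding) | 10 |
2 | 仅仅 (only) | 11 | 不到 (less than)
3 | 提高至 (raised to) | 12 |
4 | 高达 (as high as) | 13 | 超过 (exceeding)
5 | 提高到 (raised to) | 14 |
6 |  | 15 |
7 | 减少至 (reduced to) | 16 | 达到 (reaching)
8 |  | 17 |
9 | 较少到 (reduced to) |  |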
Attribute Indicator
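No. | Dependency between attribute word and attribute value | Characteristics | Example
1 | Nominal subject (nsubj) | The sentence has a "subject + predicate" structure; when the attribute value is the predicate (the core of the sentence), the attribute word is the subject | 该舰长232米 (The ship is 232 m long)
2 | Numeric modifier (nummod) | When the attribute description modifies the subject or object, the attribute value is the numeric modifier of the attribute word | 可搭载10吨级有效载荷 (Can carry a 10-ton-class payload)
3 | Associative modifier (assmod) | When the attribute description serves as the subject or object of the sentence, the attribute value is the associative modifier of the attribute word | 拥有超过150 km的射程 (Has a range of more than 150 km)
4 | Dependent (dep) | When the attribute value stands in a dep relation to a noun, that noun is usually the attribute word | “烈火-5”发射重量50吨 (The Agni-5 has a launch weight of 50 tons)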
Dependency Between Attribute Words and Attribute Values
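No. | Dependency between attribute word and attribute indicator | Dependency between attribute indicator and attribute value | Characteristics | Example
1 | Nominal subject (nsubj) | Numeric object (range) | When the attribute word is the sentence subject and the nominal subject of the attribute indicator (the core predicate), the attribute value is very likely the numeric object of the indicator | 这种高超音速导弹的飞行速度高达20倍音速 (This hypersonic missile's flight speed is as high as 20 times the speed of sound)
2 | Nominal subject (nsubj) | Attributive (attr) | When the attribute word is the sentence subject and the nominal subject of the attribute indicator (the core predicate), and the indicator is a word such as “为” or “是” (both "is") that points to an attribute, the attribute value stands in an attr relation to the indicator | “烈火-5”导弹最大射程仅为5500千米 (The Agni-5 missile's maximum range is only 5,500 km)
3 | Nominal subject (nsubj) | Dependent (dep) | When the attribute word is the nominal subject of the attribute indicator, the attribute value may also stand in a dep relation to the indicator | 其中40N6导弹的射程约400公里 (Among them, the 40N6 missile has a range of about 400 km)
4 | Topic (top) | Attributive (attr) | These two relations occur as a pair in the pattern "attribute word + '为/是' + attribute value", similar to No. 2 | 预计最大起飞重量为38.2吨 (The maximum take-off weight is expected to be 38.2 tons)
5 | Dependent (dep) | Numeric object (range) | When the attribute value is the numeric object of the attribute indicator, the attribute word may also stand in a dep relation to the indicator | 日本“苍龙”级的潜航排水量达4200吨 (Japan's Soryu class has a submerged displacement of 4,200 tons)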
Dependency Between Attribute Words and Attribute Indicators, Attribute Indicators and Attribute Values
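The five relation pairs in the table above lend themselves to a simple lookup over dependency arcs. The sketch below is only an illustration under stated assumptions: the arcs are hand-written `(head, relation, dependent)` triples rather than real parser output, the attribute indicator is assumed to be the head of both arcs, and the function name `extract_pairs` and the `nn` label are hypothetical.

```python
from collections import defaultdict

# (attribute word -> indicator, indicator -> attribute value) relation pairs,
# taken row by row from the table above.
RULE_PAIRS = {
    ("nsubj", "range"),
    ("nsubj", "attr"),
    ("nsubj", "dep"),
    ("top", "attr"),
    ("dep", "range"),
}


def extract_pairs(arcs, indicators, values):
    """Find (attribute word, indicator, attribute value) triples.

    arcs:       iterable of (head, relation, dependent) triples (assumed format)
    indicators: set of attribute indicator tokens, e.g. {"高达"}
    values:     set of attribute value tokens already found by the value step
    """
    children = defaultdict(list)
    for head, relation, dependent in arcs:
        children[head].append((relation, dependent))

    found = []
    for indicator in indicators:
        deps = children.get(indicator, [])
        for word_rel, word in deps:
            if word in values:
                continue  # a value token cannot be the attribute word
            for value_rel, value in deps:
                if value in values and (word_rel, value_rel) in RULE_PAIRS:
                    found.append((word, indicator, value))
    return found


if __name__ == "__main__":
    # Hand-written arcs for "飞行速度高达20倍音速"; not produced by a parser.
    arcs = [
        ("高达", "nsubj", "速度"),
        ("高达", "range", "20倍音速"),
        ("速度", "nn", "飞行"),  # assumed label for the noun modifier
    ]
    print(extract_pairs(arcs, {"高达"}, {"20倍音速"}))
```

In this toy parse the rule returns only the head noun 速度 (speed), not 飞行速度 (flight speed), which is the single-noun limitation that the completion step is meant to address.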
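Dependency between attribute word and attribute value | Identified | Correct | Accuracy
Nominal subject (nsubj) | 16 | 16 | 100%
Numeric modifier (nummod) | 28 | 23 | 82.14%
Associative modifier (assmod) | 23 | 16 | 69.57%
Dependent (dep) | 51 | 49 | 96.08%
Total | 118 | 104 | 88.14%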
Attribute Word Extraction Result(Based on 4 Kinds of Dependencies)
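Dependency between attribute word and attribute indicator | Dependency between attribute indicator and attribute value | Identified | Correct | Accuracy
Nominal subject (nsubj) | Numeric object (range) | 83 | 78 | 93.98%
Nominal subject (nsubj) | Attributive (attr) | 17 | 16 | 94.12%
Nominal subject (nsubj) | Dependent (dep) | 17 | 15 | 88.24%
Topic (top) | Attributive (attr) | 34 | 33 | 97.06%
Dependent (dep) | Numeric object (range) | 26 | 24 | 92.31%
Total | | 177 | 166 | 93.79%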
Attribute Word Extraction Result (Based on 5 Sets of Dependencies)
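The accuracy column in the table above appears to be the ratio of correct to identified attribute words. A short check, assuming that definition, reproduces the reported figures; the counts are copied from the table, and the helper name `accuracy` is illustrative.

```python
def accuracy(identified, correct):
    """Percentage of identified attribute words that were correct, to 2 decimals."""
    return round(100 * correct / identified, 2)


# (identified, correct) counts copied from the table above.
rows = {
    "nsubj + range": (83, 78),
    "nsubj + attr": (17, 16),
    "nsubj + dep": (17, 15),
    "top + attr": (34, 33),
    "dep + range": (26, 24),
}

if __name__ == "__main__":
    for name, (identified, correct) in rows.items():
        print(name, accuracy(identified, correct))

    # The totals are the column sums of the five rows.
    total_identified = sum(i for i, _ in rows.values())
    total_correct = sum(c for _, c in rows.values())
    print("total", total_identified, total_correct,
          accuracy(total_identified, total_correct))
```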
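Attribute word category | Total | Correct after completion | Accuracy
Single noun | 149 | 149 | 100%
Adjective + noun | 75 | 70 | 93.33%
Verb + noun | 14 | 13 | 92.86%
Noun + noun | 57 | 54 | 94.74%
Total | 295 | 286 | 96.95%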
Attribute Word Completion Result
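No. | Original attribute word | Completed attribute word | No. | Original attribute word | Completed attribute word
1 | 深度 (depth) | 潜航深度 (diving depth) | 10 | 距离 (distance) | 最大距离 (maximum distance)
2 | 速度 (speed) | 飞行速度 (flight speed) | 11 | 距离 (distance) | 飞行距离 (flight distance)
3 | 航程 (range) | 总航程 (total range) | 12 | 速度 (speed) | 最大速度 (maximum speed)
4 | 射程 (firing range) | 最大射程 (maximum firing range) | 13 | 里程 (mileage) | 续航里程 (cruising range)
5 | 距离 (distance) | 拦截距离 (interception distance) | 14 | 射程 (firing range) | 最远射程 (longest firing range)
6 | 距离 (distance) | 探测距离 (detection distance) | 15 | 高度 (altitude) | 飞行高度 (flight altitude)
7 | 当量 (yield) | 爆炸当量 (explosive yield) | 16 | 速度 (speed) | 移动速度 (movement speed)
8 | 面积 (area) | 水域面积 (water area) | 17 | 推力 (thrust) | 发动机推力 (engine thrust)
9 | 潜深 (diving depth) | 最大潜深 (maximum diving depth) | 18 | 荷载 (load) | 商业荷载 (commercial payload)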
Example of Attribute Word Completion Result
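One simple way to picture the part-of-speech completion is to prepend the token immediately before a single-noun attribute word when that token is an adjective, verb, or noun, which matches the three compound categories in the results. The sketch below assumes this reading; it is not the authors' implementation. The tag names (`a`, `v`, `n`), the hand-tagged tokens, and the function name `complete_attribute_word` are hypothetical.

```python
# Tags allowed to extend an attribute word: adjective, verb, noun (assumed tag set).
MODIFIER_TAGS = {"a", "v", "n"}


def complete_attribute_word(tokens, index):
    """Prepend the preceding token when its POS tag marks a modifier.

    tokens: list of (word, tag) pairs for one sentence (hand-tagged here)
    index:  position of the single-noun attribute word found by the parser rules
    """
    word, _ = tokens[index]
    if index > 0:
        previous_word, previous_tag = tokens[index - 1]
        if previous_tag in MODIFIER_TAGS:
            return previous_word + word
    return word


if __name__ == "__main__":
    # Hand-tagged tokens for "导弹的飞行速度高达20倍音速"; not tagger output.
    sentence = [("导弹", "n"), ("的", "u"), ("飞行", "v"),
                ("速度", "n"), ("高达", "v"), ("20倍音速", "m")]
    print(complete_attribute_word(sentence, 3))
```

A rule this simple would also attach an unrelated preceding noun, for example the weapon name itself, so a practical version would need additional constraints that this page does not describe.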