Extracting Entity Relationship with Word Embedding Representation Features
Zhang Qin1,2, Guo Hongmei1, Zhang Zhixiong1,3
1 National Science Library, Chinese Academy of Sciences, Beijing 100190, China
2 University of Chinese Academy of Sciences, Beijing 100049, China
3 Wuhan Documentation and Information Center, Chinese Academy of Sciences, Wuhan 430071, China
Abstract [Objective] This study explores word embedding representation features for entity relation extraction, aiming to add semantic information to existing methods. [Methods] First, we combined features at the word embedding representation, lexical, and syntactic levels to extract relations with Naive Bayes, Decision Tree, and Random Forest models. Then, we selected the optimal subset of the full feature set. [Results] With the full feature set, the Decision Tree algorithm achieved the best accuracy, 0.48. The F1 score for Member-Collection (E2, E1) was 0.70, and dependency information helped extract the relations. [Limitations] Relation extraction results for small samples and complex cases need improvement, and the word vector training method could be further optimized. [Conclusions] This study demonstrates the effectiveness of the three types of features; the word embedding representation features play an important role in relation extraction.
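The abstract does not say how the embedding-level features were constructed or how F1 was computed per relation type. The following is a minimal sketch under stated assumptions: the toy vectors in `TOY_EMBEDDINGS`, the helper names `embedding_features` and `f1_score`, and the choice of concatenation, element-wise difference, and cosine similarity are all illustrative and not taken from the paper.

```python
import math

# Hypothetical 3-dimensional embeddings for illustration only; a real
# system would load vectors trained on a large corpus.
TOY_EMBEDDINGS = {
    "player": [0.9, 0.1, 0.3],
    "team": [0.8, 0.2, 0.4],
    "engine": [0.1, 0.9, 0.2],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def embedding_features(e1, e2, embeddings):
    """Assumed feature layout: both entity vectors, their element-wise
    difference, and their cosine similarity, as one flat list."""
    v1, v2 = embeddings[e1], embeddings[e2]
    diff = [a - b for a, b in zip(v1, v2)]
    return v1 + v2 + diff + [cosine(v1, v2)]


def f1_score(gold, predicted, label):
    """F1 for a single relation label from parallel gold/predicted lists."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Feature vector for one candidate entity pair (3 + 3 + 3 + 1 values).
features = embedding_features("player", "team", TOY_EMBEDDINGS)

# Per-label F1 on a made-up set of four predictions.
LABEL = "Member-Collection(e2,e1)"
gold = [LABEL, LABEL, "Other", LABEL]
pred = [LABEL, "Other", "Other", LABEL]
score = f1_score(gold, pred, LABEL)
```

A vector like `features` could then be joined with lexical and syntactic features and passed to any of the three classifiers named in the abstract; the per-label `f1_score` mirrors how a figure such as the 0.70 for Member-Collection (E2, E1) is typically reported.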
Received: 15 June 2017
Published: 18 October 2017