Please wait a minute...
Data Analysis and Knowledge Discovery  2017, Vol. 1 Issue (9): 8-15    DOI: 10.11925/infotech.2096-3467.2017.09.01
Orginal Article Current Issue | Archive | Adv Search |
Extracting Entity Relationship with Word Embedding Representation Features
Qin Zhang1,2(),Hongmei Guo1,Zhixiong Zhang1,3
1National Science Library, Chinese Academy of Sciences, Beijing 100190, China
2University of Chinese Academy of Sciences, Beijing 100049, China
3Wuhan Documentation and Information Center, Chinese Academy of Sciences, Wuhan 430071, China
Download: PDF(464 KB)   HTML ( 1
Export: BibTeX | EndNote (RIS)      

[Objective] This study explores the word embedding representation features for entity relationship extraction, aiming to add semantic message to the existing methods. [Methods] First, we used the feature characteristics at word embedding representation, the vocabulary and the grammar levels to extract relations using Naive Bayesian, Decision Tree and Random Forest models. Then, we obtained the optimal subset of the full features. [Results] The accuracy of the Decision Tree algorithm was 0.48 with full features, which was the best. The F1 score of Member-Collection (E2, E1) was 0.70, and the dependency could help us extract the relations. [Limitations] We need to improve the relation extraction results with small sample size and complex situation. The word vector training method could be further optimized. [Conclusions] This study proves the effectiveness of three types of features. And the word embedding representation level feature plays an important role to extract relations.

Key wordsRelation Extraction      Word Embedding Representation      Word2Vec     
Received: 15 June 2017      Published: 18 October 2017

Cite this article:

Qin Zhang,Hongmei Guo,Zhixiong Zhang. Extracting Entity Relationship with Word Embedding Representation Features. Data Analysis and Knowledge Discovery, 2017, 1(9): 8-15.

URL:     OR

[1] Bunescu R C, Mooney R J.Subsequence Kernels for Relation Extraction[C]//Proceeding of the 18th International Conference on Neural Information Processing Systems. Cambridge, USA: MIT Press, 2005: 171-178.
[2] Zelenko D, Aone C, Richardella A.Kernel Methods for Relation Extraction[J]. The Journal of Machine Learning Research, 2003, 3(3): 1083-1106.
[3] Culotta A, Sorensen J.Dependency Tree Kernels for Relation Extraction[C]//Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics. USA: ACL, 2004: 423-429.
[4] Bunescu R C, Mooney R J.A Shortest Path Dependency Kernel for Relation Extraction[C]// Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing. USA: ACL, 2005: 724-731.
[5] 郭剑毅, 陈鹏, 余正涛, 等. 基于多核融合的中文领域实体关系抽取[J]. 中文信息学报, 2016, 30(1): 24-29.
[5] (Guo Jianyi, Chen Peng, Yu Zhengtao, et al.Domain Specific Chinese Semantic Relation Extraction Based on Composite Kernel[J]. Journal of Chinese Information Processing, 2016, 30(1): 24-29.)
[6] Xiang Y, Wang X L, Zhang Y Y, et al.Distant Supervision for Relation Extraction via Group Selection[C]// Proceedings of the 22nd International Conference on Neural Information Processing (Part II). USA: Springer, 2015: 250-258.
[7] Mintz M, Bills S, Snow R, et al.Distant Supervision for Relation Extraction Without Labeled Data[C]// Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP. USA: ACL, 2009: 1003-1011.
[8] Banko M, Cafarella M J, Soderland S, et al.Open Information Extraction from the Web[C]// Proceedings of the 20th International Joint Conference on Artificial Intelligence. USA: Morgan Kaufmann Publishers, 2007: 2670-2676.
[9] Wu F, Weld D S.Open Information Extraction Using Wikipedia[C]// Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. USA: ACL, 2010: 118-127.
[10] Fader A, Soderland S, Etzioni O.Identifying Relations for Open Information Extraction[C]// Proceedings of the Conference on Empirical Methods in Natural Language Processing. USA: ACL, 2011: 1535-1545.
[11] Kambhatla N. Combining Lexical, Syntactic and Semantic Features with Maximum Entropy Models for Extracting Relations [C]// Proceedings of the ACL 2004 on Interactive Poster and Demonstration Sessions. USA: ACL, 2004: Article No. 22.
[12] Zhou G D, Su J, Zhang J, et al.Exploring Various Knowledge in Relation Extraction[C]// Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics. USA: ACL, 2005: 427-434.
[13] 高俊平, 张晖, 赵旭剑, 等. 面向维基百科的领域知识演化关系抽取[J]. 计算机学报, 2016, 39(10): 2088-2101.
[13] (Gao Junping, Zhang Hui, Zhao Xujian, et al.Evolutionary Relation Extraction for Domain Knowledge in Wikipedia[J]. Chinese Journal of Computers, 2016, 39(10): 2088-2101.)
[14] 甘丽新, 万常选, 刘德喜, 等. 基于句法语义特征的中文实体关系抽取[J].计算机研究与发展, 2016, 53(2): 284-302.
[14] (Gan Lixin, Wan Changxuan, Liu Dexi, et al.Chinese Named Entity Relation Extraction Based on Syntactic and Semantic Features[J]. Journal of Computer Research and Development, 2016, 53(2): 284-302.
[15] Mikolov T, Sutskever I, Chen K, et al.Distributed Representations of Words and Phrases and Their Compositionality[J]. Advances in Neural Information Processing Systems, 2013, 26: 3111-3119.
[16] Bengio Y, Ducharme R, Vincent P, et al.A Neural Probabilistic Language Model[J]. Journal of Machine Learning Research, 2003, 3(6): 1137-1155.
[17] Mikolov T, Kombrink S, Burget L.Extensions of Recurrent Neural Network Language Model[C]// Proceedings of 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). USA: IEEE, 2010: 1045-1048.
[18] GitHub [EB/OL]. [2017-05-16]..
[19] Google Code [EB/OL]. [2017-05-16]. .
[20] The Stanford Natural Language Group [EB/OL]. [2017-05- 16]. .
[21] Kononenko I.Estimating Attributes: Analysis and Extensions of RELIEF[C]// Proceedings of the European Conference on Machine Learning. USA: Springer, 1994: 171-182.
[22] Hall M A.Correlation-based Feature Subset Selection for Machine Learning [D]. New Zealand: The University of Waikato, 1998.
[1] Cuiqing Jiang,Yibo Guo,Yao Liu. Constructing a Domain Sentiment Lexicon Based on Chinese Social Media Text[J]. 数据分析与知识发现, 2019, 3(2): 98-107.
[2] Xinlei Li,Hao Wang,Xiaomin Liu,Sanhong Deng. Comparing Text Vector Generators for Weibo Short Text Classification[J]. 数据分析与知识发现, 2018, 2(8): 41-50.
[3] Yongbing Gao,Guipeng Yang,Di Zhang,Zhanfei Ma. Detecting Events from Official Weibo Profiles Based on Post Clustering with Burst Words[J]. 数据分析与知识发现, 2017, 1(9): 57-64.
[4] Tian Xia. Extracting Keywords with Modified TextRank Model[J]. 数据分析与知识发现, 2017, 1(2): 28-34.
[5] Ruilun Liu,Wenhao Ye,Ruiqing Gao,Mengjia Tang,Dongbo Wang. Research on Text Clustering Based on Requirements of Big Data Jobs[J]. 数据分析与知识发现, 2017, 1(12): 32-40.
[6] Jianhua Liu,Ying Wang,Zhixiong Zhang,Chuanxi Li. Extracting Semantic Knowledge from Plant Species Diversity Collections[J]. 数据分析与知识发现, 2017, 1(1): 37-46.
[7] Luo Wenxin,Chen Chong,Deng Siyi. Detecting Disease Associations with Word2Vec from Consumer Health Information[J]. 现代图书情报技术, 2016, 32(9): 78-87.
[8] Ning Jianfei,Liu Jiangzhen. Using Word2vec with TextRank to Extract Keywords[J]. 现代图书情报技术, 2016, 32(6): 20-27.
[9] Huang Xun, You Hongliang, Yu Yang. A Review of Relation Extraction[J]. 现代图书情报技术, 2013, 29(11): 30-39.
[10] Wang Xiuyan, Cui Lei. Overview of Semantic Relations Extraction Between Biomedical Entities by Key Verbs[J]. 现代图书情报技术, 2011, 27(9): 21-27.
[11] Liu Jianhua ,Zhang Zhixiong. Relation Extraction Based on Stanford Parser[J]. 现代图书情报技术, 2009, 25(5): 1-5.
[12] Miao Chen,Xiaozhong Liu,Jian Qin. Semantic Relation Extraction from Socially-generated Tags:A Methodology for Metadata Generation[J]. 现代图书情报技术, 2009, 3(3): 38-45.
[13] Xu Jian,Zhang Zhixiong,Wu Zhenxin. Review on Techniques of Entity Relation Extraction[J]. 现代图书情报技术, 2008, 24(8): 18-23.
  Copyright © 2016 Data Analysis and Knowledge Discovery   Tel/Fax:(010)82626611-6626,82624938