Automatically Identifying Hypernym-Hyponym Relations of Domain Concepts with Patterns and Projection Learning
Wang Sili1,2, Zhu Zhongming1,2, Yang Heng1, Liu Wei1
1 Literature and Information Center of Northwest Institute of Eco-Environment and Resources, Chinese Academy of Sciences, Lanzhou 730000, China
2 University of Chinese Academy of Sciences, Beijing 100049, China
Abstract [Objective] This paper aims to automatically identify the hypernym-hyponym relations of domain concepts and to establish a domain ontology. [Methods] First, we combined the traditional unsupervised pattern-based method with a supervised projection-learning method to automatically extract the hypernym-hyponym relations of domain concepts. Then, we examined the new method with an empirical study. [Results] The proposed method can identify the hypernym sets of domain concepts. Its identification accuracy on the medical corpus, the general-domain corpus, and the benchmark dataset BLESS was 0.88, 0.83, and 0.85, respectively. [Limitations] More research is needed to reduce the weight of high-frequency top-level words and to improve corpus quality, and some relations are still misidentified. [Conclusions] The proposed model can find hypernyms for different senses of the same concept, and it can also handle low-frequency words and named entities.
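As an illustration only, the sketch below outlines how the two components described in the abstract, Hearst-style pattern matching and projection learning over word embeddings, might be combined; it is not the authors' implementation. The single "such as" pattern, the mean-squared-error training objective, and all names (extract_candidate_pairs, train_projection, rank_hypernyms) are hypothetical simplifications, and pre-trained word vectors (e.g., from word2vec) are assumed to be available as a dictionary mapping words to NumPy arrays.

import re
import numpy as np
import torch
import torch.nn as nn

# Step 1: unsupervised pattern-based extraction. A single Hearst-style pattern
# ("HYPERNYM such as HYPONYM") is used here; a real pattern set would be larger.
SUCH_AS = re.compile(r"\b(\w+) such as (\w+)\b")

def extract_candidate_pairs(sentences):
    """Return (hyponym, hypernym) candidate pairs matched by the pattern."""
    pairs = []
    for sent in sentences:
        for m in SUCH_AS.finditer(sent.lower()):
            hypernym, hyponym = m.group(1), m.group(2)
            pairs.append((hyponym, hyponym := hyponym) and (hyponym, hypernym))
    return pairs

# Step 2: supervised projection learning. Fit a matrix Phi so that
# Phi @ v(hyponym) lies close to v(hypernym) in the embedding space.
class Projection(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.phi = nn.Linear(dim, dim, bias=False)

    def forward(self, hypo_vecs):
        return self.phi(hypo_vecs)

def train_projection(pairs, vectors, dim, epochs=200, lr=1e-2):
    """pairs: list of (hyponym, hypernym); vectors: dict word -> np.ndarray of shape (dim,)."""
    kept = [(h, g) for h, g in pairs if h in vectors and g in vectors]
    x = torch.tensor(np.stack([vectors[h] for h, _ in kept]), dtype=torch.float32)
    y = torch.tensor(np.stack([vectors[g] for _, g in kept]), dtype=torch.float32)
    model = Projection(dim)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
    return model

# Step 3: rank hypernym candidates for a new concept by cosine similarity
# between its projected vector and each candidate's vector.
def rank_hypernyms(model, concept, vectors, candidates, top_k=5):
    with torch.no_grad():
        proj = model(torch.tensor(vectors[concept], dtype=torch.float32)).numpy()
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    scored = [(c, cosine(proj, vectors[c])) for c in candidates if c in vectors]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]

Under these assumptions, the pattern-extracted pairs serve as (noisy) training data for the projection matrix, and the cosine-ranked candidates form the hypernym set returned for each domain concept.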
Received: 09 April 2020
Published: 04 December 2020
Corresponding Author:
Wang Sili
E-mail: wangsl@llas.ac.cn