Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (11): 15-25    DOI: 10.11925/infotech.2096-3467.2020.0299
Automatically Identifying Hypernym-Hyponym Relations of Domain Concepts with Patterns and Projection Learning
Wang Sili1,2, Zhu Zhongming1,2, Yang Heng1, Liu Wei1
1Literature and Information Center of Northwest Institute of Eco-Environment and Resources, Chinese Academy of Sciences, Lanzhou 730000, China
2University of Chinese Academy of Sciences, Beijing 100049, China
[Objective] This paper tries to automatically identify the hypernym-hyponym relations of domain concepts and establish their ontology. [Methods] First, we combined the traditional unsupervised pattern-based method with the more recent supervised projection learning method to automatically extract domain concepts. Then, we examined the new method with an empirical study. [Results] The proposed method could identify the hypernym sets of domain concepts. The identification accuracy was 0.88 in the medical field, 0.83 in the general field, and 0.85 on the benchmark dataset BLESS. [Limitations] More research is needed to reduce the weight of high-frequency top-level words and to improve the corpus quality. Some relations are still misidentified. [Conclusions] The proposed model could find hypernyms with different meanings for the same concept, and could also extract low-frequency words and named entities.

Key words: Hearst Pattern; Projection Learning; Word Embedding; Hypernym-Hyponym Relations; Domain Concept
Received: 09 April 2020      Published: 04 December 2020
CLC number (ZTFLH): TP391
Wang Sili, Zhu Zhongming, Yang Heng, Liu Wei. Automatically Identifying Hypernym-Hyponym Relations of Domain Concepts with Patterns and Projection Learning. Data Analysis and Knowledge Discovery, 2020, 4(11): 15-25.

Framework of Automatic Recognition of Hypernym-Hyponym Relationship Based on Pattern and Projection Learning
English pattern | Chinese pattern
Y such as X | Y例如/比如X
Y other than X | 除了Y之外的X / Y不仅是X
Y including X | Y包含X
Y especially X | Y尤其/特别是X
not all Y are X | 不全是/并不是所有的Y都是X
Y like X | Y类似X
Y for example X | Y例如/比如/示例X
Y which includes X | Y是那些包含X
X are also Y | X也是Y
X are all Y | X都是Y
not Y so much as X | 没有Y而是X
Y is a X | Y是一种/个/只…X
Recognition Mode of Hypernym Based on Extended Hearst Pattern
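As an illustration of how patterns of this kind can be applied, the following is a minimal sketch that matches a few of the English patterns with regular expressions. It assumes single-word concepts and plain string matching; the names `PATTERNS` and `extract_pairs` are hypothetical and are not taken from the paper, whose actual implementation is not described here.

```python
import re

# Assumption for this sketch: every concept is a single alphabetic word.
# A real system would match noun phrases found by a tagger or chunker.
WORD = r"[A-Za-z]+"

# A subset of the English patterns in the table; Y is the hypernym, X the hyponym.
PATTERNS = [
    rf"(?P<hyper>{WORD}),? such as (?P<hypo>{WORD})",     # Y such as X
    rf"(?P<hyper>{WORD}),? including (?P<hypo>{WORD})",   # Y including X
    rf"(?P<hyper>{WORD}),? especially (?P<hypo>{WORD})",  # Y especially X
    rf"(?P<hypo>{WORD}) are also (?P<hyper>{WORD})",      # X are also Y
    rf"(?P<hypo>{WORD}) are all (?P<hyper>{WORD})",       # X are all Y
]
COMPILED = [re.compile(p, re.IGNORECASE) for p in PATTERNS]


def extract_pairs(sentence):
    """Return (hyponym, hypernym) pairs matched by any pattern, lowercased."""
    pairs = []
    for pattern in COMPILED:
        for match in pattern.finditer(sentence):
            pairs.append((match.group("hypo").lower(), match.group("hyper").lower()))
    return pairs


print(extract_pairs("Proteins such as thymosin regulate immunity."))
print(extract_pairs("Bones, especially vertebrae, can fracture."))
```

Only the first hyponym after each cue phrase is captured; handling enumerations ("such as A, B and C") and the Chinese patterns would need additional rules.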
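Method | Settings | Result
① Pattern | Extended Hearst patterns: distributional hypothesis + co-hyponym recognition patterns | General domain: 0.38
② Projection learning | Word2Vec 100 dimensions, 10 training iterations, single projection (1), no negative sampling, no subsampling of high-frequency words | General domain: 0.54
③ Projection learning | Word2Vec 200 dimensions, 20 training iterations, multiple projections (24), negative sampling 15, high-frequency word subsampling threshold 1e-5 | General domain: 0.66
④ Pattern + projection learning | Extended Hearst patterns + 20 training iterations, Word2Vec 200 dimensions, multiple projections (24), negative sampling 15, high-frequency word subsampling threshold 1e-5 | General domain: 0.83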
Tests on Recognition of Hypernym-Hyponym Relationship
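To make the idea of projection learning concrete, here is a minimal sketch of the simplest configuration listed above: a single projection matrix, no negative sampling. It learns a matrix that maps a hyponym embedding close to its hypernym embedding, then ranks candidate hypernyms by cosine similarity. The 2-dimensional vectors are toy values standing in for Word2Vec embeddings, and the squared-error objective, the learning rate, and the function names are assumptions made for illustration, not the paper's implementation.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def train_projection(pairs, dim, lr=0.1, epochs=200):
    """Learn phi minimising ||phi x - y||^2 over (hyponym x, hypernym y) pairs by SGD."""
    phi = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for _ in range(epochs):
        for x, y in pairs:
            error = [p - t for p, t in zip(matvec(phi, x), y)]
            for i in range(dim):
                for j in range(dim):
                    # Gradient of the squared error with respect to phi[i][j].
                    phi[i][j] -= lr * 2.0 * error[i] * x[j]
    return phi


def rank_hypernyms(phi, x, candidates):
    """Sort candidate hypernym names by similarity to the projected hyponym."""
    projected = matvec(phi, x)
    return sorted(candidates, key=lambda name: cosine(projected, candidates[name]), reverse=True)


# Toy embeddings (hypothetical values, not taken from the paper).
hypernym_vecs = {"animal": [1.0, 1.0], "plant": [-1.0, 1.0]}
training_pairs = [
    ([1.0, 0.0], hypernym_vecs["animal"]),  # dog
    ([0.9, 0.1], hypernym_vecs["animal"]),  # cat
    ([0.0, 1.0], hypernym_vecs["plant"]),   # rose
    ([0.1, 0.9], hypernym_vecs["plant"]),   # tulip
]
phi = train_projection(training_pairs, dim=2)

# Rank hypernyms for an unseen hyponym whose vector lies near "dog" and "cat".
print(rank_hypernyms(phi, [0.95, 0.05], hypernym_vecs))
```

The multi-projection settings in the table would replace the single matrix with several matrices and add negative examples to the objective; neither is shown in this sketch.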
Medical domain concept | Hypernym set (Top 5)
Aneurysm | procedure; clinical finding; soft tissue lesion; anatomical structure; disease
Diagnostic lumbar puncture | clinical finding; disease; procedure; sickness; illness
Vertebra | body region; bone; body structure; fracture; anatomical structure
Thymosin | protein; biopolymer; enzyme; hydrolase; lyase
Pain assessment | pain; sickness; disease; illness; practice of medicine
Recognition Results of Hypernym-Hyponym Relationship in Medical Field
General domain concept | Hypernym set (Top 5)
Miscreant | person; bad person; wrongdoer; actor; politician
Queen Elizabeth | person; king; monarch; aristocrat; patrician
Microcontroller | electronic circuit; circuitry; pc board; computer chip; electrical device
Business concern | corporation; business organization; government agency; business firm; written agreement
Vegetarian (person / meat-free) | dessert; dish; recipe; food product; person
Recognition Results of Hypernym-Hyponym Relationship in General Field
