New Technology of Library and Information Service  2016, Vol. 32 Issue (1): 73-80    DOI: 10.11925/infotech.1003-3513.2016.01.11
Original Article
Study on Construction of Domain Terminology Taxonomic Relation
Hui Zhu, Jianlin Yang, Hao Wang
School of Information Management, Nanjing University, Nanjing 210023, China; Jiangsu Key Laboratory of Data Engineering and Knowledge Services, Nanjing 210023, China
Abstract  

[Objective] This study examines how to derive taxonomic relations among terms from unstructured Chinese domain text. [Methods] Using Digital Library domain texts from CNKI, we build a term hierarchy in four steps: term extraction, construction of a term Vector Space Model, BIRCH clustering, and cluster tag assignment. [Results] We obtain the taxonomic relations among terms in the Digital Library domain and evaluate their quality. Clustering accuracy reaches up to 80.88%, and cluster tag extraction accuracy reaches up to 89.71%. [Limitations] Effectiveness was evaluated by random sampling and compared against only one other method. [Conclusions] For constructing term taxonomic relations, the BIRCH algorithm shows a clear advantage over K-means clustering and performs better in both execution and clustering effectiveness.
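
The pipeline named in the abstract (term vectors, BIRCH clustering, cluster tags) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy corpus of already-extracted terms, represents each term by its per-document counts, and replaces the full BIRCH CF-tree with a single flat pass over clustering features CF = (N, LS, SS). The names `term_vectors`, `flat_birch`, and `label`, the threshold of 0.6, and the "most frequent member" tagging heuristic are all hypothetical choices made for illustration.

```python
import math

# Toy corpus (hypothetical): each document is a list of extracted terms.
docs = [
    ["digital library", "metadata", "dublin core"],
    ["metadata", "dublin core", "digital library"],
    ["information retrieval", "search engine", "digital library"],
    ["search engine", "information retrieval"],
]


def term_vectors(docs):
    """Simple term VSM: one dimension per document, value = term count."""
    terms = sorted({t for d in docs for t in d})
    return {t: [float(d.count(t)) for d in docs] for t in terms}


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class CF:
    """BIRCH clustering feature: point count N, linear sum LS, squared sum SS."""

    def __init__(self, dim):
        self.n = 0
        self.ls = [0.0] * dim
        self.ss = 0.0
        self.members = []

    def add(self, name, vec):
        # CFs are additive, so absorbing a point is a constant-size update.
        self.n += 1
        self.ls = [a + b for a, b in zip(self.ls, vec)]
        self.ss += sum(x * x for x in vec)
        self.members.append(name)

    def centroid(self):
        return [x / self.n for x in self.ls]

    def radius_if_added(self, vec):
        # Radius after a tentative insert: R^2 = SS/N - ||LS/N||^2.
        n = self.n + 1
        ls = [a + b for a, b in zip(self.ls, vec)]
        ss = self.ss + sum(x * x for x in vec)
        r2 = ss / n - sum((x / n) ** 2 for x in ls)
        return math.sqrt(max(r2, 0.0))


def flat_birch(vectors, threshold):
    """Single-level simplification of BIRCH: no CF-tree, no rebuild phase."""
    clusters = []
    for name, vec in vectors.items():
        nearest = min(
            clusters, key=lambda c: distance(c.centroid(), vec), default=None
        )
        if nearest is not None and nearest.radius_if_added(vec) <= threshold:
            nearest.add(name, vec)
        else:
            fresh = CF(len(vec))
            fresh.add(name, vec)
            clusters.append(fresh)
    return clusters


def label(cluster, vectors):
    """Assumed tagging heuristic: the member with the highest total count."""
    return max(cluster.members, key=lambda t: sum(vectors[t]))


if __name__ == "__main__":
    vectors = term_vectors(docs)
    for cluster in flat_birch(vectors, threshold=0.6):
        print(label(cluster, vectors), "->", cluster.members)
```

Because each cluster is summarized by a fixed-size CF, the terms are scanned only once, which is the property usually cited for BIRCH's speed relative to iterative methods such as K-means. A production setup would more likely use an existing library implementation and a real segmentation tool for term extraction.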

Keywords: Terminology; Taxonomic relation; Ontology; Ontology learning; Clustering
Received: 19 June 2015      Published: 04 February 2016

Cite this article:

Hui Zhu, Jianlin Yang, Hao Wang. Study on Construction of Domain Terminology Taxonomic Relation. New Technology of Library and Information Service, 2016, 32(1): 73-80.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2016.01.11     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2016/V32/I1/73

  Copyright © 2016 Data Analysis and Knowledge Discovery   Tel/Fax:(010)82626611-6626,82624938   E-mail:jishu@mail.las.ac.cn