Study on Construction of Domain Terminology Taxonomic Relation
Hui Zhu, Jianlin Yang, Hao Wang
School of Information Management, Nanjing University, Nanjing 210023, China; Jiangsu Key Laboratory of Data Engineering and Knowledge Services, Nanjing 210023, China
Abstract [Objective] This study examines how to obtain taxonomic relations among terms from unstructured Chinese domain text. [Methods] Using Digital Library domain texts from CNKI, we construct a term hierarchy through term extraction, construction of a term Vector Space Model, BIRCH clustering, and cluster tag assignment. [Results] We obtain the terminology taxonomic relations of the Digital Library domain and evaluate their effectiveness: clustering accuracy reaches up to 80.88%, and cluster tag extraction accuracy reaches up to 89.71%. [Limitations] Effectiveness is evaluated by random sampling and is compared against only one other method. [Conclusions] For constructing terminology taxonomic relations, the BIRCH algorithm shows a clear advantage over the K-means clustering method, performing better in both execution and clustering effectiveness.
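The vector-space and clustering steps named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the toy English tokens, the term-by-document weighting, the threshold of 0.5, and all function and class names are assumptions made for illustration. It keeps only a flat list of clustering features (the N, LS, SS summaries that BIRCH maintains) rather than a full CF-tree, and it omits the hierarchy and cluster tagging steps.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def term_vectors(docs, terms):
    """Toy vector space model: one dimension per document, valued by term count."""
    return {t: normalize([doc.count(t) for doc in docs]) for t in terms}


class ClusteringFeature:
    """BIRCH-style summary of a subcluster: count N, linear sum LS, square sum SS."""

    def __init__(self, dim):
        self.n = 0
        self.ls = [0.0] * dim
        self.ss = 0.0
        self.members = []  # kept only so the toy result is readable

    def add(self, name, vec):
        # CF entries are additive, so absorbing a point is a constant-time update.
        self.n += 1
        self.ls = [a + b for a, b in zip(self.ls, vec)]
        self.ss += sum(x * x for x in vec)
        self.members.append(name)

    def centroid(self):
        return [x / self.n for x in self.ls]

    def radius_if_added(self, vec):
        """Radius after absorbing vec, using R^2 = SS/N - ||LS/N||^2."""
        n = self.n + 1
        ls = [a + b for a, b in zip(self.ls, vec)]
        ss = self.ss + sum(x * x for x in vec)
        r2 = ss / n - sum((x / n) ** 2 for x in ls)
        return math.sqrt(max(r2, 0.0))


def cf_cluster(vectors, threshold):
    """Single pass: absorb each term into the nearest CF if the radius stays small."""
    clusters = []
    for name, vec in vectors.items():
        nearest = min(
            clusters,
            key=lambda cf: math.dist(cf.centroid(), vec),
            default=None,
        )
        if nearest is not None and nearest.radius_if_added(vec) <= threshold:
            nearest.add(name, vec)
        else:
            cf = ClusteringFeature(len(vec))
            cf.add(name, vec)
            clusters.append(cf)
    return clusters


# Hypothetical tokenized documents; real input would be segmented Chinese text.
docs = [
    ["metadata", "cataloging", "metadata"],
    ["metadata", "cataloging"],
    ["retrieval", "ranking"],
    ["retrieval", "ranking", "retrieval"],
]
terms = ["metadata", "cataloging", "retrieval", "ranking"]

clusters = cf_cluster(term_vectors(docs, terms), threshold=0.5)
print([cf.members for cf in clusters])
# [['metadata', 'cataloging'], ['retrieval', 'ranking']]
```

Because each subcluster is summarized by three additive quantities, a new term can be tested and absorbed without revisiting earlier terms. This single-pass property is commonly cited as the reason BIRCH scales well, whereas K-means reassigns every point on each iteration.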
Received: 19 June 2015
Published: 04 February 2016