Study on Construction of Domain Terminology Taxonomic Relation
Hui Zhu, Jianlin Yang, Hao Wang
School of Information Management, Nanjing University, Nanjing 210023, China; Jiangsu Key Laboratory of Data Engineering and Knowledge Services, Nanjing 210023, China
[Objective] This study examines how to derive taxonomic relations among terms from unstructured Chinese domain text. [Methods] Using Digital Library domain literature from CNKI, we build a term hierarchy in four steps: term extraction, construction of a term Vector Space Model, BIRCH clustering, and assignment of cluster tags. [Results] We obtain the taxonomic relations among terms in the Digital Library domain and evaluate their quality. Clustering accuracy reaches up to 80.88%, and cluster tag extraction accuracy reaches up to 89.71%. [Limitations] Effectiveness was evaluated by random sampling, and the approach was compared against only one other method. [Conclusions] For constructing terminology taxonomic relations, the BIRCH algorithm shows a clear advantage over K-means clustering, with higher execution efficiency and clustering effectiveness.
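The clustering and tagging steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it keeps only the clustering-feature (CF) summary that BIRCH is built on and inserts terms into a flat list of CF entries rather than a full CF-tree. The term vectors, the threshold value, the names `CF`, `birch_flat` and `cluster_tag`, and the rule of tagging a cluster with the member term nearest its centroid are all assumptions made for illustration.

```python
import math


class CF:
    """Clustering feature (N, LS, SS) summarising a set of term vectors."""

    def __init__(self, dim):
        self.n = 0                # number of vectors absorbed
        self.ls = [0.0] * dim     # linear sum per dimension
        self.ss = 0.0             # sum of squared norms
        self.members = []         # term names, kept only for illustration

    def add(self, term, vec):
        self.n += 1
        self.ls = [a + b for a, b in zip(self.ls, vec)]
        self.ss += sum(x * x for x in vec)
        self.members.append(term)

    def centroid(self):
        return [x / self.n for x in self.ls]

    def radius_if_added(self, vec):
        """Cluster radius after a hypothetical insert: R^2 = SS/N - ||LS/N||^2."""
        n = self.n + 1
        ls = [a + b for a, b in zip(self.ls, vec)]
        ss = self.ss + sum(x * x for x in vec)
        r2 = ss / n - sum((x / n) ** 2 for x in ls)
        return math.sqrt(max(r2, 0.0))


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def birch_flat(term_vectors, threshold):
    """Single-level simplification: absorb each term into the nearest CF
    if the resulting radius stays within the threshold, else open a new CF."""
    clusters = []
    for term, vec in term_vectors.items():
        if clusters:
            best = min(clusters, key=lambda c: dist(c.centroid(), vec))
            if best.radius_if_added(vec) <= threshold:
                best.add(term, vec)
                continue
        cf = CF(len(vec))
        cf.add(term, vec)
        clusters.append(cf)
    return clusters


def cluster_tag(cf, term_vectors):
    """Assumed tagging rule: the member term closest to the centroid."""
    c = cf.centroid()
    return min(cf.members, key=lambda t: dist(term_vectors[t], c))


# Hypothetical 4-dimensional term vectors (e.g. weighted context features).
term_vectors = {
    "digital library": [1.0, 0.9, 0.0, 0.1],
    "digital resource": [0.9, 1.0, 0.1, 0.0],
    "metadata": [0.0, 0.1, 1.0, 0.9],
    "Dublin Core": [0.1, 0.0, 0.9, 1.0],
}

clusters = birch_flat(term_vectors, threshold=0.3)
for cf in clusters:
    print(cluster_tag(cf, term_vectors), "->", cf.members)
```

Because each CF stores only a count, a linear sum and a squared sum, a term can be absorbed in a single pass without revisiting earlier vectors, which is the property usually credited for the efficiency of BIRCH relative to iterative methods such as K-means. Applying the same procedure again to the cluster centroids would yield a further level of the hierarchy.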
Hui Zhu, Jianlin Yang, Hao Wang. Study on Construction of Domain Terminology Taxonomic Relation[J]. New Technology of Library and Information Service (现代图书情报技术), 2016, 32(1): 73-80.