New Technology of Library and Information Service  2016, Vol. 32 Issue (1): 73-80    DOI: 10.11925/infotech.1003-3513.2016.01.11
Original Article
Study on Construction of Domain Terminology Taxonomic Relation
Hui Zhu, Jianlin Yang, Hao Wang
School of Information Management, Nanjing University, Nanjing 210023, China
Jiangsu Key Laboratory of Data Engineering and Knowledge Services, Nanjing 210023, China
Abstract  

[Objective] This study examines how to derive taxonomic relations among terms from unstructured Chinese domain text. [Methods] Using Digital Library domain texts from CNKI, the authors build a terminology hierarchy in four steps: terminology extraction, construction of a terminology Vector Space Model, BIRCH clustering, and cluster tag assignment. [Results] The taxonomic relations among Digital Library domain terms are obtained and their quality is evaluated. Clustering accuracy reaches up to 80.88%, and cluster tag extraction accuracy reaches up to 89.71%. [Limitations] Effectiveness is evaluated by random sampling only, and the method is compared with only one alternative. [Conclusions] For constructing terminology taxonomic relations, the BIRCH algorithm shows a clear advantage over K-means clustering, with higher execution efficiency and better clustering results.
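
The pipeline named in the Methods (term vectors, BIRCH clustering, cluster tags) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the text is already segmented into tokens, represents each term by its normalized frequency across documents, and keeps only the clustering-feature (CF) absorption step of BIRCH as a flat list, omitting the CF tree, the branching factor, and the global refinement phase. The tagging rule (the member term nearest the centroid), the threshold value, the toy documents, and all function names are assumptions made for illustration.

```python
import math


def term_vectors(docs, terms):
    """Map each term to its L2-normalized per-document frequency vector."""
    vectors = {}
    for term in terms:
        counts = [doc.count(term) for doc in docs]
        norm = math.sqrt(sum(c * c for c in counts))
        if norm > 0:  # skip terms that never occur
            vectors[term] = [c / norm for c in counts]
    return vectors


class ClusteringFeature:
    """BIRCH summary of a sub-cluster: count N, linear sum LS, squared sum SS."""

    def __init__(self, dim):
        self.n = 0
        self.ls = [0.0] * dim
        self.ss = 0.0
        self.members = []

    def centroid(self):
        return [x / self.n for x in self.ls]

    def radius_if_added(self, vec):
        """Radius after a hypothetical insert: sqrt(SS/N - ||LS/N||^2)."""
        n = self.n + 1
        ls = [a + b for a, b in zip(self.ls, vec)]
        ss = self.ss + sum(v * v for v in vec)
        r2 = ss / n - sum((x / n) ** 2 for x in ls)
        return math.sqrt(max(r2, 0.0))

    def add(self, term, vec):
        self.n += 1
        self.ls = [a + b for a, b in zip(self.ls, vec)]
        self.ss += sum(v * v for v in vec)
        self.members.append(term)


def cf_cluster(vectors, threshold):
    """One pass: absorb each term into the nearest CF if the radius stays within threshold."""
    features = []
    for term, vec in vectors.items():
        nearest = min(features, key=lambda cf: math.dist(cf.centroid(), vec), default=None)
        if nearest is None or nearest.radius_if_added(vec) > threshold:
            nearest = ClusteringFeature(len(vec))  # start a new sub-cluster
            features.append(nearest)
        nearest.add(term, vec)
    return features


def cluster_tag(cf, vectors):
    """Assumed tagging rule: the member term closest to the cluster centroid."""
    center = cf.centroid()
    return min(cf.members, key=lambda t: math.dist(vectors[t], center))


# Toy, pre-tokenized documents (hypothetical data, not from the paper's corpus).
docs = [
    ["metadata", "dublin_core", "metadata"],
    ["metadata", "dublin_core"],
    ["retrieval", "query"],
    ["retrieval", "query", "query"],
]
vectors = term_vectors(docs, ["metadata", "dublin_core", "retrieval", "query"])
features = cf_cluster(vectors, threshold=0.3)
clusters = [cf.members for cf in features]
print(clusters)
```

In this sketch each cluster's tag would act as the parent term and the remaining members as its children, which is one simple way to read a taxonomic relation off a clustering. The threshold plays the role of BIRCH's T parameter: a smaller value yields more, tighter clusters.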

Key words: Terminology; Taxonomic relation; Ontology; Ontology learning; Clustering
Received: 19 June 2015      Published: 04 February 2016

Cite this article:

Hui Zhu, Jianlin Yang, Hao Wang. Study on Construction of Domain Terminology Taxonomic Relation. New Technology of Library and Information Service, 2016, 32(1): 73-80.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2016.01.11     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2016/V32/I1/73
