New Technology of Library and Information Service  2015, Vol. 31 Issue (4): 26-33    DOI: 10.11925/infotech.1003-3513.2015.04.04
A Domain Concepts Triple-layer Filter Method
Duan Yufeng1, Zhu Wenjing2
1 Institute of Quality Development Strategy, Wuhan University, Wuhan 430072, China;
2 School of Information Management, Wuhan University, Wuhan 430072, China
Abstract  

[Objective] To improve the efficiency of concept filtering by applying three concept filters to thesaurus and text data. [Methods] This paper proposes a triple-layer filter method for domain concepts. Candidate domain concepts are extracted from data sources comprising a thesaurus and text. The method focuses on calculating the concept properties and field properties of the candidates through concept correlation, concept context, and concept territoriality. [Results] Experimental results show that the triple-layer filter method reaches a precision of 74.71% and a recall of 71.25%. [Limitations] The data sources cover only the field of mapping; data from other fields are not used to demonstrate the feasibility of the method. [Conclusions] The method improves the precision and recall of domain concept filtering, and its overall efficiency is higher than that of other methods. The method could filter domain concepts from different subjects with high efficiency.
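
The abstract names the three layers but does not give their formulas, so the following is only a minimal sketch of how such a pipeline could be arranged, not the authors' implementation. It assumes two-token candidates and substitutes common stand-in measures: pointwise mutual information for concept correlation, left/right neighbour entropy for concept context, and a smoothed domain-to-general frequency ratio for concept territoriality. All function names, thresholds, and the toy corpora are hypothetical.

```python
import math
from collections import Counter


def bigram_counts(sentences):
    """Count unigrams and adjacent-token bigrams in tokenised sentences."""
    uni, bi = Counter(), Counter()
    for s in sentences:
        uni.update(s)
        bi.update(zip(s, s[1:]))
    return uni, bi


def pmi(pair, uni, bi):
    """Layer 1 stand-in: pointwise mutual information of a two-token candidate."""
    if bi[pair] == 0:
        return float("-inf")
    n_uni, n_bi = sum(uni.values()), sum(bi.values())
    p_xy = bi[pair] / n_bi
    return math.log2(p_xy / ((uni[pair[0]] / n_uni) * (uni[pair[1]] / n_uni)))


def entropy(counter):
    """Shannon entropy (bits) of a frequency table; 0.0 if it is empty."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in counter.values())


def context_entropy(pair, sentences):
    """Layer 2 stand-in: the smaller of left and right neighbour entropy."""
    left, right = Counter(), Counter()
    for s in sentences:
        for i in range(len(s) - 1):
            if (s[i], s[i + 1]) == pair:
                left[s[i - 1] if i > 0 else "<s>"] += 1
                right[s[i + 2] if i + 2 < len(s) else "</s>"] += 1
    return min(entropy(left), entropy(right))


def domain_ratio(pair, bi_domain, bi_general):
    """Layer 3 stand-in: add-one smoothed domain/general relative frequency."""
    p_dom = (bi_domain[pair] + 1) / (sum(bi_domain.values()) + 1)
    p_gen = (bi_general[pair] + 1) / (sum(bi_general.values()) + 1)
    return p_dom / p_gen


def triple_layer_filter(candidates, domain, general,
                        min_pmi=0.0, min_entropy=0.5, min_ratio=1.5):
    """Apply the three filters in sequence; thresholds are illustrative only."""
    uni, bi = bigram_counts(domain)
    _, bi_general = bigram_counts(general)
    kept = [c for c in candidates if pmi(c, uni, bi) > min_pmi]
    kept = [c for c in kept if context_entropy(c, domain) >= min_entropy]
    return [c for c in kept if domain_ratio(c, bi, bi_general) >= min_ratio]


def precision_recall(predicted, gold):
    """Precision and recall of the predicted concepts against a gold set."""
    predicted, gold = set(predicted), set(gold)
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


# Toy data (hypothetical): a tiny mapping-domain corpus and a general corpus.
domain = [
    ["the", "contour", "line", "shows", "height"],
    ["a", "contour", "line", "marks", "slope"],
    ["each", "contour", "line", "is", "closed"],
    ["of", "the", "map"],
    ["of", "the", "line"],
]
general = [["of", "the", "day"], ["of", "the", "year"], ["of", "the", "week"]]
candidates = [("contour", "line"), ("of", "the")]

kept = triple_layer_filter(candidates, domain, general)
scores = precision_recall(kept, [("contour", "line"), ("map", "scale")])
print(kept, scores)
```

Running the layers in sequence means each later, more expensive check sees fewer candidates; in this toy data the function-word pair is discarded at the context layer because it always starts a sentence. Precision and recall figures such as those reported in the abstract would come from comparing the surviving candidates with a manually verified concept list, as `precision_recall` does.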

Key words: Triple-layer concepts filter; Concepts correlation; Concepts context; Concepts territoriality; Thesaurus
Received: 08 October 2014      Published: 21 May 2015
CLC number: TP391

Cite this article:

Duan Yufeng, Zhu Wenjing, Chen Qiao, Liu Wei, Liu Fenghong. A Domain Concepts Triple-layer Filter Method. New Technology of Library and Information Service, 2015, 31(4): 26-33.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2015.04.04     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2015/V31/I4/26

