New Technology of Library and Information Service  2011, Vol. Issue (11): 9-16    DOI: 10.11925/infotech.1003-3513.2011.11.02
Research on Automatic Construction of Definition Notes for Concepts in OntoThesaurus
Tian Jinfeng1, Zeng Xinhong1,2, Huang Huajun2, Lin Weiming2
1. College of Computer and Software, Shenzhen University, Shenzhen 518060, China;
2. Shenzhen University Library, Shenzhen 518060, China
Abstract  The paper proposes several methods for extracting definitions of concepts in the comprehensive OntoThesaurus. The methods perform well in experiments and have been applied in the deployed OTCSS. Among them is an integrated algorithm named “two-dimensional relative quantity”, built on the “high-frequency words vector” and the “TF*IDF vector”. This algorithm effectively selects the better results from those produced by the first two methods, and the improvement ratio of effective information generally reaches 60%.
Key words: OntoThesaurus; OTCSS; Definition extraction; VSM; High-frequency words vector; TF*IDF vector; Two-dimensional relative quantity
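The abstract names a “TF*IDF vector” within a vector space model (VSM) but this page does not give the formulas. As a rough illustration only, the sketch below builds generic TF*IDF vectors for tokenized candidate sentences and compares them by cosine similarity. It is not the authors' “two-dimensional relative quantity” algorithm; the function names, the weighting variant (relative term frequency times log(N/df)), and the sample sentences are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF*IDF vector (term -> weight) per tokenized document.

    Assumed weighting: (count / document length) * log(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero norm."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical, already-tokenized candidate sentences.
candidates = [
    ["ontology", "is", "a", "formal", "specification"],
    ["thesaurus", "is", "a", "controlled", "vocabulary"],
    ["ontology", "formal", "concept", "specification"],
]
vectors = tfidf_vectors(candidates)
similar = cosine(vectors[0], vectors[2])
dissimilar = cosine(vectors[0], vectors[1])
```

In this toy data the first and third sentences share more content terms than the first and second, so `similar` comes out larger than `dissimilar`. A real pipeline for Chinese text would also need word segmentation before tokens like these exist.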
Received: 22 September 2011      Published: 06 January 2012
CLC number: TP18 TP301.6

Cite this article:

Tian Jinfeng, Zeng Xinhong, Huang Huajun, Lin Weiming. Research on Automatic Construction of Definition Notes for Concepts in OntoThesaurus. New Technology of Library and Information Service, 2011, (11): 9-16.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2011.11.02     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2011/V/I11/9
