Abstract: This paper proposes several methods for extracting definitions of concepts in the comprehensive OntoThesaurus. The methods perform well in experiments and have been applied in the actual OTCSS. Among them is an integrated algorithm named “two-dimensional relative quantity”, which builds on the “high-frequency words vector” and “TF*IDF vector” methods. This algorithm effectively selects the better results from the output of those two methods, and the improvement ratio of effective information generally reaches 60%.
田金凤, 曾新红, 黄华军, 林伟明. 中文叙词表本体概念定义注释的自动构建研究[J]. 现代图书情报技术, 2011, (11): 9-16.
Tian Jinfeng, Zeng Xinhong, Huang Huajun, Lin Weiming. Research on Automatic Construction of Definition Notes for Concepts in OntoThesaurus. New Technology of Library and Information Service, 2011, (11): 9-16.