Study on Ontology Relation Extraction in Chinese Patent Documents
Gu Jun¹, Xu Xin²
1. Baoshan Iron and Steel Co., Ltd., Shanghai 201900, China; 2. Department of Informatics, Business School, East China Normal University, Shanghai 200241, China
Abstract: This paper proposes a method for extracting non-taxonomic relations from the text of Chinese patents. First, it analyzes the syntax of patent abstracts and constructs sub-sentence extraction rules for four sentence types: domain sentences, characteristic sentences, module and craft sentences, and effect sentences. Second, the terms in the extracted sub-sentences are manually labeled with BIEO tags to create a training data set of a certain size. Third, conditional random fields (CRFs) are trained on this data set and then used to extract relations from new data. Finally, the experimental results are analyzed and the validity of the method is verified.
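As an illustration of the labeling step, the following is a minimal sketch of BIEO tagging over an already segmented sub-sentence. It assumes the common reading of the scheme (B = first token of a term, I = inside, E = last token, O = outside any term) and that a single-token term receives only B; the abstract does not spell out either point. The function names, the English stand-in tokens, and the two-column tab-separated output (token, tag, with a blank line between sentences, as is typical for CRF toolkits) are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def bieo_tags(tokens: List[str], term_spans: List[Tuple[int, int]]) -> List[str]:
    """Assign B/I/E/O tags given half-open term spans [start, end).

    Assumption: a single-token term is tagged "B" only, since the
    scheme named in the abstract has no separate single-token label.
    """
    tags = ["O"] * len(tokens)
    for start, end in term_spans:
        if not 0 <= start < end <= len(tokens):
            raise ValueError(f"invalid term span: ({start}, {end})")
        tags[start] = "B"                      # first token of the term
        for i in range(start + 1, end - 1):
            tags[i] = "I"                      # tokens strictly inside the term
        if end - start > 1:
            tags[end - 1] = "E"                # last token of a multi-token term
    return tags


def to_training_lines(tokens: List[str], tags: List[str]) -> List[str]:
    """Render one sentence as 'token<TAB>tag' lines plus a blank separator."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    lines = [f"{tok}\t{tag}" for tok, tag in zip(tokens, tags)]
    lines.append("")                           # blank line ends the sentence
    return lines


# Hypothetical segmented sub-sentence with two manually marked terms.
tokens = ["a", "continuous", "casting", "mold", "cooling", "method"]
spans = [(1, 4), (4, 5)]                       # "continuous casting mold", "cooling"
tags = bieo_tags(tokens, spans)
training_lines = to_training_lines(tokens, tags)
```

In practice each line would likely carry additional feature columns (for example a part-of-speech tag from the word segmenter) between the token and the label; they are omitted here to keep the sketch short.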
谷俊, 许鑫. 中文专利中本体关系获取研究[J]. 现代图书情报技术, 2013, 29(10): 73-78.
Gu Jun, Xu Xin. Study on Ontology Relation Extraction in Chinese Patent Documents. New Technology of Library and Information Service, 2013, 29(10): 73-78.