New Technology of Library and Information Service  2015, Vol. 31 Issue (1): 66-74    DOI: 10.11925/infotech.1003-3513.2015.01.10
Study on Automatic Classification of Patents Oriented to TRIZ
Hu Zhengyin1,2, Fang Shu1, Wen Yi1, Zhang Xian1,2, Liang Tian1
1. Chengdu Document and Information Center, Chinese Academy of Sciences, Chengdu 610041, China;
2. University of Chinese Academy of Sciences, Beijing 100049, China
Abstract  

[Objective] This paper proposes an approach for automatically classifying patents for TRIZ applications, based on a personalized classification system. [Methods] A personalized classification system is constructed at three levels (micro, meso, and macro) using a topic model. Appropriate features and a classifier are then chosen to perform a preliminary classification of the patents. The classifier is optimized by smoothing imbalanced data and reducing feature dimensionality. [Results] The approach supports semi-automatic construction of a personalized classification system and automatic classification of patents for TRIZ applications. On a medium-sized data set, it classifies patents with an F-measure of 90.2%. [Limitations] The approach is not applicable to small data sets and has not been verified on large data sets. [Conclusions] The proposed approach can classify patents for TRIZ applications on medium-sized data sets.
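
For readers unfamiliar with the F-measure reported above, the following is a minimal sketch of how it is commonly computed from precision and recall. It is not the authors' evaluation code: the class labels (`problem`, `solution`, `other`), the toy predictions, and the choice of macro-averaging are all assumptions made for illustration only.

```python
def f_measure(y_true, y_pred, positive, beta=1.0):
    """Return (precision, recall, F_beta) for a single class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


def macro_f_measure(y_true, y_pred, beta=1.0):
    """Average the per-class F-measure over every label that occurs."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = [f_measure(y_true, y_pred, label, beta)[2] for label in labels]
    return sum(scores) / len(scores)


# Hypothetical gold labels and classifier output for six patents.
y_true = ["problem", "problem", "solution", "solution", "solution", "other"]
y_pred = ["problem", "solution", "solution", "solution", "other", "other"]

macro_f1 = macro_f_measure(y_true, y_pred)
```

With `beta=1.0` the score is the harmonic mean of precision and recall. Macro-averaging weights every class equally, which matters when class sizes are imbalanced, as the abstract notes they are here.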

Key words: TRIZ; Topic model; Patent classification; Personalized classification system
Received: 23 July 2014      Published: 12 February 2015
CLC number: G353.1

Cite this article:

Hu Zhengyin, Fang Shu, Wen Yi, Zhang Xian, Liang Tian. Study on Automatic Classification of Patents Oriented to TRIZ. New Technology of Library and Information Service, 2015, 31(1): 66-74.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2015.01.10     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2015/V31/I1/66

