New Technology of Library and Information Service (现代图书情报技术)  2014, Vol. 30 Issue (6): 62-70     https://doi.org/10.11925/infotech.1003-3513.2014.06.07
Information Analysis and Research
Review on Text-based Patent Technology Mining
Hu Zhengyin1,2, Fang Shu1
1. Chengdu Library, Chinese Academy of Sciences, Chengdu 610041, China;
2. University of Chinese Academy of Sciences, Beijing 100049, China
Abstract

[Objective] This paper summarizes the general workflow of text-based patent technology mining, identifies its key techniques, and analyzes typical mining scenarios. [Coverage] The Elsevier, Springer and CNKI databases were searched with keywords such as "patent mining" and "patent analysis", and the proceedings of the Global TechMining Conference were consulted. In total, 105 related papers were read, of which 66 are cited. [Methods] The paper reviews the current state and trends of patent knowledge representation, the key technique of patent technology mining, analyzes research progress in three typical technology mining scenarios, and from this derives the future trends and research hot spots of the field. [Results] The granularity and structure of patent knowledge representation determine the depth, breadth and dimensions of patent technology mining. Patent technology mining that is oriented to technical problems and solutions and built on SAO (Subject-Action-Object) basic semantic units is likely to become a future trend and research hot spot. [Limitations] The paper discusses only how existing text mining, statistical analysis and natural language processing techniques are applied in patent technology mining, and pays insufficient attention to the development trends of these techniques themselves. [Conclusions] The paper gives an overview of patent technology mining, the key techniques involved, and the main application scenarios.

Key words: Patent technology mining; Semantic knowledge representation; Topic clustering; Patent classification; Technology evolution
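Received: 2013-12-03      Published: 2014-07-09
CLC number: G353.1
Funding:

This work is one of the research outcomes of the Chinese Academy of Sciences "West Light" program (西部之光) project "Research and Practice of an Ontology-based Patent Document Technology Mining System".

Corresponding author: Hu Zhengyin     E-mail: huzy@clas.ac.cn
Author contributions: Hu Zhengyin carried out the research, including the literature survey, analysis and writing of the paper; Fang Shu proposed and designed the research topic and revised the paper.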
Cite this article:
Hu Zhengyin, Fang Shu. Review on Text-based Patent Technology Mining[J]. New Technology of Library and Information Service, 2014, 30(6): 62-70.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2014.06.07      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2014/V30/I6/62

