Clustering Technology Topics Based on Patent Multi-Attribute Fusion
Liu Xiaoling1, Tan Zongying1,2
1. National Science Library, Chinese Academy of Sciences, Beijing 100190, China
2. Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
[Objective] Reasonable, effective, and accurate division of technology topics is of great significance. This article fuses multiple patent attributes to improve the division of technology topics. [Methods] First, we constructed a patent text vector, a patent citation vector, and a patent classification vector from the text content, citation relationships, and classification information of the patents. Then, we fused the three vectors into a new multi-attribute patent vector. Finally, we identified technology topics by clustering the patents. [Results] Compared with patent vector representations based on one or two attributes, our method achieved higher patent classification precision, recall, and F1 at different IPC classification levels and sample sizes. Its measurement of patent similarity was also more accurate. [Limitations] We evaluated the effect of technology topic division indirectly, through automatic patent classification, rather than with direct methods. [Conclusions] The proposed method improves the accuracy of patent similarity measurement and technology topic division.
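The [Methods] steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the three vectors are built or fused, so the toy matrices, the `fuse_attributes` helper, its `weights` parameter, and the choice of weighted concatenation followed by a plain k-means are all hypothetical.

```python
import numpy as np


def l2_normalize(matrix):
    """Scale each row to unit length; all-zero rows are left as zeros."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix / np.where(norms == 0, 1.0, norms)


def fuse_attributes(text_vecs, citation_vecs, class_vecs, weights=(1.0, 1.0, 1.0)):
    """Hypothetical fusion: weighted concatenation of row-normalized blocks.

    ASSUMPTION: the fusion operator is not given in the abstract;
    concatenation is only one simple possibility.
    """
    blocks = [
        w * l2_normalize(np.asarray(m, dtype=float))
        for w, m in zip(weights, (text_vecs, citation_vecs, class_vecs))
    ]
    return l2_normalize(np.hstack(blocks))


def kmeans(vectors, k, n_iter=50):
    """Plain Lloyd's k-means, a stand-in for the article's clustering step."""
    # Deterministic farthest-point initialization: start from the first row,
    # then repeatedly add the row farthest from the centers chosen so far.
    centers = [vectors[0]]
    for _ in range(1, k):
        nearest = np.min(
            [np.linalg.norm(vectors - c, axis=1) for c in centers], axis=0
        )
        centers.append(vectors[nearest.argmax()])
    centers = np.array(centers)

    for _ in range(n_iter):
        # Assign every patent to its closest center.
        dists = np.linalg.norm(vectors[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; an empty cluster keeps its previous center.
        new_centers = np.array([
            vectors[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Toy data for four patents (invented for illustration only).
text_vecs = [[1, 0, 1, 0], [1, 0, 0.5, 0], [0, 1, 0, 1], [0, 1, 0, 0.5]]  # term weights
citation_vecs = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 0, 0, 1]]  # cited patents
class_vecs = [[1, 0], [1, 0], [0, 1], [0, 1]]                             # one-hot IPC codes

fused = fuse_attributes(text_vecs, citation_vecs, class_vecs)
similarity = fused @ fused.T      # cosine similarity, since rows are unit length
labels = kmeans(fused, k=2)       # each cluster is read as one technology topic
print(labels)
```

With real data, the three input matrices would be derived from the patents' text, citations, and classification codes; changing `weights` adjusts how much each attribute contributes to the fused similarity.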
Liu Xiaoling, Tan Zongying. Clustering Technology Topics Based on Patent Multi-Attribute Fusion[J]. Data Analysis and Knowledge Discovery, 2022, 6(2/3): 45-54.