[Objective] This paper reviews research on patent infringement detection, aiming to outline theoretical frameworks and development trends for future studies. [Coverage] We retrieved 53 representative publications from CNKI and Bing Scholar using the keywords “Patent Infringement” or “Patent Similarity”. [Methods] First, we summarized methods for detecting patent infringement based on clustering, the vector space model, the SAO (Subject-Action-Object) structure, deep learning, and patent structure. Then, we compared the advantages and disadvantages of these popular methods. Finally, we explored possible optimizations of the existing methods. [Results] Patent infringement detection aims to retrieve, from a large number of patent documents, a small number of patents with a higher risk of infringement, which reduces the number of patents requiring manual judgment. Existing methods assess infringement risk by calculating the similarity between patents based on statistical information at different granularities. [Limitations] Due to the lack of standard datasets, we could not quantitatively compare the methods for detecting patent infringement. [Conclusions] Patent infringement detection could be improved by using pre-trained models, calculating the similarity of different patent components, and constructing high-quality datasets.
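As an illustration of the similarity-based retrieval idea summarized above, the following is a minimal sketch of a vector space model that ranks candidate patents against a query patent by cosine similarity over TF-IDF weights. It is not taken from any of the reviewed methods: the function names (`tfidf_vectors`, `cosine`, `rank_candidates`), the whitespace tokenization, the smoothed IDF formula, and the toy texts are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (term -> weight dicts) for a list of texts.

    Assumption: plain lowercase whitespace tokenization; real patent text
    would need proper word segmentation and stop-word handling.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF keeps every weight positive, even for ubiquitous terms.
        vectors.append(
            {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(query, corpus, top_k=2):
    """Return the top_k (corpus_index, similarity) pairs, most similar first."""
    vectors = tfidf_vectors([query] + corpus)
    query_vec, corpus_vecs = vectors[0], vectors[1:]
    scored = [(cosine(query_vec, vec), idx) for idx, vec in enumerate(corpus_vecs)]
    # Sort by descending similarity, breaking ties by corpus order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(idx, score) for score, idx in scored[:top_k]]


if __name__ == "__main__":
    # Hypothetical toy texts standing in for patent claims.
    corpus = [
        "battery pack comprising a cooling plate and cells",
        "method for brewing coffee with a filter",
        "a cooling plate for a battery module",
    ]
    query = "a battery pack with a cooling plate"
    for idx, score in rank_candidates(query, corpus):
        print(idx, round(score, 3))
```

In a setting like the one described, the returned short list would be the small set of higher-risk patents passed on for manual judgment; the cutoff `top_k` is an illustrative choice, not a recommended value.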