Analyzing Sci-Tech Topics Based on Semantic Representation of Patent References
Jinzhu Zhang1,2, Yue Wang1, Yiming Hu1
1 School of Economics and Management, Nanjing University of Science and Technology, Nanjing 210094, China
2 Jiangsu Collaborative Innovation Center of Social Safety Science and Technology, Nanjing 210094, China
[Objective] This paper explores a content mining method for scientific references in patents (SRPs) based on semantic text representation, aiming to improve the accuracy, comprehensiveness and interpretability of knowledge flow analysis. [Methods] First, we extracted keywords and abstracts to represent the patents and SRPs and created vectors for these items. Then, we computed the distance between vectors to measure their semantic similarity. Finally, we obtained and mapped the topics of patents and SRP contents in the field of nanotechnology. [Results] The method effectively mapped the relationships among sci-tech topics from the content perspective. [Limitations] This was an exploratory study that used only abstracts and keywords rather than full texts. [Conclusions] The proposed method improves the knowledge flow analysis of patents.
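The vector comparison step in [Methods] can be illustrated with a minimal sketch. The abstract does not name the embedding model or the distance measure, so the following is an assumption throughout: term-count vectors stand in for learned semantic vectors, cosine similarity stands in for the distance computation, and the three short texts are hypothetical examples rather than data from the study.

```python
import math
from collections import Counter


def embed(text, vocabulary):
    """Toy stand-in for a learned embedding: term counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[term]) for term in vocabulary]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical texts: one patent abstract and two candidate SRP abstracts.
patent = "carbon nanotube field emission display"
srp_a = "field emission from carbon nanotube films"
srp_b = "protein folding simulation in solvent"

# Shared vocabulary so that all vectors have the same dimensions.
vocabulary = sorted(set((patent + " " + srp_a + " " + srp_b).split()))

patent_vec = embed(patent, vocabulary)
sim_a = cosine_similarity(patent_vec, embed(srp_a, vocabulary))
sim_b = cosine_similarity(patent_vec, embed(srp_b, vocabulary))

print(f"patent vs srp_a: {sim_a:.3f}")
print(f"patent vs srp_b: {sim_b:.3f}")
```

In this sketch the SRP that shares vocabulary with the patent scores higher than the unrelated one. A learned representation would replace `embed` while leaving the comparison step unchanged.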