[Objective] This paper proposes an algorithm that combines LDA and decision tree models to identify potential patent collaboration opportunities, with the aim of strengthening cross-region innovation. [Methods] First, we retrieved 22 855 patents filed by higher education institutions in Guangdong Province and Wuhan City from the incoPat database. Second, we used LDA to extract and cluster patent topics. Third, we constructed a decision tree and adjusted its decision boundaries to identify the most promising potential cooperative relationships. Finally, we selected the optimal data mining strategy based on the effective size of the inventors' network and used it to identify and recommend cooperative relationships. [Results] We found 18 pairs of potential cross-region partners in the top four patent categories of the data set, a result much better than that of the link prediction method. [Limitations] The coverage of the patent data needs to be expanded. More research is also needed on how universities and industry affect the innovation ecology. [Conclusions] The proposed method can identify potential cross-region partners for patent collaboration and innovation.
Chen Hao, Zhang Mengyi, Cheng Xiufeng. Identifying Cross-Region Patent Collaboration Opportunities Using LDA and Decision Trees: Case Study of Universities from Guangdong and Wuhan[J]. Data Analysis and Knowledge Discovery, 2021, 5(10): 37-50.
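The abstract selects a mining strategy by the effective size of the inventors' network. As a rough illustration of that measure only, the sketch below computes Burt's effective size for an unweighted, undirected ego network using the common simplification n - 2t/n, where n is the number of an inventor's direct collaborators and t is the number of ties among those collaborators. The function name `effective_size`, the toy `coinventors` network, and the choice to return 0.0 for an isolated inventor are assumptions made for this example; the paper's own implementation and weighting scheme may differ.

```python
from itertools import combinations


def effective_size(adjacency, ego):
    """Effective size of ego's network in an unweighted, undirected graph.

    Uses the simplification n - 2t/n, where n is the number of alters
    (direct collaborators) and t is the number of ties among those alters.
    Returning 0.0 for an isolated node is a convention chosen for this sketch.
    """
    alters = adjacency.get(ego, set()) - {ego}
    n = len(alters)
    if n == 0:
        return 0.0
    # Count each tie between two alters exactly once.
    t = sum(
        1
        for a, b in combinations(sorted(alters), 2)
        if b in adjacency.get(a, set())
    )
    return n - 2.0 * t / n


# Hypothetical co-inventor network: each key maps to its set of co-inventors.
coinventors = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

for inventor in sorted(coinventors):
    print(inventor, round(effective_size(coinventors, inventor), 3))
```

In this toy network, inventor A has three collaborators, two of whom also work with each other, so part of A's network is redundant and the effective size falls below three. A larger effective size indicates more non-redundant contacts, which is the property the abstract relies on when choosing among strategies.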