[Objective] This paper proposes a method for discovering collaboration opportunities from emerging issues. [Methods] We took a corpus of deep learning literature as the research object. First, we used the LDA topic model to explore the intrinsic characteristics of this literature. Then, we calculated the topic weights and built a topic co-occurrence network with topics as nodes. Finally, we applied link prediction to the network to find potential collaboration opportunities. [Results] The best-performing link prediction index for the deep learning topic co-occurrence network was the Adamic-Adar (AA) index. Research on big data analysis in deep learning was more likely to become associated with biomedical studies and with the improvement of related algorithms. [Limitations] Link prediction performed poorly on poorly connected networks. [Conclusions] Combining the LDA topic model with link prediction can help find new collaboration opportunities from emerging issues.
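The last two steps of the method (building a topic co-occurrence network and scoring unlinked topic pairs with the AA index) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the topic labels, and the toy document-topic sets are all hypothetical, and it assumes each document has already been reduced to the set of LDA topics whose weight passes some threshold. The AA score of two topics is the sum of 1 / log(degree) over their common neighbors.

```python
import math
from itertools import combinations


def build_cooccurrence(doc_topics):
    """Link two topics whenever they appear together in at least one document."""
    neighbors = {}
    for topics in doc_topics:
        for t in topics:
            neighbors.setdefault(t, set())
        for a, b in combinations(sorted(set(topics)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def adamic_adar(neighbors):
    """Rank unlinked topic pairs by the sum of 1/log(degree) over common neighbors."""
    scores = {}
    for x, y in combinations(sorted(neighbors), 2):
        if y in neighbors[x]:
            continue  # already linked, so not a prediction candidate
        common = neighbors[x] & neighbors[y]
        if common:
            # A common neighbor has degree >= 2, so log(degree) > 0.
            scores[(x, y)] = sum(1.0 / math.log(len(neighbors[z])) for z in common)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy input: the dominant topics of four documents (not the paper's data).
doc_topics = [
    {"big_data", "algorithms"},
    {"algorithms", "biomedicine"},
    {"algorithms", "image_recognition"},
    {"big_data", "image_recognition"},
]

ranked = adamic_adar(build_cooccurrence(doc_topics))
for pair, score in ranked:
    print(pair, round(score, 4))
```

Pairs with higher scores share more low-degree neighbors and are treated as the more likely future associations. In a sparse network most pairs share no neighbors and receive no score at all, which is consistent with the limitation noted in the abstract.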
Junwan Liu, Zhixin Long, Feifei Wang. Finding Collaboration Opportunities from Emerging Issues with LDA Topic Model and Link Prediction[J]. Data Analysis and Knowledge Discovery, 2019, 3(1): 104-117.