Data Analysis and Knowledge Discovery  2019, Vol. 3 Issue (1): 104-117    DOI: 10.11925/infotech.2096-3467.2018.0394
Finding Collaboration Opportunities from Emerging Issues with LDA Topic Model and Link Prediction
Junwan Liu, Zhixin Long, Feifei Wang
School of Economics and Management, Beijing University of Technology, Beijing 100022, China

[Objective] This paper proposes a method for discovering collaboration opportunities among emerging issues. [Methods] We took the deep learning literature as the research corpus. First, we used the LDA topic model to extract the latent topics of this literature. Then, we calculated the topic weights and built a topic co-occurrence network with topics as nodes. Finally, we applied link prediction to the network to identify potential collaboration opportunities. [Results] The best-performing link prediction index for the deep learning topic co-occurrence network was AA (Adamic-Adar). Big data analysis research in deep learning was more likely to be associated with biomedical studies and with the improvement of related algorithms. [Limitations] Link prediction performed poorly on poorly connected networks. [Conclusions] The LDA topic model combined with link prediction can help find new collaboration opportunities among emerging issues.
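
The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the document-topic weights below are made-up stand-ins for LDA output, the presence threshold of 0.2 is an assumption, and the function names (build_cooccurrence, adamic_adar, rank_missing_links) are hypothetical. The sketch treats topics as nodes, links two topics when they co-occur in a document, and scores each unlinked topic pair with the Adamic-Adar (AA) index, which sums 1 / log(degree) over the pair's common neighbours.

```python
import math
from itertools import combinations

# Hypothetical document-topic weights standing in for LDA output.
# Each dict is one document; each value is a topic's weight in that document.
doc_topics = [
    {"T0": 0.6, "T1": 0.4},
    {"T0": 0.5, "T2": 0.5},
    {"T1": 0.5, "T3": 0.5},
    {"T2": 0.7, "T3": 0.3},
    {"T0": 0.4, "T4": 0.6},
]

THRESHOLD = 0.2  # assumed cutoff for counting a topic as present in a document


def build_cooccurrence(docs, threshold):
    """Adjacency map: topics are nodes, co-occurrence in a document is an edge."""
    adj = {}
    for weights in docs:
        present = sorted(t for t, w in weights.items() if w >= threshold)
        for t in present:
            adj.setdefault(t, set())
        for a, b in combinations(present, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def adamic_adar(adj, x, y):
    """AA score: sum of 1 / log(degree(z)) over common neighbours z of x and y.

    A common neighbour is linked to both x and y, so its degree is at least 2
    and the logarithm is strictly positive.
    """
    return sum(1.0 / math.log(len(adj[z])) for z in adj[x] & adj[y])


def rank_missing_links(adj):
    """Score every unlinked topic pair and sort from most to least likely."""
    scores = [
        ((x, y), adamic_adar(adj, x, y))
        for x, y in combinations(sorted(adj), 2)
        if y not in adj[x]
    ]
    return sorted(scores, key=lambda item: (-item[1], item[0]))


network = build_cooccurrence(doc_topics, THRESHOLD)
for pair, score in rank_missing_links(network):
    print(pair, round(score, 3))
```

Pairs at the top of the ranking are the candidate topic associations. In this toy network, a topic with only one neighbour contributes little to any score, which is consistent with the stated limitation that link prediction struggles on poorly connected networks.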

Key words: Emerging Topic Association; LDA Topic Model; Link Prediction
Received: 09 April 2018      Published: 04 March 2019

Cite this article:

Junwan Liu, Zhixin Long, Feifei Wang. Finding Collaboration Opportunities from Emerging Issues with LDA Topic Model and Link Prediction. Data Analysis and Knowledge Discovery, 2019, 3(1): 104-117.

