New Technology of Library and Information Service  2016, Vol. 32 Issue (7-8): 137-146    DOI: 10.11925/infotech.1003-3513.2016.07.17
Original Article
New Research and Application with Co-topics Network
Niu Liang
School of Economics & Management, China Jiliang University, Hangzhou 310018, China
Abstract  

[Objective] This paper builds a co-topics network to analyze the relationships among the topics of research articles and to optimize the terms that represent these topics. [Methods] First, we transformed the “document-topics” bipartite graph into a co-topics network using weighted projection rules. Second, we identified the key topics by combining betweenness centrality with topic probability. Third, we partitioned the co-topics network into communities with the GN algorithm. Finally, we optimized the topic terms with the relevance method. [Results] We compared the co-topics network with K-means clustering based on Jensen-Shannon divergence (JSD), testing the optimal topic number (28) and two arbitrarily chosen topic numbers (20 and 30). The two methods produced the same number of clusters, and the consistency of the cluster contents reached 100%, 95%, and 87%. [Limitations] We did not test other community partition methods on the proposed co-topics network. [Conclusions] The co-topics network meets the demands of high-dimensional data and identifies both the key topics and the closely linked topics of the target documents.
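
The first step in the methods, projecting the “document-topics” bipartite graph onto a topic-topic network, can be illustrated with a minimal sketch. The abstract does not state the exact weighting rule, so the rule used here (summing, over documents, the product of the two topics' probabilities) is an assumption made for illustration only. The function names `project_co_topics` and `weighted_degree` and the cutoff `min_prob` are likewise hypothetical.

```python
from itertools import combinations


def project_co_topics(doc_topics, min_prob=0.1):
    """Project a document-topic matrix onto a weighted topic-topic network.

    doc_topics: one row per document, each a list of topic probabilities
    (for example, the theta matrix of a fitted LDA model).
    min_prob: hypothetical cutoff; a topic counts as present in a document
    only if its probability is at least this value.

    Returns a dict mapping (i, j), with i < j, to an edge weight. The weight
    rule (sum over documents of the product of the two topic probabilities)
    is an assumption, not necessarily the rule used in the paper.
    """
    weights = {}
    for row in doc_topics:
        present = [k for k, p in enumerate(row) if p >= min_prob]
        for i, j in combinations(present, 2):
            weights[(i, j)] = weights.get((i, j), 0.0) + row[i] * row[j]
    return weights


def weighted_degree(weights, n_topics):
    """Sum of incident edge weights per topic (a simple strength measure)."""
    strength = [0.0] * n_topics
    for (i, j), w in weights.items():
        strength[i] += w
        strength[j] += w
    return strength


# Toy data: 3 documents over 3 topics (each row sums to 1).
theta = [
    [0.5, 0.5, 0.0],
    [0.5, 0.25, 0.25],
    [0.0, 0.5, 0.5],
]
edges = project_co_topics(theta, min_prob=0.1)
strength = weighted_degree(edges, n_topics=3)
print(edges)
print(strength)
```

In this toy example, topics 0 and 1 co-occur most strongly because they share two documents with high probability. The resulting edge dictionary could then be loaded into a graph library to compute betweenness centrality and run Girvan-Newman (GN) community detection; networkx, for instance, provides `betweenness_centrality` and `girvan_newman`, although the abstract does not say which tools were used.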

Key words: Co-topics network      LDA      Community partition      K-means
Received: 09 March 2016      Published: 29 September 2016

Cite this article:

Niu Liang. New Research and Application with Co-topics Network. New Technology of Library and Information Service, 2016, 32(7-8): 137-146.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2016.07.17     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2016/V32/I7-8/137
