New Technology of Library and Information Service  2016, Vol. 32 Issue (7-8): 137-146    DOI: 10.11925/infotech.1003-3513.2016.07.17
Original Article
New Research and Application with Co-topics Network
Niu Liang
School of Economics & Management, China Jiliang University, Hangzhou 310018, China
Abstract  

[Objective] This paper builds a co-topics network to analyze the relationships among the topics of research articles and to optimize the terms that represent those topics. [Methods] First, we transformed the “document-topics” bipartite graph into a co-topics network using weighted projection rules. Second, we identified the key topics by combining betweenness centrality with topic probability. Third, we partitioned the co-topics network into communities with the GN (Girvan-Newman) algorithm. Finally, we optimized the topic terms with a relevance method. [Results] We compared the community partition of the co-topics network with K-means clustering based on JSD (Jensen-Shannon divergence), at the optimal topic number (28) and at two subjectively chosen topic numbers (20 and 30). Both methods produced the same number of clusters, and the consistency of the cluster contents reached 100%, 95% and 87%. [Limitations] We did not test other community partition methods on the proposed co-topics network. [Conclusions] The co-topics network copes with high-dimensional data and identifies both the key topics and the closely linked topics of the target documents.
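
The projection step and the JSD measure named in the abstract can be illustrated with a minimal Python sketch. The abstract does not spell out the paper's weighting rule, so the rule used here is an assumption: two topics are linked when both exceed a threshold in the same document, and the edge weight sums the product of their probabilities. The names `project_co_topics`, `jsd`, `min_prob` and the toy matrix `theta` are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def project_co_topics(doc_topic, min_prob=0.1):
    """Project a document-topic matrix onto a weighted topic-topic edge map.

    Assumed rule (not taken from the paper): two topics are linked when both
    reach `min_prob` in the same document, and the edge weight accumulates
    the product of their probabilities over all such documents.
    """
    edges = {}
    for row in doc_topic:
        # Topics that are present strongly enough in this document.
        active = [(t, p) for t, p in enumerate(row) if p >= min_prob]
        # Every pair of active topics co-occurs in this document.
        for (i, pi), (j, pj) in combinations(active, 2):
            edges[(i, j)] = edges.get((i, j), 0.0) + pi * pj
    return edges


def jsd(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    def kl(a, b):
        # Terms with a zero numerator contribute nothing to the sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Toy document-topic matrix: 3 documents, 3 topics (each row sums to 1).
theta = [
    [0.6, 0.4, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 0.3, 0.7],
]
network = project_co_topics(theta)
for (i, j), w in sorted(network.items()):
    print(f"topic {i} - topic {j}: weight {w:.2f}")
```

The resulting edge map could then be handed to a graph library for the betweenness centrality and GN community steps. Those steps are left out here, and `jsd` is shown only as the distance that a JSD-based K-means comparison would rely on.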

Key words: Co-topics network; LDA; Community partition; K-means
Received: 09 March 2016      Published: 29 September 2016

Cite this article:

Niu Liang. New Research and Application with Co-topics Network. New Technology of Library and Information Service, 2016, 32(7-8): 137-146.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2016.07.17     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2016/V32/I7-8/137

  Copyright © 2016 Data Analysis and Knowledge Discovery   Tel/Fax:(010)82626611-6626,82624938   E-mail:jishu@mail.las.ac.cn