Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (6): 69-79    DOI: 10.11925/infotech.2096-3467.2019.1104
Identifying Key Users and Topics from Online Learning Community
Cai Yongming 1, Liu Lu 1, Wang Kewei 2
1 Business School, University of Jinan, Jinan 250002, China
2 School of Economics and Management, Inner Mongolia University of Technology, Huhhot 010051, China
Abstract  

[Objective] This study automatically analyzes the resources of a virtual learning community to address information overload. [Methods] We propose a hyper-network LDA model based on the user-document-word cube and refine it with word and user analysis. To improve topic cohesiveness, the model increases the probability that closely connected words or users are assigned to the same topic. [Results] Compared with traditional social network analysis methods, the proposed model identifies important users, key topics, and the relationships among them. It also captures user preferences through the user-vocabulary frequency matrix and the user-topic probability distribution. [Limitations] Hyper-network analysis theory is still developing, and we studied only the weighted undirected network, which does not capture posting and replying relationships. [Conclusions] The hyper-network LDA model effectively analyzes the topics of short texts and online interactions, which is valuable to users and managers of online learning communities.
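As a rough illustration of the data structures the abstract mentions, the sketch below collapses a toy user-document-word cube into a user-vocabulary frequency matrix and a weighted undirected user network. It is a minimal sketch under stated assumptions, not the authors' implementation: the sample posts, the function names `user_term_counts` and `user_links`, and the choice of "number of shared documents" as the edge weight are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy data: each post is (user, document id, tokenized words).
posts = [
    ("alice", "d1", ["regression", "model", "spss"]),
    ("bob", "d1", ["regression", "help"]),
    ("alice", "d2", ["model", "matlab", "model"]),
]

def user_term_counts(posts):
    """Sum word counts per user, collapsing the document axis of the cube."""
    counts = defaultdict(Counter)
    for user, _doc, words in posts:
        counts[user].update(words)
    return {user: dict(c) for user, c in counts.items()}

def user_links(posts):
    """Build weighted undirected user-user edges.

    Assumption: two users are linked when they post in the same document,
    and the edge weight is the number of documents they share.
    """
    users_by_doc = defaultdict(set)
    for user, doc, _words in posts:
        users_by_doc[doc].add(user)
    weights = Counter()
    for users in users_by_doc.values():
        # Sort so each undirected edge has one canonical (a, b) key.
        for a, b in combinations(sorted(users), 2):
            weights[(a, b)] += 1
    return dict(weights)

matrix = user_term_counts(posts)
links = user_links(posts)
```

Here `matrix["alice"]["model"]` counts how often one user used one term across all of that user's posts, and `links` holds one entry per pair of users who share at least one document. Because the edges are undirected, they carry no information about who posted and who replied, which matches the limitation stated in the abstract.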

Key words: Virtual Learning Community; Hyper-Network LDA Model; Key Users; Core Topics; Joint Analysis
Received: 08 October 2019      Published: 07 July 2020
ZTFLH:  G434 TP391  
Corresponding Authors: Cai Yongming     E-mail: cymujn@163.com

Cite this article:

Cai Yongming, Liu Lu, Wang Kewei. Identifying Key Users and Topics from Online Learning Community. Data Analysis and Knowledge Discovery, 2020, 4(6): 69-79.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2019.1104     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2020/V4/I6/69

Structure Decomposition of User-Term Hyper-Network
Decomposition Diagram of Traditional LDA Model
Decomposition Diagram of Hyper-Network LDA Model
Experimental Data
| Rank | User | (unlabeled) | User | Betweenness centrality | User | $(CLDC_{x,1}+CLDC_{x,2})/2$ |
|---|---|---|---|---|---|---|
| 1 | zyk20062964 | 93 | zyk20062964 | 615 148.2 | zyk20062964 | 8.728E-04 |
| 2 | ydc129 | 62 | jgchen1966 | 503 900.2 | 飞天玄舞6 | 8.403E-04 |
| 3 | jgchen1966 | 61 | china_cao1 | 433 808.2 | ydc129 | 8.116E-04 |
| 4 | widen我的世界 | 60 | 水天一色DIY | 409 840.8 | 浪子彦青 | 8.040E-04 |
| 5 | Crsky7 | 60 | ydc129 | 346 632.8 | widen我的世界 | 7.270E-04 |
| 6 | 水天一色DIY | 57 | 420948492 | 344 053.3 | franky_sas | 6.709E-04 |
| 7 | 420948492 | 56 | Crsky7 | 332 498.0 | wangfeng666 | 6.309E-04 |
| 8 | 飞天玄舞6 | 52 | 410234198 | 314 516.4 | 我的素质低 | 5.884E-04 |
| 9 | nightmarehelen | 52 | 飞天玄舞6 | 261 111.4 | tigerwolf | 5.368E-04 |
| 10 | wjj0913 | 52 | 浪子彦青 | 243 607.7 | 曲歌99 | 5.199E-04 |
| 11 | 资料狂人 | 44 | kuangsir6 | 224 751.2 | yangbenfa | 4.864E-04 |
| 12 | edward132 | 40 | 大家开心 | 222 211.5 | 410234198 | 4.264E-04 |
| 13 | 浪子彦青 | 40 | 梦若舞之官世强 | 195 902.9 | nivastuli | 4.110E-04 |
| 14 | wwqqer | 37 | widen我的世界 | 189 873.0 | 数据分析闯天下 | 3.954E-04 |
| 15 | 数据分析闯天下 | 33 | 悬思苦索 | 184 142.5 | Nicolle | 3.499E-04 |
| 16 | china_cao1 | 33 | 数据分析闯天下 | 163 175.3 | wwqqer | 1.770E-04 |
| 17 | tigerwolf | 32 | davil2000 | 160 709.9 | jjxm20060807 | 8.288E-05 |
| 18 | 劲量小兔888 | 30 | franky_sas | 156 655.9 | woaiwojia9 | 5.793E-05 |
| 19 | franky_sas | 27 | 劲量小兔888 | 152 875.5 | wh7064rg | 4.854E-05 |
| 20 | liucg9999 | 26 | nightmarehelen | 143 267.9 | 420948492 | 3.192E-05 |
Top 20 Users of Hyper-Network Community
User Community Based on CLECC Algorithm
User-Term Sub Network of Top5 Users
Topic Distribution of Any 30 Bags of Words
Word Cloud of Forum
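N Topic1 Topic2 Topic3 Topic4 Topic5 Topic6 Topic7 Topic8 Topic9 Topic10 Topic11 Topic12
1 technology statistics matlab analyst analytics big-data data data-mining question download data-mining
2 case data-analysis era data business-intelligence mining data-mining language help-request video customer
3 statistics yearbook code prospects mining release learning concept study regression data-analysis data
4 research China mathematics understand series exclusive machine data software data teaching sequence
5 share 2011 algorithm domestic science media edition technology Chinese model series
6 system data course engineer analysis mining statistical urgent-request algorithm free time
7 materials economy outstanding-papers data classic information business paper r variable data-mining application
8 method 2010 share job big internet analysis classic may-I-ask software model
9 getting-started 2012 competition data-analysis matlab text methods forum novice based-on
10 free city modeling China textbook spreadsheet models China python neural-network database
11 software development materials rules modeler smartbi knowledge ppt statistics analysis analysis data-analysis
12 download region neural-network employment code bi intelligence report books component design
13 tutorial excel weka analysis example case applications sharepoint sas question paper
14 introduction industry analysis enterprise course solution big answers textbook seek-advice Chinese analyst
15 classic industrial-sector college-students improvement statistics analysis pattern nationwide getting-started test help-request based-on
16 paper science-and-technology tutorial future handbook operations modeling exercises business-intelligence one difference management
17 spss gdp tools question algorithms application web 韩家 machine modeling analysis
18 data-analysis summary compendium scientist computing discovery support-vector-machine sas e-book author
19 pattern main nationwide association business network techniques recommendation sample prediction enterprise
20 code nationwide model sector handbook e-commerce recognition sector network anyone computation mining
Top 20 Terms for 12 Topics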
Topic Participation Probability Distribution of Top5 Users