Data Analysis and Knowledge Discovery  2017, Vol. 1 Issue (4): 67-75    DOI: 10.11925/infotech.2096-3467.2017.04.08
Original Article
Predicting Dropout Rates of MOOCs with Sliding Window Model
Lu Xiaohang1, Wang Shengqing2, Huang Junjie1, Chen Wenguang1, Yan Zengwang1
1Department of Information Management, Peking University, Beijing 100871, China
2Center of Faculty Development, Peking University, Beijing 100871, China
Abstract  

[Objective] This paper analyzes dropout behavior in the MOOCs offered by Peking University on Coursera, with the aim of improving MOOC course quality and pedagogy. [Methods] We extracted 19 major features from the learner logs and then constructed a sliding window model to predict dropout rates. [Results] The precision of the proposed model stayed above 90%. Using the SVM and LSTM methods further improved the model's performance. [Limitations] The new method still needs to be tested on smaller courses. [Conclusions] Predicting dropout rates can help improve course quality.

Keywords: MOOC; Dropout Point; Dropout Rates; Sliding Window Model; Dropout Prediction
Received: 27 February 2017      Published: 24 May 2017
ZTFLH:  G434  

Cite this article:

Lu Xiaohang, Wang Shengqing, Huang Junjie, Chen Wenguang, Yan Zengwang. Predicting Dropout Rates of MOOCs with Sliding Window Model. Data Analysis and Knowledge Discovery, 2017, 1(4): 67-75.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2017.04.08     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2017/V1/I4/67

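Course ID | Course name | Enrolled learners | Learners with a recorded grade | Final grade > 0 | Final grade > 60 | Pass rate (%)
methodologysocial2-001 | Social Survey and Research Methods (Part 2) | 3,566 | 3,184 | 371 | 185 | 5.1879
methodologysocial-001 | Social Survey and Research Methods (Part 1) | 7,836 | 6,051 | 6,051 | 255 | 3.2542
pkubioinfo-002 | Bioinformatics 2014 (002) | 16,714 | 15,790 | 1,268 | 510 | 3.0513
pkubioinfo-001 | Bioinformatics 2013 (001) | 18,367 | 18,367 | 1,620 | 520 | 2.8312
pkubioinfo-003 | Bioinformatics: Introduction and Methods | 16,958 | 16,072 | 909 | 360 | 2.1229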
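The last column of the table above is consistent with dividing the number of learners whose final grade exceeds 60 by the number of enrolled learners. The short check below recomputes it from the published counts. The function name `pass_rate` is illustrative and does not come from the paper.

```python
# (course_id, enrolled learners, learners with final grade > 60),
# copied from the table above.
courses = [
    ("methodologysocial2-001", 3566, 185),
    ("methodologysocial-001", 7836, 255),
    ("pkubioinfo-002", 16714, 510),
    ("pkubioinfo-001", 18367, 520),
    ("pkubioinfo-003", 16958, 360),
]


def pass_rate(enrolled, passed):
    """Percentage of enrolled learners with a final grade above 60."""
    return round(100.0 * passed / enrolled, 4)


rates = {course_id: pass_rate(enrolled, passed)
         for course_id, enrolled, passed in courses}

for course_id, rate in rates.items():
    print(f"{course_id}: {rate}%")
```

Every course passes only about 2% to 5% of its enrolled learners, which is the attrition the paper sets out to predict.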
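Course ID | Learners with forum activity | Grade > 60 | Forum activity and grade > 60 | Share with forum activity among graded learners (%) | Share with forum activity among learners with grade > 60 (%)
pkubioinfo-001 | 2,645 | 580 | 511 | 68.3333 | 88.1034
pkubioinfo-002 | 1,425 | 508 | 395 | 54.5741 | 77.7559
pkubioinfo-003 | 1,523 | 358 | 316 | 66.9967 | 88.2682
methodologysocial-001 | 1,165 | 290 | 269 | 17.8318 | 92.7586
methodologysocial2-001 | 326 | 203 | 153 | 64.4205 | 75.3695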
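The final percentage column of the forum table above can be reproduced by dividing the count of learners who both used the forum and scored above 60 by the count of learners who scored above 60. The sketch below shows that arithmetic. The variable names are illustrative. The other percentage column appears to need a count that is not listed in the table, so it is not recomputed here.

```python
# (course_id, learners with grade > 60, learners with forum activity AND grade > 60),
# copied from the table above.
forum_rows = [
    ("pkubioinfo-001", 580, 511),
    ("pkubioinfo-002", 508, 395),
    ("pkubioinfo-003", 358, 316),
    ("methodologysocial-001", 290, 269),
    ("methodologysocial2-001", 203, 153),
]

# Share of passing learners who showed any forum activity, in percent.
forum_share = {
    course_id: round(100.0 * both / passed, 4)
    for course_id, passed, both in forum_rows
}

for course_id, share in forum_share.items():
    print(f"{course_id}: {share}% of passing learners used the forum")
```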
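Feature group | Field | Data type | Description
Clickstream | page_view | Int | Page views
Clickstream | page_view_quiz | Int | Views of quiz pages
Clickstream | page_view_forum | Int | Views of forum pages
Clickstream | page_view_lecture | Int | Views of video (lecture) pages
Clickstream | page_view_wiki | Int | Views of the course wiki
Clickstream | viedo_view_times | Int | Number of video views
Clickstream | video_pause_times | Int | Number of video pauses
Clickstream | video_pause_speed | Float | Playback rate
Assignments and quizzes | try_hw | Int | Number of homework attempts
Assignments and quizzes | try_quiz | Int | Number of quiz attempts
Assignments and quizzes | try_lec | Int | Number of lecture attempts
Forum activity | view_forum | Int | Views of the forum
Forum activity | thread_forum | Int | Views of threads
Forum activity | post_thread | Int | Threads created
Forum activity | post_comments | Int | Comments posted
Forum activity | Upvote | Int | Upvotes
Forum activity | Downvote | Int | Downvotes
Forum activity | add_tag | Int | Tags added
Forum activity | del_tag | Int | Tags deleted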
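The abstract says that a sliding window model was built on these 19 features but gives no further detail on this page, so the following is only a minimal sketch of one way such windows could be formed. It rests on assumptions that are not taken from the paper: features are aggregated per learner per week, a window spans three consecutive weeks, and a learner is labelled as dropped out when the week right after the window shows no activity at all. The function name `sliding_windows` is hypothetical.

```python
N_FEATURES = 19  # number of log features listed in the table above


def sliding_windows(weekly, window=3):
    """Turn one learner's weekly feature vectors into training samples.

    weekly: list of per-week feature lists, each of length N_FEATURES.
    Returns (X, y): X[i] is the flattened features of `window` consecutive
    weeks; y[i] is 1 if the following week has no activity (assumed
    dropout definition), else 0.
    """
    X, y = [], []
    for start in range(len(weekly) - window):
        past = weekly[start:start + window]
        next_week = weekly[start + window]
        X.append([value for week in past for value in week])
        y.append(int(sum(next_week) == 0))
    return X, y


# Toy learner: active for four weeks, then silent for two weeks.
active_week = [1] * N_FEATURES
silent_week = [0] * N_FEATURES
weekly = [active_week] * 4 + [silent_week] * 2

X, y = sliding_windows(weekly, window=3)
print(len(X), len(X[0]), y)
```

Each row of `X` could then be passed to a classifier such as the SVM mentioned in the abstract, while a sequence model such as an LSTM would more naturally consume the unflattened per-week vectors.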