Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (11): 84-91    DOI: 10.11925/infotech.2096-3467.2020.0536
Predicting Time Series of Theft Crimes Based on LSTM Network
Yan Jinghua1,2,3, Hou Miaomiao3
1National Science Library, Chinese Academy of Sciences, Beijing 100190, China
2Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
3School of Information and Network Security, People’s Public Security University of China, Beijing 100038, China
[Objective] This paper predicts the daily number of theft crimes. [Methods] We used an LSTM network to analyze theft data from a large city in north China. First, we retrieved data covering January 1, 2005 to February 24, 2007 and January 1, 2009 to January 7, 2011. Then, we set up three cases to examine time series prediction of the daily number of thefts. Finally, we compared our results with those of ARIMA, Support Vector Regression, Random Forest and XGBoost on the same data set. [Results] The percentage root mean square error (PRMSE) of our model was 18.4%, 11.7% and 41.9% in the three cases, respectively, lower in every case than that of the ARIMA, Support Vector Regression, Random Forest and XGBoost models. [Limitations] More research is needed on prediction for periods when the number of theft crimes fluctuates dramatically. [Conclusions] The proposed model could improve decision making for community safety, police patrol and other specific missions.
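The abstract reports PRMSE without stating its formula. The sketch below assumes one common definition, the root mean square error divided by the mean of the observed values and expressed as a percentage; the function name `prmse` and this normalization are illustrative assumptions, not taken from the paper.

```python
import math


def prmse(actual, predicted):
    """Percentage RMSE: RMSE relative to the mean observed value, in percent.

    Assumed definition (not confirmed by the source):
        PRMSE = 100 * sqrt(mean((actual - predicted)^2)) / mean(actual)
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n
    mean_actual = sum(actual) / n
    if mean_actual == 0:
        raise ValueError("mean of observed values is zero")
    return 100.0 * math.sqrt(mse) / mean_actual
```

Under this definition a lower value means the forecast errors are small relative to the typical daily case count, which makes scores comparable across test periods with different crime levels.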

Key words: Crime Prediction; Time Series; LSTM Network; Theft
Received: 08 June 2020      Published: 04 December 2020
Chinese Library Classification number (ZTFLH): D917
Corresponding Author: Yan Jinghua

Cite this article:

Yan Jinghua, Hou Miaomiao. Predicting Time Series of Theft Crimes Based on LSTM Network. Data Analysis and Knowledge Discovery, 2020, 4(11): 84-91.

The Process of LSTM Model Construction and Prediction Performance Evaluation[13,14,15]
Time Distribution of Theft Cases in City A
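Feature name    Feature value
month           Month of the current time step
weekend         0 = working day, 1 = non-working day
holiday         0 = non-holiday, 1 = holiday
weekday_avg     Mean number of theft cases per working day
weekend_avg     Mean number of theft cases per non-working day
month_avg       Mean number of theft cases per month
count_lag1      Number of theft cases at the previous time step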
Data Features
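The features listed above can be derived from a series of daily case counts. The following is a minimal sketch under several assumptions that the source does not confirm: non-working days are taken to be Saturdays and Sundays, `month_avg` is read as the mean daily count within the same calendar month, the averages are computed over whatever data is passed in (in practice the training set only), and the `holidays` argument is a hypothetical user-supplied set of dates.

```python
from collections import defaultdict
from datetime import date


def _mean(values):
    """Arithmetic mean, or 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def build_features(daily_counts, holidays=frozenset()):
    """Build one feature row per day from (date, count) pairs sorted by date.

    The first day is skipped because it has no previous count for count_lag1.
    """
    weekday_counts, weekend_counts = [], []
    counts_by_month = defaultdict(list)
    for day, count in daily_counts:
        # Assumption: Saturday (5) and Sunday (6) are the non-working days.
        target_list = weekend_counts if day.weekday() >= 5 else weekday_counts
        target_list.append(count)
        counts_by_month[day.month].append(count)

    weekday_avg = _mean(weekday_counts)
    weekend_avg = _mean(weekend_counts)

    rows = []
    for i in range(1, len(daily_counts)):
        day, count = daily_counts[i]
        rows.append({
            "month": day.month,
            "weekend": int(day.weekday() >= 5),
            "holiday": int(day in holidays),
            "weekday_avg": weekday_avg,
            "weekend_avg": weekend_avg,
            "month_avg": _mean(counts_by_month[day.month]),
            "count_lag1": daily_counts[i - 1][1],
            "target": count,  # value to be predicted for this day
        })
    return rows
```

Each row pairs the calendar and lag features with the current day's count as the prediction target, which is the usual supervised framing for one-step-ahead forecasting.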
Training set                  ADF t-statistic   p-value   1% critical value
{X1t}                         -3.91             0.00      -3.44
{X2t}, before differencing    -2.82             0.06      -3.44
{X2t}, after differencing     -14.80            0.00      -3.44
{X3t}                         -3.77             0.00      -3.44
ADF Unit Root Test Results of Theft Data Time Series
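The table shows why {X2t} was differenced: its statistic of -2.82 lies above the 1% critical value of -3.44, so a unit root cannot be rejected until first differences are taken. The sketch below illustrates that logic on synthetic data with a simplified Dickey-Fuller regression that includes a constant but no lagged difference terms. It is not the augmented test the paper reports, which would normally come from a statistics library such as the `adfuller` function in statsmodels; the function names and the synthetic series are illustrative assumptions.

```python
import math
import random


def difference(series):
    """First difference: y_t - y_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]


def dickey_fuller_t(series):
    """t-statistic of gamma in  dy_t = alpha + gamma * y_{t-1} + e_t.

    Simplified Dickey-Fuller regression (no lagged differences), fitted by
    ordinary least squares. Strongly negative values indicate stationarity.
    """
    x = series[:-1]          # lagged levels y_{t-1}
    d = difference(series)   # first differences dy_t
    n = len(d)
    if n < 3:
        raise ValueError("series is too short")
    x_mean = sum(x) / n
    d_mean = sum(d) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (di - d_mean) for xi, di in zip(x, d))
    gamma = sxy / sxx
    alpha = d_mean - gamma * x_mean
    rss = sum((di - alpha - gamma * xi) ** 2 for xi, di in zip(x, d))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return gamma / std_err


# Synthetic example: a random walk is non-stationary, its differences are not.
rng = random.Random(0)
noise = [rng.gauss(0, 1) for _ in range(300)]
walk, level = [], 0.0
for step in noise:
    level += step
    walk.append(level)

CRITICAL_1PCT = -3.44  # 1% critical value quoted in the table above
print("walk stationary at 1%:", dickey_fuller_t(walk) < CRITICAL_1PCT)
print("differenced walk stationary at 1%:",
      dickey_fuller_t(difference(walk)) < CRITICAL_1PCT)
```

The decision rule is the same as in the table: reject the unit-root hypothesis only when the statistic falls below the critical value.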
The Fitting Results of LSTM Model
Test set   Prediction model            PRMSE
Case 1     LSTM                        18.4%
           ARIMA(4, 1, 1)              32.4%
           Support Vector Regression   24.4%
           Random Forest               24.1%
           XGBoost                     25.3%
Case 2     LSTM                        11.7%
           ARIMA(1, 1, 2)              19.6%
           Support Vector Regression   15.1%
           Random Forest               20.0%
           XGBoost                     20.3%
Case 3     LSTM                        41.9%
           ARIMA(4, 1, 1)              65.5%
           Support Vector Regression   76.8%
           Random Forest               82.6%
           XGBoost                     84.9%
Comparison of Model Prediction Effect
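To read the comparison at a glance, the short sketch below takes the PRMSE values from the table and reports, for each case, the strongest baseline and its gap to the LSTM in percentage points. The dictionary layout and the helper name `best_baseline` are illustrative choices, not part of the paper.

```python
# PRMSE values (percent) transcribed from the comparison table.
PRMSE = {
    "Case 1": {"LSTM": 18.4, "ARIMA(4, 1, 1)": 32.4,
               "Support Vector Regression": 24.4,
               "Random Forest": 24.1, "XGBoost": 25.3},
    "Case 2": {"LSTM": 11.7, "ARIMA(1, 1, 2)": 19.6,
               "Support Vector Regression": 15.1,
               "Random Forest": 20.0, "XGBoost": 20.3},
    "Case 3": {"LSTM": 41.9, "ARIMA(4, 1, 1)": 65.5,
               "Support Vector Regression": 76.8,
               "Random Forest": 82.6, "XGBoost": 84.9},
}


def best_baseline(scores, proposed="LSTM"):
    """Return (name, gap) for the lowest-error model other than `proposed`.

    gap is the baseline PRMSE minus the proposed model's PRMSE, in
    percentage points; a positive gap means the proposed model is better.
    """
    baselines = {name: s for name, s in scores.items() if name != proposed}
    name = min(baselines, key=baselines.get)
    return name, baselines[name] - scores[proposed]


for case, scores in PRMSE.items():
    name, gap = best_baseline(scores)
    print(f"{case}: best baseline {name}, LSTM lower by {gap:.1f} points")
```

The strongest competitor differs from case to case, while the LSTM has the lowest error in all three.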
