Identifying Abnormal Riding Behaviour in Urban Rail Transit with Multi-Source Data<sup>*</sup>

Data Analysis and Knowledge Discovery, 2023, Vol. 7, Issue (7): 46-57

doi: 10.11925/infotech.2096-3467.2022.0648 (https://doi.org/10.11925/infotech.2096-3467.2022.0648)

Research Paper

Xue Gang¹,², Liu Shifeng¹,², Gong Daqing¹,², Zhang Pei¹,², Liu Zhongliang³

¹School of Economics and Management, Beijing Jiaotong University, Beijing 100044, China
²Beijing Logistics Informatization Research Base, Beijing 100044, China
³Beijing Jingtou Yiyajie Transportation Technology Co., Ltd., Beijing 100101, China

Abstract:

[Objective] This study builds a dataset and an algorithm to identify abnormal riding behaviour in urban rail transit, such as theft, begging, busking, and unauthorized distribution of advertisements. [Methods] We construct a spatiotemporal matrix that condenses each passenger's spatiotemporal trajectory into a spatiotemporal feature map, retaining all travel records without increasing complexity. Taking the feature map as input, we build an algorithm framework based on the attention mechanism and graph convolutional neural networks. The framework extracts passengers' key trajectory pattern features and uses them to separate abnormal riding behaviour from the regular passenger flow. [Results] Experiments show that the proposed method is effective, reaching a precision of 93.10%, a recall of 95.30%, and an F1 of 94.19%. Every evaluation metric improves by more than 3 percentage points over the baseline model. [Limitations] Expanding the sample size of the dataset and avoiding the offence that false positives cause to regular passengers remain open problems. The model cannot identify abnormal passengers who frequently change their smart cards. [Conclusions] This study delivers a method for constructing abnormal riding behaviour datasets with a larger sample size and a lower workload, together with a deep-learning spatiotemporal feature extraction method that accurately identifies abnormal riding behaviour. The model can serve rail transit systems as a tool for identifying such behaviour.

Key words: Urban Rail Transit; Abnormal Riding Behaviour; Smart Card; Spatiotemporal Feature Extraction
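The abstract describes condensing a passenger's trajectory into a fixed-size spatiotemporal matrix while keeping every travel record. The exact construction is not given on this page, so the following is only a minimal sketch under stated assumptions: each smart card record is reduced to a hypothetical `(hour_of_day, station_index)` pair, and the matrix simply counts tap events per hour and station. The function name, the 24 hourly bins, and the sample records are all illustrative and are not taken from the paper.

```python
import numpy as np


def build_spatiotemporal_matrix(records, n_stations, n_time_bins=24):
    """Accumulate one card's tap events into a (time bin x station) count matrix.

    `records` is an iterable of (hour_of_day, station_index) pairs. This record
    layout is an assumption made for illustration, not the paper's schema.
    """
    matrix = np.zeros((n_time_bins, n_stations), dtype=np.int64)
    for hour, station in records:
        if not (0 <= hour < n_time_bins and 0 <= station < n_stations):
            raise ValueError(f"record out of range: {(hour, station)}")
        # Every record adds one count, so no trip is dropped, and the matrix
        # size stays fixed however many trips the card makes.
        matrix[hour, station] += 1
    return matrix


# Hypothetical card: repeated taps at station 2 in the morning, one evening trip.
records = [(8, 2), (8, 2), (9, 2), (9, 2), (18, 0), (18, 4)]
feature_map = build_spatiotemporal_matrix(records, n_stations=5)
print(feature_map.shape)       # (24, 5)
print(int(feature_map.sum()))  # 6
```

A fixed-size count matrix of this kind gives every passenger an input of the same shape, which is what lets a convolutional or graph-based model consume trajectories of very different lengths.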

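Received: 2022-06-23    Published: 2023-09-07

CLC number (ZTFLH): TP393; G350

Funding: *Supported by the Beijing Natural Science Foundation (9222025), the National Natural Science Foundation of China (62276020), and the National Social Science Fund of China (21FGLB059).

Corresponding author: Gong Daqing, ORCID: 0000-0002-2405-851X, E-mail: dqgong@bjtu.edu.cn.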

Cite this article:

Xue Gang, Liu Shifeng, Gong Daqing, Zhang Pei, Liu Zhongliang. Identifying Abnormal Riding Behaviour in Urban Rail Transit with Multi-Source Data[J]. Data Analysis and Knowledge Discovery, 2023, 7(7): 46-57.

Link to this article:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2022.0648 or https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2023/V7/I7/46

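Fig. 1 Example of trips and smart card records

Fig. 2 Manual annotation framework

Fig. 3 Comparison between the dataset construction method of this study and existing methods

Fig. 4 Overall framework of the identification algorithm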
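The framework figure itself is not reproduced on this page. As a hedged illustration of one building block the abstract names, graph convolution, the sketch below implements the widely used propagation rule H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W) in NumPy. The four-station line graph, the feature sizes, and the random weights are hypothetical. This is not the authors' architecture, and it omits the attention module entirely.

```python
import numpy as np


def gcn_layer(adjacency, features, weights):
    """One graph convolution step with symmetric normalisation and ReLU."""
    # Add self-loops so each node keeps its own features.
    a_hat = adjacency + np.eye(adjacency.shape[0])
    # Degrees are at least 1 because of the self-loops (non-negative adjacency).
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    # Element-wise form of D^(-1/2) (A + I) D^(-1/2).
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weights, 0.0)


# Hypothetical line of four stations: 0 - 1 - 2 - 3.
adjacency = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
rng = np.random.default_rng(0)
features = rng.random((4, 24))          # e.g. one 24-bin time profile per station
weights = rng.standard_normal((24, 8))  # untrained weights, for shape only
hidden = gcn_layer(adjacency, features, weights)
print(hidden.shape)  # (4, 8)
```

Each output row mixes a station's own features with those of its direct neighbours, which is how a graph convolution injects network topology into trajectory features.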

Table 1 Settings of the comparison methods

Table 2 Performance comparison of methods on the balanced dataset

Method	Precision	Recall	F1
LR	54.64%	55.68%	55.15%
DT	55.60%	56.72%	56.16%
SVM	65.30%	60.46%	62.78%
NB	56.92%	47.14%	51.57%
MLP	72.46%	55.99%	63.17%
DenseNet-121 (k=32)	85.79%	93.42%	89.45%
DenseNet-169 (k=32)	83.80%	91.29%	87.39%
DenseNet-201 (k=32)	85.46%	92.17%	88.69%
ResNet-34	80.70%	86.80%	83.64%
ResNet-50	85.30%	90.33%	87.74%
ResNet-101	90.68%	86.72%	88.65%
Proposed method	93.10%	95.30%	94.19%
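The F1 column in the balanced-dataset results can be sanity-checked from the other two columns, since F1 is the harmonic mean of precision and recall. The short sketch below recomputes it for three rows. The row selection and the helper name `f1_score` are illustrative. Because the published precision and recall are themselves rounded, the recomputed value can differ from the reported one in the last digit.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (inputs in the same unit, e.g. percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) copied from the balanced-dataset table.
rows = {
    "LR": (54.64, 55.68, 55.15),
    "ResNet-101": (90.68, 86.72, 88.65),
    "Proposed method": (93.10, 95.30, 94.19),
}

for name, (precision, recall, reported) in rows.items():
    computed = f1_score(precision, recall)
    print(f"{name}: computed F1 = {computed:.2f}%, reported F1 = {reported:.2f}%")
```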

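Table 3 Prediction performance of models with different structures

Method	Precision	Recall	F1
Proposed method	93.10%	95.30%	94.19%
Without the spatial attention module	87.21%	85.97%	86.58%
Temporal feature matrix only	55.71%	63.73%	59.45%
Spatial feature matrix only	47.66%	55.96%	51.48%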

Table 4 Performance comparison of methods on the unbalanced test set

Method	Precision	Recall	F1
LR	3.85%	55.68%	7.20%
DT	3.66%	56.72%	6.88%
SVM	4.94%	60.46%	9.14%
NB	4.71%	47.14%	8.57%
MLP	7.03%	55.99%	12.49%
DenseNet-121 (k=32)	9.97%	93.42%	18.02%
DenseNet-169 (k=32)	9.59%	91.29%	17.36%
DenseNet-201 (k=32)	9.31%	92.17%	16.91%
ResNet-34	9.68%	86.80%	17.42%
ResNet-50	10.83%	90.33%	19.33%
ResNet-101	9.60%	86.72%	17.28%
Proposed method	11.59%	95.30%	20.67%
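Recall is identical on the balanced and unbalanced results while precision falls sharply, which is the expected effect of adding many more regular passengers to the test set. The sketch below illustrates the mechanism only. It assumes the balanced set has one regular passenger per abnormal passenger and that the false positive rate stays fixed. The ratios 10 and 100 are hypothetical, since this page does not state the actual class ratio of the unbalanced test set, and the output is not meant to reproduce the reported figures.

```python
def false_positive_rate_from_balanced(precision, recall):
    """Infer the false positive rate, assuming a 1:1 balanced evaluation set."""
    # Per abnormal passenger: TP = recall and FP = TP * (1 - precision) / precision.
    # With one regular passenger per abnormal one, FP equals the false positive rate.
    return recall * (1 - precision) / precision


def precision_at_ratio(recall, false_positive_rate, negatives_per_positive):
    """Precision when each abnormal passenger is accompanied by N regular ones."""
    true_positives = recall
    false_positives = false_positive_rate * negatives_per_positive
    return true_positives / (true_positives + false_positives)


# Balanced-set figures of the proposed method, as fractions.
precision_balanced, recall = 0.9310, 0.9530
fpr = false_positive_rate_from_balanced(precision_balanced, recall)

for ratio in (1, 10, 100):  # hypothetical numbers of regular passengers per abnormal one
    p = precision_at_ratio(recall, fpr, ratio)
    print(f"{ratio} regular per abnormal: precision = {p:.2%}, recall = {recall:.2%}")
```

The takeaway is that a low precision on a heavily unbalanced test set does not by itself indicate a worse classifier. The same recall and false positive rate yield far more false alarms once regular passengers vastly outnumber abnormal ones.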

Fig. 5 Box plot of precision on the unbalanced test set
