Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (1): 128-139    DOI: 10.11925/infotech.2096-3467.2020.0418
Forecasting Car Sales Based on Consumer Attention
Jiang Cuiqing1,2, Wang Xiangxiang1, Wang Zhao1
1School of Management, Hefei University of Technology, Hefei 230009, China
2Key Laboratory of Process Optimization and Intelligent Decision-Making of Ministry of Education, Hefei 230009, China
Abstract  

[Objective] This study constructs a car sales forecasting model based on consumer attention. [Methods] First, we defined consumer attention using consumer opinion and search data. Then, we used the Word2Vec algorithm to extract initial keyword lists and time difference correlation analysis to identify the core keywords. Finally, we generated the consumer attention data with PCA and built an Attention_LSTM model to predict car sales. [Results] The RMSE of our model fell by 2.02 and its MAPE by 0.96 percentage points. The MAPE of the new model was 6.52, 3.42, 2.56, and 0.81 percentage points lower than those of the ARIMA, SVR, BP neural network, and LSTM models, respectively. [Limitations] We did not include other social media data to analyze consumers’ online behaviors. [Conclusions] The Attention_LSTM model based on consumer attention can effectively forecast car sales.

Key words: Sales Forecasting; Consumer Attention; LSTM; Attention Mechanism
Received: 12 May 2020      Published: 05 February 2021
ZTFLH:  TP391  
Fund: This work is supported by the National Natural Science Foundation of China (Grant No. 71731005) and the Humanities and Social Sciences Planning Fund of the Ministry of Education (Grant No. 15YJA630010).
Corresponding Authors: Wang Xiangxiang     E-mail: 2285237002@qq.com

Cite this article:

Jiang Cuiqing, Wang Xiangxiang, Wang Zhao. Forecasting Car Sales Based on Consumer Attention. Data Analysis and Knowledge Discovery, 2021, 5(1): 128-139.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2020.0418     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2021/V5/I1/128

Automobile Sales Forecast Framework Based on Consumer Attention
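Level | Subtype | Initial keywords (partial)
Macro level | Economic environment | gasoline price (汽油价格), car loan interest rate (汽车贷款利率), purchase tax (购置税)
Macro level | Related policies | car purchase restrictions (汽车限购), car subsidies (汽车补贴), car purchase discounts (购车优惠)
Meso level | Automotive websites | Autohome (汽车之家), Renrenche (人人车), Yiche (易车网)
Meso level | Car owner forums | car forum (汽车论坛), car owners' club (车友会), car club (汽车俱乐部)
Micro level | Volkswagen | Volkswagen Lavida (大众朗逸), Volkswagen Tiguan (大众途观), Volkswagen Teramont (大众途昂)
Micro level | Honda | Honda CR-V (本田CR-V), Honda XR-V (本田XR-V), Honda Civic (本田思域)
Micro level | Toyota | Toyota RAV4 (丰田RAV4), Toyota Corolla (丰田卡罗拉), Toyota Levin (丰田雷凌)
Micro level | Buick | Buick LaCrosse (别克君越), Buick Excelle GT (别克英朗), Buick GL8 (别克GL8)
Micro level | Nissan | Nissan Sylphy (日产轩逸), Nissan X-Trail (日产奇骏), Nissan Teana (日产天籁)
Micro level | Geely | Geely Boyue (吉利博越), Geely Emgrand (吉利帝豪), Geely Xingyue (吉利星越)
Micro level | Wuling | Wuling Zhiguang (五菱之光), Wuling Hongguang (五菱宏光), Wuling Rongguang (五菱荣光)
Micro level | Hyundai | Hyundai ix35 (现代ix35), Hyundai Lingdong (现代领动), Hyundai Feisi (现代菲斯)
Micro level | Haval | Haval H6 (哈弗H6), Haval M6 (哈弗M6), Haval F7 (哈弗F7)
Micro level | Ford | Ford Edge (福特锐界), Ford Escape (福特锐际), Ford Territory (福特领界)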
Initial Keywords (Partial)
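Keyword | Correlation coefficient | Lead order | Keyword | Correlation coefficient | Lead order
Yiche (易车网) | 0.54 | 5 | Outlander (欧蓝德) | 0.56 | 1
Autohome (汽车之家) | 0.52 | 1 | BMW M4 (宝马M4) | 0.64 | 5
Maichewang (买车网) | 0.63 | 3 | K5 price quote (K5报价) | -0.54 | 1
used cars (二手车) | 0.57 | 3 | Atenza (阿特兹) | 0.60 | 6
Renrenche (人人车) | 0.56 | 1 | new energy vehicles (新能源车) | 0.65 | 3
Chehang168 (车行168) | 0.63 | 5 | Honda Greiz (本田哥瑞) | 0.62 | 1
Chezhu Zhijia (车主之家) | 0.53 | 5 | Honda Accord (本田雅阁) | 0.53 | 1
auto finance (汽车金融) | 0.64 | 5 | Sylphy pictures (轩逸图片) | 0.64 | 3
vehicle license (行驶证) | 0.61 | 5 | x-trail | 0.64 | 3
purchase tax (购置税) | 0.57 | 1 | Hyundai Verna (现代悦纳) | 0.58 | 1
vehicle administration office (车管所) | 0.60 | 5 | Haval H7 (哈弗H7) | 0.58 | 5
annual vehicle inspection (车辆年审) | 0.54 | 3 | domestic SUV (国产SUV) | 0.62 | 2
bank credit (银行信贷) | 0.54 | 6 | Wuling Hongguang (五菱宏光) | 0.56 | 3
car maintenance costs (养车费用) | -0.59 | 4 | Enclave price quote (昂科雷报价) | -0.57 | 1
license plate lottery query (摇号查询) | 0.59 | 3 | Dongfeng Honda (东风本田) | 0.56 | 3
gasoline price (汽油价格) | -0.59 | 4 | Buick Excelle GT (别克英朗) | 0.55 | 6
car installment plan (汽车分期) | 0.56 | 4 | Buick GL8 (别克GL8) | 0.65 | 1
license plate registration (汽车上牌) | 0.57 | 3 | Buick business vehicle (别克商务) | 0.54 | 4
Volkswagen Lavida (大众朗逸) | 0.59 | 1 | dual-engine hybrid (双擎) | 0.59 | 3
Volkswagen Santana (大众桑塔纳) | 0.56 | 3 | Corolla (卡罗拉) | 0.64 | 3
Volkswagen Sagitar (大众速腾) | 0.55 | 3 | Highlander (汉兰达) | 0.56 | 6
Jetta (捷达车) | -0.57 | 6 | Geely Boyue (吉利博越) | 0.60 | 5
Haona (浩纳) | 0.60 | 5 | Emgrand RS (帝豪RS) | 0.62 | 4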
Time Difference Correlation Analysis of Core Keywords (Partial)
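
As a rough illustration of how a lead order like those in the table above might be obtained, the sketch below shifts a keyword's search series by 1 to 6 periods and keeps the shift whose Pearson correlation with sales has the largest absolute value. The function names, the maximum lead of 6 (taken from the largest lead order in the table), and the synthetic series are assumptions for illustration only; they are not the authors' code or data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def best_lead(search, sales, max_lead=6):
    """Return (lead, r) with the largest |r| between search[t - lead] and sales[t].

    max_lead=6 is an assumption based on the largest lead order in the table.
    """
    best = (0, 0.0)
    for lead in range(1, max_lead + 1):
        # Pair each search value with the sales value `lead` periods later.
        r = pearson(search[:-lead], sales[lead:])
        if abs(r) > abs(best[1]):
            best = (lead, r)
    return best

# Synthetic illustration (not the paper's data): sales follow the search
# series three periods later.
search = [3, 8, 2, 9, 4, 7, 1, 6, 5, 10, 2, 8,
          3, 9, 4, 7, 6, 1, 5, 8, 2, 9, 3, 7]
sales = [0, 0, 0] + [10 * v + 50 for v in search[:-3]]

lead, r = best_lead(search, sales)
print(lead, round(r, 2))
```

Comparing absolute values lets negatively correlated keywords (such as gasoline price in the table) be selected alongside positively correlated ones.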
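Component | Initial eigenvalues: Total | Initial eigenvalues: % of variance | Initial eigenvalues: Cumulative % | Extraction sums of squared loadings: Total | Extraction sums of squared loadings: % of variance | Extraction sums of squared loadings: Cumulative %
1 | 8.069 | 67.240 | 67.240 | 8.069 | 67.240 | 67.240
2 | 1.961 | 16.338 | 83.578 | 1.961 | 16.338 | 83.578
3 | 0.748 | 6.236 | 89.814 | | |
4 | 0.335 | 2.795 | 92.609 | | |
5 | 0.273 | 2.278 | 94.887 | | |
6 | 0.158 | 1.320 | 96.207 | | |
7 | 0.137 | 1.141 | 97.348 | | |
8 | 0.103 | 0.860 | 98.208 | | |
9 | 0.073 | 0.606 | 98.814 | | |
10 | 0.067 | 0.562 | 99.376 | | |
11 | 0.045 | 0.379 | 99.755 | | |
12 | 0.029 | 0.245 | 100.000 | | |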
Total Variance Explained
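
The percentages in the table above can be approximately reproduced from the eigenvalues alone. The sketch below assumes the PCA was run on the correlation matrix of 12 standardized variables, so that the eigenvalues sum to 12, and it assumes the two extracted components were chosen by the eigenvalue-greater-than-1 (Kaiser) rule. Both assumptions are consistent with the table but are not stated in the source.

```python
# Initial eigenvalues copied from the table above.
eigenvalues = [8.069, 1.961, 0.748, 0.335, 0.273, 0.158,
               0.137, 0.103, 0.073, 0.067, 0.045, 0.029]

# Assumption: correlation-matrix PCA, so total variance = number of variables.
n_vars = len(eigenvalues)

# Share of total variance explained by each component, in percent.
variance_pct = [100 * ev / n_vars for ev in eigenvalues]

# Running total of the explained variance.
cumulative_pct = []
running = 0.0
for pct in variance_pct:
    running += pct
    cumulative_pct.append(running)

# Assumed retention rule (Kaiser criterion): keep eigenvalues greater than 1.
retained = [i + 1 for i, ev in enumerate(eigenvalues) if ev > 1.0]

print(retained)
print([round(p, 2) for p in cumulative_pct[:len(retained)]])
```

Small differences from the tabulated percentages are expected because the eigenvalues are rounded to three decimals.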
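Variable | Component 1 | Component 2
Macro search index | 0.850 | -0.453
Micro search index | 0.605 | 0.558
Toyota brand index | 0.770 | -0.557
Geely brand index | 0.570 | 0.705
Ford brand index | 0.873 | -0.127
Volkswagen brand index | 0.904 | -0.093
Honda brand index | 0.904 | 0.081
Nissan brand index | 0.895 | 0.113
Buick brand index | 0.928 | 0.148
Wuling brand index | 0.716 | 0.625
Hyundai brand index | 0.859 | -0.180
Haval brand index | 0.867 | -0.385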
Component Matrix
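
One way to sanity-check the component matrix above against the total-variance table is to sum the squared loadings in each column: for an unrotated PCA solution these sums should match the eigenvalues of the extracted components (8.069 and 1.961). The sketch below does that check. The variable names are English glosses of the table's row labels, and the per-variable communality is an extra diagnostic not reported in the source.

```python
# Loadings (component 1, component 2) copied from the component matrix above.
loadings = {
    "Macro search index": (0.850, -0.453),
    "Micro search index": (0.605, 0.558),
    "Toyota brand index": (0.770, -0.557),
    "Geely brand index": (0.570, 0.705),
    "Ford brand index": (0.873, -0.127),
    "Volkswagen brand index": (0.904, -0.093),
    "Honda brand index": (0.904, 0.081),
    "Nissan brand index": (0.895, 0.113),
    "Buick brand index": (0.928, 0.148),
    "Wuling brand index": (0.716, 0.625),
    "Hyundai brand index": (0.859, -0.180),
    "Haval brand index": (0.867, -0.385),
}

# Column-wise sums of squared loadings; these should approximate the
# eigenvalues of the two extracted components.
sum_sq = [sum(row[k] ** 2 for row in loadings.values()) for k in range(2)]

# Communality: share of each variable's variance captured by both components.
communality = {name: a * a + b * b for name, (a, b) in loadings.items()}

print([round(s, 3) for s in sum_sq])
for name, h2 in communality.items():
    print(f"{name}: {h2:.3f}")
```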
LSTM Structure[17]
Attention_LSTM Structure[21]
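Type | Variable | Description
Consumer online behavior features | Consumer attention | The web search data used in this paper come from the Baidu search engine
Consumer online behavior features | Number of word-of-mouth posts | Number of posts by users of the Autohome word-of-mouth forum
Consumer online behavior features | Number of interactions | Number of comment interactions on the Autohome word-of-mouth forum
Consumer online behavior features | Number of likes | Number of likes on comments on the Autohome word-of-mouth forum
Consumer online behavior features | Word-of-mouth sentiment | Star ratings given by users of the Autohome word-of-mouth forum
Macroeconomic features | GDP | Bureau of Statistics data
Macroeconomic features | CPI | Bureau of Statistics data
Macroeconomic features | PPI | Bureau of Statistics data
Macroeconomic features | Per capita disposable income | Bureau of Statistics data
Macroeconomic features | Total retail sales of consumer goods | Bureau of Statistics data
Macroeconomic features | Refined oil price | Based on the adjusted 92# gasoline price published by the National Development and Reform Commission
Macroeconomic features | USD exchange rate | Data from the exchange rate reports on the People's Bank of China website
Macroeconomic features | Loan interest rate | Based on the lending rate for the same period published by the People's Bank of China
Historical sales features | Historical sales | Data from the auto channel of the Sohu website
Historical sales features | Car ownership | Civilian car ownership published by the Traffic Management Bureau of the Ministry of Public Security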
Feature Description
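Variable | Correlation with car sales | Variable | Correlation with car sales
Consumer attention | 0.647* | Per capita disposable income | 0.602*
Number of word-of-mouth posts | 0.454* | Total retail sales of consumer goods | 0.638*
Number of interactions | 0.270* | Refined oil price | -0.403*
Number of likes | 0.272* | USD exchange rate | 0.445*
Word-of-mouth sentiment | -0.238* | Loan interest rate | -0.303*
GDP | -0.502* | Historical sales | 0.736*
CPI | -0.352* | Car ownership | 0.507*
PPI | 0.301* | |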
The Result of Characteristic Correlation Test
Fitting Graph of Consumer Attention and Automobile Sales
Month | Actual (10,000 vehicles) | SVR base model | SVR control model | BP neural network base model | BP neural network control model | LSTM base model | LSTM control model | Attention_LSTM base model | Attention_LSTM control model
2019/7 | 152.791 | 160.044 | 159.089 | 155.264 | 153.749 | 148.228 | 151.904 | 153.728 | 152.245
2019/8 | 165.291 | 159.911 | 158.819 | 158.736 | 149.466 | 158.287 | 161.120 | 162.517 | 165.101
2019/9 | 193.064 | 172.866 | 176.742 | 172.438 | 191.852 | 189.245 | 191.065 | 188.182 | 196.919
2019/10 | 192.767 | 183.629 | 185.629 | 201.324 | 198.956 | 194.987 | 196.434 | 194.472 | 194.844
2019/11 | 205.667 | 192.854 | 205.856 | 205.413 | 198.438 | 205.836 | 200.926 | 199.194 | 207.442
2019/12 | 221.309 | 226.285 | 237.391 | 214.623 | 210.569 | 212.156 | 213.554 | 212.562 | 215.600
RMSE | | 11.27 | 10.47 | 9.94 | 8.74 | 5.37 | 4.44 | 5.06 | 3.04
MAPE | | 5.28% | 4.59% | 3.98% | 3.73% | 2.43% | 1.98% | 2.13% | 1.17%
Prediction Error Between Base Model and Control Model
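Month | Actual (10,000 vehicles) | ARIMA predicted | ARIMA relative error/% | SVR predicted | SVR relative error/% | BP neural network predicted | BP neural network relative error/% | LSTM predicted | LSTM relative error/% | Attention_LSTM predicted | Attention_LSTM relative error/%
2019/7 | 152.791 | 163.649 | 7.11 | 159.089 | 4.12 | 153.749 | 0.63 | 151.904 | 0.58 | 152.245 | 0.36
2019/8 | 165.291 | 181.260 | 9.66 | 158.819 | 3.92 | 149.466 | 9.57 | 161.121 | 2.52 | 165.101 | 0.12
2019/9 | 193.064 | 209.685 | 8.61 | 176.742 | 8.45 | 191.852 | 0.63 | 191.065 | 1.04 | 196.919 | 2.00
2019/10 | 192.767 | 209.128 | 8.49 | 185.629 | 3.70 | 198.956 | 3.21 | 196.434 | 3.46 | 194.844 | 1.08
2019/11 | 205.667 | 224.286 | 9.05 | 205.856 | 0.09 | 198.438 | 3.51 | 200.925 | 2.31 | 207.442 | 0.86
2019/12 | 221.309 | 228.497 | 3.25 | 237.391 | 7.27 | 210.569 | 4.85 | 213.554 | 3.96 | 215.600 | 2.58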
Experimental Prediction Results of Five Models
Metric | ARIMA | SVR | BP neural network | LSTM | Attention_LSTM
RMSE | 14.81 | 10.47 | 8.74 | 4.44 | 3.04
MAPE | 7.69% | 4.59% | 3.73% | 1.98% | 1.17%
Error Comparison of Five Models
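
For readers who want to re-derive the error metrics, the sketch below recomputes RMSE and MAPE for the ARIMA and Attention_LSTM columns from the six monthly actual and predicted values reported in the prediction-results table (units of 10,000 vehicles). It uses the standard textbook definitions of the two metrics, which is an assumption since the source page does not print its formulas. The results should land close to the tabulated values, with small differences due to rounding.

```python
import math

# Actual monthly sales, 2019/7 to 2019/12, copied from the results table.
actual = [152.791, 165.291, 193.064, 192.767, 205.667, 221.309]

# Predicted values copied from the same table.
predictions = {
    "ARIMA": [163.649, 181.260, 209.685, 209.128, 224.286, 228.497],
    "Attention_LSTM": [152.245, 165.101, 196.919, 194.844, 207.442, 215.600],
}

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs(t - p) / t for t, p in zip(y_true, y_pred)) / len(y_true)

results = {name: (rmse(actual, pred), mape(actual, pred))
           for name, pred in predictions.items()}

for name, (r, m) in results.items():
    print(f"{name}: RMSE={r:.2f}, MAPE={m:.2f}%")
```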
Fitting and Prediction Results of Attention_LSTM
[1] Xu Tairan, Wu Qingsheng. Research on New Product Announcement and Consumer Attention[J]. Soft Science, 2020, 34(1): 58-64. (in Chinese)
[2] Crowson M G, Witsell D, Eskander A. Using Google Trends to Predict Pediatric Respiratory Syncytial Virus Encounters at a Major Health Care System[J]. Journal of Medical Systems, 2020, 44(3): 57. doi: 10.1007/s10916-020-1526-8. pmid: 31997013.
[3] Xiang Cheng, Lu Jing. Do Local Investors Have Information Advantages? An Empirical Study with Baidu Search[J]. Chinese Journal of Management Science, 2019, 27(4): 25-36. (in Chinese)
[4] Ren Wujun, Li Xin. Tourism Demand Analysis Based on Internet Big Data: The Case of Huairou, Beijing[J]. Systems Engineering-Theory & Practice, 2018, 38(2): 437-443. (in Chinese)
[5] Geva T, Oestreicher-Singer G, Efron N, et al. Using Forum and Search Data for Sales Prediction of High-Involvement Projects[J]. MIS Quarterly, 2017, 41(1): 65-82.
[6] Fantazzini D, Toktamysova Z. Forecasting German Car Sales Using Google Data and Multivariate Models[J]. International Journal of Production Economics, 2015, 170(A): 97-135.
[7] Wang Lian, Ning Yijian, Jia Jianmin. Predicting Sales and Market Share with Online Search: Evidence from Chinese Automobile Market[J]. Journal of Industrial Engineering and Engineering Management, 2015, 29(4): 56-64. (in Chinese)
[8] Feng Ming, Liu Chun. Pre-purchase Research Behaviors, Leading Climate Index and Demand Forecasting Based on Internet Search Data: The Case of Automobile Industry in China[J]. Journal of Marketing Science, 2013, 9(3): 31-44. (in Chinese)
[9] Yin Xiaoping, Wang Yanxiu. An Empirical Analysis on The Influencing Factors of Chinese Automobile Sales[J]. Statistics & Decision, 2011(8): 98-100. (in Chinese)
[10] Chen D P. Chinese Automobile Demand Prediction Based on ARIMA Model[C]// Proceedings of the 4th International Conference on Biomedical Engineering and Informatics. IEEE, 2011: 2197-2201.
[11] Zhang Junkai, Sun Zhifeng. Sales Forecast Based on Optimized Grey-Markov Chain[J]. Modern Manufacturing Engineering, 2019(4): 7-13. (in Chinese)
[12] Wu Qi, Yan Hongsen. Forecasting Model of Product Sales Based on the Chaotic v-Support Vector Machine[J]. Journal of Mechanical Engineering, 2010, 46(7): 128-135. (in Chinese)
[13] Wang Dong. Prediction of Car Ownership Based on Grey Relational Analysis and BP Neural Network[J]. Computing Technology and Automation, 2015, 34(1): 29-33. (in Chinese)
[14] Ma Liping, Zhang Jianhui. Analysis on China’s Automobile Demand Influencing Factors Based on the Error Correction Model[J]. Science and Technology Management Research, 2014, 34(7): 106-109, 122. (in Chinese)
[15] Yolcu U, Egrioglu E, Aladag C H. A New Linear & Nonlinear Artificial Neural Network Model for Time Series Forecasting[J]. Decision Support Systems, 2013, 54(3): 1340-1347.
[16] Guo Xianda, Zhao Narisa, Cui Shaoze. Consumer Reviews Sentiment Analysis Based on CNN-BiLSTM[J]. Systems Engineering-Theory & Practice, 2020, 40(3): 653-663. (in Chinese)
[17] Petersen N C, Rodrigues F, Pereira F C. Multi-Output Bus Travel Time Prediction with Convolutional LSTM Neural Network[J]. Expert Systems with Applications, 2019, 120: 426-435.
[18] Bin Y, Yang Y, Shen F M, et al. Describing Video with Attention-Based Bidirectional LSTM[J]. IEEE Transactions on Cybernetics, 2019, 49(7): 2631-2641. doi: 10.1109/TCYB.2018.2831447. pmid: 29993730.
[19] Xu Tongtong, Sun Huazhi, Ma Chunmei, et al. Classification Model for Few-shot Texts Based on Bi-directional Long-term Attention Features[J]. Data Analysis and Knowledge Discovery, 2020, 4(10): 113-123. (in Chinese)
[20] Tao Zhiyong, Li Xiaobing, Liu Ying, et al. Classifying Short Texts with Improved-Attention Based Bidirectional Long Memory Network[J]. Data Analysis and Knowledge Discovery, 2019, 3(12): 21-29. (in Chinese)
[21] Li Y R, Zhu Z F, Kong D Q, et al. EA-LSTM: Evolutionary Attention-Based LSTM for Time Series Prediction[OL]. arXiv Preprint, arXiv:1811.03760.
[22] China Internet Network Information Center. 2012 Chinese Internet Users Consumption Behavior Survey Report-Automobile[R/OL]. [2013-01-15]. http://www.cnnic.cn/hlwfzyj/hlwxzbg/dzswbg/201301/t20130116_38523.htm. (in Chinese)
[23] Sun Chunhua. An Empirical Study on the Effect of Automobile Market on Automobile Industry Distribution[J]. Technical Economics & Management, 2016(6): 119-123. (in Chinese)
[24] Kim S, Park H, Lee J. Word2Vec-Based Latent Semantic Analysis (W2V-LSA) for Topic Modeling: A Study on Blockchain Technology Trend Analysis[J]. Expert Systems with Applications, 2020, 152: 113401.
[25] Stein R A, Jaques P A, Valiati J F. An Analysis of Hierarchical Text Classification Using Word Embeddings[J]. Information Sciences, 2019, 471: 216-232. doi: 10.1016/j.ins.2018.09.001.
[26] Chen Xu. Indices Selection for the Early Warning System of Tourism Macro-Economic Performance: Basing on Time Difference Relation Method[J]. Ecological Economy, 2013(11): 87-89, 105. (in Chinese)
[27] Hulth A, Rydevik G, Linde A. Web Queries as a Source for Syndromic Surveillance[J]. PLoS One, 2009, 4(2): e4378.