Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (2/3): 68-77    DOI: 10.11925/infotech.2096-3467.2019.0728
Online Product Recommendation Based on Multi-Head Self-Attention Neural Networks
Ni Weijian, Guo Haoyu, Liu Tong, Zeng Qingtian
College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266510, China
Abstract  

[Objective] This paper aims to predict online customers' future purchases from their previous shopping behaviors. [Methods] We propose a product recommendation approach based on multi-head self-attention neural networks. The method captures the relationships among, and the attributes of, the items purchased by each customer, and then generates the recommendation list using a recurrent neural network with attention. [Results] We evaluated the proposed approach on three real-world datasets, where it achieved higher F1 values than existing methods (2% higher). [Limitations] The diversity of the recommended lists needs further analysis. [Conclusions] The multi-head self-attention mechanism is an effective way to model shopping behaviors and produce better recommendations for consumers.

Key words: Next Basket Recommendation; Deep Neural Network; Multi-Head Self-Attention; Item Attributes
Received: 20 June 2019      Published: 26 April 2020
ZTFLH:  TP391  
Corresponding Authors: Tong Liu     E-mail: liu_tongtong@foxmail.com

Cite this article:

Ni Weijian, Guo Haoyu, Liu Tong, Zeng Qingtian. Online Product Recommendation Based on Multi-Head Self-Attention Neural Networks. Data Analysis and Knowledge Discovery, 2020, 4(2/3): 68-77.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2019.0728     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2020/V4/I2/3/68

Structure of Multi-Head Self-Attention
Structure of Basket Recommendation Model Based on Multi-Head Self-Attention
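As a rough illustration of the multi-head self-attention structure named in the captions above, the following is a minimal sketch of standard scaled dot-product multi-head self-attention applied to the item embeddings of one basket. It is not the authors' implementation: the embedding size, the number of heads, the random weight matrices, and the helper names (`matmul`, `softmax`, `rand_matrix`, `multi_head_self_attention`) are all assumptions made for the example, and only the Python standard library is used.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def multi_head_self_attention(x, wq, wk, wv, wo, num_heads):
    """x: one embedding row per basket item. Returns (output, per-head weights)."""
    d_model = len(x[0])
    d_head = d_model // num_heads
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    head_outputs, head_weights = [], []
    for h in range(num_heads):
        cols = slice(h * d_head, (h + 1) * d_head)  # this head's slice of the projections
        qh = [row[cols] for row in q]
        kh = [row[cols] for row in k]
        vh = [row[cols] for row in v]
        kh_t = [list(c) for c in zip(*kh)]  # transpose of the keys
        scores = matmul(qh, kh_t)  # item-to-item similarity scores
        weights = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
        head_weights.append(weights)
        head_outputs.append(matmul(weights, vh))
    # Concatenate the heads item by item, then apply the output projection.
    concat = [sum((head[i] for head in head_outputs), []) for i in range(len(x))]
    return matmul(concat, wo), head_weights


def rand_matrix(rows, cols):
    """Hypothetical initializer: small Gaussian weights."""
    return [[random.gauss(0, 0.1) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
n_items, d_model, num_heads = 4, 8, 2  # illustrative sizes only
basket = rand_matrix(n_items, d_model)  # stand-in for learned item embeddings
wq, wk, wv, wo = (rand_matrix(d_model, d_model) for _ in range(4))
output, attn = multi_head_self_attention(basket, wq, wk, wv, wo, num_heads)
print(len(output), len(output[0]), len(attn))
```

Each head yields one row of weights per item, and each row sums to 1. Matrices of this kind are what attention-weight visualizations typically display.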
Dataset Users Items Baskets Item-Attributes
Ta-Feng 2,348 14,716 39,101 2,013
JingDong 16,891 17,718 250,996 10
TaoBao 1,235 11,623 16,937 1,619
Basic Statistics of Datasets
Model Precision Recall F1 Hit-Rate NDCG@5
TOP 0.0497 0.0698 0.0580 0.2172 0.0833
Item-CF 0.0272 0.0400 0.0324 0.1244 0.0431
NFM 0.0518 0.0728 0.0605 0.2432 0.0831
DREAM 0.0533 0.0781 0.0633 0.2322 0.0851
NAM 0.0599 0.0941 0.0732 0.2627 0.0925
ANAM 0.0600 0.0942 0.0733 0.2632 0.0925
Proposed method 0.0624 0.0968 0.0759 0.2732 0.0940
Results on Ta-Feng Dataset
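For readers unfamiliar with the column headers, the sketch below computes the five reported metrics for a single user under common textbook definitions: top-k precision and recall, their harmonic mean (F1), a binary hit indicator, and NDCG@k with binary relevance. The page does not state the exact definitions or averaging scheme used in the paper, so these formulas, the function name `basket_metrics`, and the toy item names are assumptions for illustration.

```python
import math


def basket_metrics(recommended, actual, k=5):
    """Per-user metrics for a ranked recommendation list against the true next basket."""
    top_k = recommended[:k]
    actual = set(actual)
    hits = [1 if item in actual else 0 for item in top_k]
    n_hits = sum(hits)
    precision = n_hits / len(top_k) if top_k else 0.0
    recall = n_hits / len(actual) if actual else 0.0
    f1 = 2 * precision * recall / (precision + recall) if n_hits else 0.0
    hit_rate = 1.0 if n_hits else 0.0  # 1 if at least one recommended item was bought
    # DCG with binary relevance; rank 0 is the top position, hence log2(rank + 2).
    dcg = sum(h / math.log2(rank + 2) for rank, h in enumerate(hits))
    idcg = sum(1 / math.log2(rank + 2) for rank in range(min(len(actual), k)))
    ndcg = dcg / idcg if idcg else 0.0
    return {"precision": precision, "recall": recall, "f1": f1,
            "hit_rate": hit_rate, "ndcg": ndcg}


# Toy example with hypothetical items: 2 of the 5 recommendations are in the true basket.
metrics = basket_metrics(
    recommended=["milk", "bread", "eggs", "tea", "rice"],
    actual={"bread", "rice", "soap"},
    k=5,
)
print(metrics)
```

Dataset-level figures such as those in the table would then be obtained by averaging these per-user values over all test users, assuming the usual evaluation protocol.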
Model Precision Recall F1 Hit-Rate NDCG@5
TOP 0.0140 0.0392 0.0206 0.0679 0.0270
Item-CF 0.0123 0.0341 0.0181 0.0584 0.0251
NFM 0.0337 0.0822 0.0478 0.1409 0.0633
DREAM 0.0213 0.0592 0.0314 0.0988 0.0413
NAM 0.0257 0.0678 0.0372 0.1150 0.0485
ANAM 0.0568 0.1530 0.0828 0.2294 0.1226
Proposed method 0.0678 0.1859 0.0994 0.2620 0.1485
Results on JingDong Dataset
Model Precision Recall F1 Hit-Rate NDCG@5
TOP 0.0011 0.0024 0.0015 0.0048 0.0025
Item-CF 0.0012 0.0034 0.0018 0.0056 0.0022
NFM 0.0074 0.0185 0.0106 0.0307 0.0136
DREAM 0.0020 0.0044 0.0028 0.0088 0.0038
NAM 0.0019 0.0042 0.0026 0.0080 0.0037
ANAM 0.0019 0.0046 0.0027 0.0088 0.0038
Proposed method 0.0158 0.0477 0.0237 0.0704 0.0410
Results on TaoBao Dataset
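As a quick sanity check on the three results tables, the F1 value in the last row of each table can be recomputed as the harmonic mean of that row's Precision and Recall. The sketch below does this with the reported numbers. The helper name `f1_from` is an assumption, and the check only shows that the rounded values are mutually consistent. It does not establish how the paper aggregated F1 across users.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (Precision, Recall, reported F1) from the last row of each results table.
reported = {
    "Ta-Feng": (0.0624, 0.0968, 0.0759),
    "JingDong": (0.0678, 0.1859, 0.0994),
    "TaoBao": (0.0158, 0.0477, 0.0237),
}

recomputed = {name: round(f1_from(p, r), 4) for name, (p, r, _) in reported.items()}
for name, (_, _, f1) in reported.items():
    print(name, "reported:", f1, "recomputed:", recomputed[name])
```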
Model Precision Recall F1 Hit-Rate NDCG@5
-category-attention 0.0589 0.0935 0.0723 0.2629 0.0930
-category-transformer 0.0533 0.0746 0.0621 0.2317 0.0686
-multihead 0.0617 0.0961 0.0752 0.2687 0.0934
-attention 0.0601 0.0943 0.0734 0.2634 0.0787
-transformer 0.0556 0.0899 0.0687 0.2512 0.0836
Full network 0.0624 0.0968 0.0759 0.2732 0.0940
Ablated Results on Ta-Feng Dataset
Model Precision Recall F1 Hit-Rate NDCG@5
-category-attention 0.0428 0.1157 0.0625 0.1835 0.0919
-category-transformer 0.0559 0.1506 0.0816 0.2225 0.1232
-multihead 0.0665 0.1781 0.0969 0.2577 0.1480
-attention 0.0428 0.1223 0.0634 0.1820 0.0930
-transformer 0.0568 0.1539 0.0829 0.2304 0.1256
Full network 0.0678 0.1859 0.0994 0.2620 0.1485
Ablated Results on JingDong Dataset
Model Precision Recall F1 Hit-Rate NDCG@5
-category-attention 0.0080 0.0221 0.0118 0.0363 0.0178
-category-transformer 0.0046 0.0142 0.0070 0.0209 0.0101
-multihead 0.0171 0.0543 0.0260 0.0738 0.0450
-attention 0.0047 0.0143 0.0071 0.0272 0.0138
-transformer 0.0055 0.0145 0.0080 0.0252 0.0176
Full network 0.0158 0.0477 0.0237 0.0704 0.0410
Ablated Results on TaoBao Dataset
Visualization of Weights of Item-based Self-Attention
Visualization of Weights of Attribute-based Self-Attention