Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (5): 89-98    DOI: 10.11925/infotech.2096-3467.2021.1068
User Community Partition Based on Multi-layer Information Fusion in E-commerce Heterogeneous Network
Feng Yong1, Xu Wentao1, Wang Rongbing1, Xu Hongyan1, Zhang Yonggang2
1College of Information, Liaoning University, Shenyang 110036, China
2Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
Abstract  

[Objective] This paper proposes a user community partition algorithm based on multi-layer information fusion in e-commerce heterogeneous networks, aiming to improve the accuracy of user community division. [Methods] First, we split the e-commerce heterogeneous network into layers and constructed user node embeddings for each relationship type. Second, we fused each user's embeddings across the layers to obtain a single representation of the user in the heterogeneous network. Third, we optimized the parameters of the user node representations with an objective function. Finally, we clustered the users with an improved K-means algorithm to obtain a reasonable community division. [Results] The NMI and Sim@5 scores of the proposed algorithm were 6.4% and 1.7% higher, respectively, than those of existing algorithms based on DeepWalk, Node2Vec, and GCN. The model effectively characterized user nodes and accurately divided their communities. [Limitations] We did not examine temporal information or noise points in the heterogeneous network. [Conclusions] The proposed algorithm could improve the performance of friend prediction, group recommendation, and other applications.
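
The fusion and clustering steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the per-layer scores, the softmax weighting, the farthest-first centroid initialization, and the toy embeddings are all assumptions made for illustration, and plain Lloyd's K-means stands in for the improved K-means that the paper uses.

```python
import math

def softmax(scores):
    """Turn raw per-layer scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_layers(layer_embeddings, layer_scores):
    """Weighted sum of each user's embeddings across layers.

    layer_embeddings: list over layers; each item is a list of user vectors.
    layer_scores: one hypothetical importance score per layer.
    """
    weights = softmax(layer_scores)
    n_users = len(layer_embeddings[0])
    dim = len(layer_embeddings[0][0])
    fused = [[0.0] * dim for _ in range(n_users)]
    for w, layer in zip(weights, layer_embeddings):
        for u, vec in enumerate(layer):
            for d, x in enumerate(vec):
                fused[u][d] += w * x
    return fused

def sq_dist(p, q):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def farthest_first_init(points, k):
    """Deterministic init (an assumption): start at the first point,
    then repeatedly add the point farthest from the chosen centroids."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids

def kmeans(points, k, n_iter=20):
    """Plain Lloyd's K-means, not the paper's improved variant."""
    centroids = farthest_first_init(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            dists = [sq_dist(p, c) for c in centroids]
            labels[i] = dists.index(min(dists))
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy data (hypothetical): 4 users, 2 layers, 2-dimensional embeddings.
view_layer = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]
buy_layer = [[0.2, 0.0], [0.0, 0.2], [4.8, 5.0], [5.0, 4.8]]
fused = fuse_layers([view_layer, buy_layer], layer_scores=[1.0, 0.5])
communities = kmeans(fused, k=2)
```

In this toy example the first two users end up in one community and the last two in the other. In a real setting the layer scores would be learned rather than fixed by hand.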

Key words: Heterogeneous Network; E-commerce; Representation Learning; Community Division
Received: 22 September 2021      Published: 21 June 2022
ZTFLH: TP302; G202
Fund: Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University (93K172018K01); General Project of Scientific Research Funds of Liaoning Provincial Department of Education (LJKZ0085)
Corresponding Author: Wang Rongbing, ORCID: 0000-0003-4129-7093, E-mail: wrb@lnu.edu.cn

Cite this article:

Feng Yong, Xu Wentao, Wang Rongbing, Xu Hongyan, Zhang Yonggang. User Community Partition Based on Multi-layer Information Fusion in E-commerce Heterogeneous Network. Data Analysis and Knowledge Discovery, 2022, 6(5): 89-98.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2021.1068     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2022/V6/I5/89

Different Types of Networks
UCEH Algorithm Framework
Representation Fusion
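Dataset   Node type           Node count  Edge type                                Network layer  Edge count
Amazon    User                6,506       User views product                       ULP            84,853
          Electronic product  3,660       User buys product                        UBP            64,012
Netflix   User                398,556     User rates movie                         UEM            17,770
          Movie               81,633
Alibaba   User                4,129       Click                                    UKI            4,108
          Item                2,034       Add to preferences                       UPI            3,853
                                          Add to cart                              UCI            7,153
                                          Conversion between different node types  CON            2,751
Yelp      User                34,908      Check-in                                 URS            149,439
          Business            20,502      Rating                                   UGS            41,843
                                          Tag                                      UMS            38,625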
Dataset Comparison
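
Each network layer code in the table above (ULP, UBP, and so on) corresponds to one edge type. A minimal Python sketch of separating a typed edge list into such layers is given here. The user and product identifiers and the helper name `split_into_layers` are hypothetical, and the paper's own preprocessing may differ.

```python
from collections import defaultdict

def split_into_layers(edges):
    """Group (user, item, layer_code) triples into one adjacency map per layer.

    Returns {layer_code: {user: set_of_items}}.
    """
    layers = defaultdict(lambda: defaultdict(set))
    for user, item, layer_code in edges:
        layers[layer_code][user].add(item)
    # Convert nested defaultdicts to plain dicts for predictable behaviour.
    return {code: dict(adj) for code, adj in layers.items()}

# Hypothetical toy edges using the Amazon layer codes from the table:
# ULP = user views product, UBP = user buys product.
edges = [
    ("u1", "p1", "ULP"),
    ("u1", "p2", "ULP"),
    ("u2", "p2", "ULP"),
    ("u1", "p2", "UBP"),
]
layers = split_into_layers(edges)
```

Each resulting layer contains a single relationship type, so a separate user embedding can be learned for it before the layers are fused.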
Dataset       Amazon         Netflix        Alibaba        Yelp
Metric        NMI    Sim@5   NMI    Sim@5   NMI    Sim@5   NMI    Sim@5
DeepWalk      0.083  0.726   0.117  0.490   0.348  0.629   0.311  0.704
Node2Vec      0.074  0.738   0.123  0.487   0.382  0.628   0.309  0.710
MetaPath2Vec  0.086  0.747   0.129  0.492   0.387  0.635   0.317  0.715
DGI           0.007  0.558   0.182  0.578   0.551  0.786   0.641  0.889
GCN           0.287  0.624   0.176  0.565   0.465  0.724   0.671  0.867
GAT           0.301  0.630   0.183  0.550   0.468  0.726   0.668  0.873
HAN           0.029  0.495   0.164  0.561   0.472  0.779   0.658  0.872
UCEH-AP       0.341  0.745   0.189  0.603   0.558  0.776   0.685  0.876
UCEH          0.344  0.753   0.194  0.605   0.563  0.787   0.691  0.898
Results of Different Algorithms in Two Types of Experiments
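
NMI (normalized mutual information) in the table above compares a predicted community assignment with reference labels. A minimal Python sketch of the computation is given here. The arithmetic-mean normalization is an assumption, since this page does not state which NMI variant the paper uses, and the label lists are toy data.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """Mutual information between two label assignments of equal length."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a, count_b = Counter(a), Counter(b)
    mi = 0.0
    for (x, y), n_xy in joint.items():
        mi += (n_xy / n) * math.log(n_xy * n / (count_a[x] * count_b[y]))
    return mi

def nmi(a, b):
    """NMI with arithmetic-mean normalization (an assumed variant)."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0.0 and h_b == 0.0:
        return 1.0  # both assignments are a single cluster
    return mutual_information(a, b) / ((h_a + h_b) / 2)

# Toy labels: cluster names differ, but the grouping is identical.
same_grouping = nmi([0, 0, 1, 1], [1, 1, 0, 0])
# Toy labels: the two groupings are statistically independent.
unrelated = nmi([0, 0, 1, 1], [0, 1, 0, 1])
```

The score depends only on how users are grouped, not on the cluster names, so relabelled but identical partitions score 1 and independent partitions score 0.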
Dataset        Amazon                        Netflix
Network layer  ULP            UBP            UEM
Metric         NMI    Sim@5   NMI    Sim@5   NMI    Sim@5
E.             0.002  0.395   0.003  0.414   0.145  0.549
E.+R.          0.002  0.399   0.003  0.426   0.150  0.552
E.+I.          0.152  0.512   0.143  0.517   0.193  0.595
E.+I.+J.       0.169  0.544   0.153  0.525   0.194  0.592

Dataset        Alibaba
Network layer  UKI            UPI            UCI            CON
Metric         NMI    Sim@5   NMI    Sim@5   NMI    Sim@5   NMI    Sim@5
E.             0.526  0.698   0.651  0.872   0.089  0.495   0.547  0.801
E.+R.          0.525  0.728   0.659  0.874   0.079  0.490   0.564  0.804
E.+I.          0.527  0.708   0.656  0.882   0.143  0.526   0.569  0.802
E.+I.+J.       0.528  0.716   0.662  0.886   0.142  0.527   0.562  0.805

Dataset        Yelp
Network layer  URS            UGS            UMS
Metric         NMI    Sim@5   NMI    Sim@5   NMI    Sim@5
E.             0.404  0.740   0.054  0.583   0.038  0.701
E.+R.          0.421  0.741   0.051  0.568   0.020  0.661
E.+I.          0.405  0.741   0.053  0.569   0.401  0.824
E.+I.+J.       0.408  0.742   0.055  0.591   0.407  0.826
The Ablation Experiment of the Algorithm in the Two Types of Experiments
NMI Value and Attention Weight of Each Layer Network