Data Analysis and Knowledge Discovery  2019, Vol. 3 Issue (7): 85-93    DOI: 10.11925/infotech.2096-3467.2018.0999
Mining Algorithm for Weighted Association Rules Based on Frequency Effective Length
Yong Zhang, Shuqing Li, Yongshang Cheng
School of Information Engineering, Nanjing University of Finance and Economics, Nanjing 210046, China
Abstract  

[Objective] This paper analyzes differences in the importance of database items, aiming to address the redundant and worthless rules produced by traditional association rule mining algorithms. [Methods] For sequences with temporal constraints, we explored non-weighted association rules using the frequency effective length and weighting methods. We then used a sliding window technique to study rare weighted association rules on the time series. [Results] The prediction accuracy of the proposed method increased from 62% to 69%. [Limitations] The mining algorithm took a long time to extract the needed rules because of the sliding windows and the large number of rules generated. [Conclusions] Weighted time-series association rules improve recommendation accuracy and suggest new directions for research methods on association rules.

Key words: Data Mining; Association Rules; Frequency Length; Sliding Window
Received: 08 September 2018      Published: 06 September 2019
ZTFLH:  G354  
Corresponding Authors: Shuqing Li     E-mail: leeshuqing@163.com

Cite this article:

Yong Zhang, Shuqing Li, Yongshang Cheng. Mining Algorithm for Weighted Association Rules Based on Frequency Effective Length. Data Analysis and Knowledge Discovery, 2019, 3(7): 85-93.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2018.0999     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2019/V3/I7/85

User  Types    Ratings
1     A,C,T,W  3, 1, 5, 0.5
2     C,D,T    1, 0.5, 5
3     A,C,W    3, 1, 1.5
4     A,C,D,W  4, 1, 2, 1
5     C,D,W    1, 2, 1
Type  Weight
A     0.6
C     0.3
D     0.3
T     0.6
W     0.2
User  Types    Calculation            Average weight of user sequence
1     A,C,T,W  (0.6+0.3+0.6+0.2)/4    0.43
2     C,D,T    (0.3+0.3+0.6)/3        0.40
3     A,C,W    (0.6+0.3+0.2)/3        0.37
4     A,C,D,W  (0.6+0.3+0.3+0.2)/4    0.35
5     C,D,W    (0.3+0.3+0.2)/3        0.27
Sum                                   1.82
Frequent itemset       CT    ACW   CW
Weighted support (ws)  0.46  0.63  0.78
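
The weighted support values above can be reproduced from the two preceding tables. The following is a minimal sketch, assuming that the weighted support of an itemset is the sum of the average weights of the user sequences containing it, divided by the sum of all average weights. That formula is inferred from the table values rather than quoted from the paper, and the names `ITEM_WEIGHTS`, `USER_ITEMS`, `average_weight`, and `weighted_support` are hypothetical.

```python
# Item weights and user sequences copied from the tables above.
ITEM_WEIGHTS = {"A": 0.6, "C": 0.3, "D": 0.3, "T": 0.6, "W": 0.2}
USER_ITEMS = {
    1: ["A", "C", "T", "W"],
    2: ["C", "D", "T"],
    3: ["A", "C", "W"],
    4: ["A", "C", "D", "W"],
    5: ["C", "D", "W"],
}


def average_weight(items, weights=ITEM_WEIGHTS):
    """Mean item weight of one user sequence."""
    return sum(weights[item] for item in items) / len(items)


def weighted_support(itemset, user_items=USER_ITEMS, weights=ITEM_WEIGHTS):
    """Weighted support of `itemset` under the assumed formula:

    (sum of average weights of sequences that contain every item of the
    itemset) / (sum of average weights of all sequences).
    """
    target = set(itemset)
    averages = {
        user: average_weight(items, weights)
        for user, items in user_items.items()
    }
    covered = sum(
        avg for user, avg in averages.items()
        if target <= set(user_items[user])  # sequence contains the itemset
    )
    return covered / sum(averages.values())


if __name__ == "__main__":
    for itemset in ("CT", "ACW", "CW"):
        print(itemset, f"{weighted_support(itemset):.2f}")
```

The sketch works with unrounded averages, whereas the table rounds each average to two decimals before summing to 1.82. For these five users both routes agree to two decimals on the three itemsets listed.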
User  Types
1     [A,C,T,W,A,W,C,T],W
2     [D,T,T,W,A,W,C,A],W,C
3     [W,T,T,W,A,W,A,T],W,A,C
4     [A,C,D,W,T,A,C,W]
5     [C,D,W]
User  Types                      Average weight of user sequence in sliding window
1     [A,C,T,W,A,W,C,T],W        0.45
2     C,[D,T,T,W,A,W,C,A],W      0.51
3     A,C,[W,T,T,W,A,W,A,T],W    0.56
4     [A,C,D,W,T,A,C,W]          0.38
5     [C,D,W]                    0.20
Sum                              2.10
Confidence \ Support  (0.20,0.25]  (0.25,0.30]  (0.30,0.35]  (0.35,0.40]
(0.5,0.6]             0.0378       0.0319       0.0579       0.0287
(0.6,0.7]             0.0469       0.0432       0.0599       0.0433
(0.7,0.8]             0.0748       0.0802       0.0967       0.0830
(0.8,0.9]             0.1020       0.1174       0.0935       0.0667
(0.9,1.0]             0.0462       0.0272       0.0090       0.0066
Confidence \ Support  (0.20,0.25]  (0.25,0.30]  (0.30,0.35]  (0.35,0.40]
(0.5,0.6]             0.0031       0.0019       0.0012       0.0009
(0.6,0.7]             0.0091       0.0035       0.0026       0.0018
(0.7,0.8]             0.0190       0.0071       0.0039       0.0024
(0.8,0.9]             0.0543       0.0196       0.0059       0.0025
(0.9,1.0]             0.0826       0.0185       0.0029       0.0011
Subsequent node position  Hit rate  Subsequent node position  Hit rate
1                         0.3750    6                         0.3701
2                         0.3722    7                         0.3651
3                         0.3725    8                         0.3685
4                         0.3704    9                         0.3695
5                         0.3853    10                        0.3680
Confidence \ Support  (0.20,0.25]  (0.25,0.30]  (0.30,0.35]  (0.35,0.40]
(0.5,0.6]             0.0432       0.0455       0.0554       0.0308
(0.6,0.7]             0.1121       0.1079       0.0560       0.0975
(0.7,0.8]             0.2124       0.1594       0.1142       0.0994
(0.8,0.9]             0.1832       0.1730       0.1270       0.0903
(0.9,1.0]             0.0969       0.0534       0.0172       0.0156
Confidence \ Support  (0.20,0.25]  (0.25,0.30]  (0.30,0.35]  (0.35,0.40]
(0.5,0.6]             0.0047       0.0022       0.0018       0.0008
(0.6,0.7]             0.0162       0.0127       0.0049       0.0029
(0.7,0.8]             0.0310       0.0122       0.0065       0.0042
(0.8,0.9]             0.0761       0.0236       0.0097       0.0030
(0.9,1.0]             0.1285       0.0301       0.0054       0.0013
Frequency effective period  Error rate  Frequency effective period  Error rate
3                           0.3201      8                           0.2449
4                           0.2995      9                           0.2658
5                           0.2831      10                          0.2576
6                           0.2769      11                          0.2694
7                           0.2769      12                          0.2603
Interval     Accuracy  Interval     Accuracy  Interval     Accuracy
(0.10,0.15]  0.69      (0.40,0.45]  0.38      (0.70,0.75]  0.45
(0.15,0.20]  0.55      (0.45,0.50]  0.40      (0.75,0.80]  0.07
(0.20,0.25]  0.53      (0.50,0.55]  0.42      (0.80,0.85]  0.15
(0.25,0.30]  0.49      (0.55,0.60]  0.20      (0.85,0.90]  0.11
(0.30,0.35]  0.52      (0.60,0.65]  0.14      (0.90,0.95]  0.10
(0.35,0.40]  0.43      (0.65,0.70]  0.23      (0.95,1.00]  0.01
Interval     Coverage  Interval     Coverage  Interval     Coverage
(0.10,0.15]  0.40      (0.40,0.45]  0.35      (0.70,0.75]  0.31
(0.15,0.20]  0.41      (0.45,0.50]  0.33      (0.75,0.80]  0.31
(0.20,0.25]  0.36      (0.50,0.55]  0.22      (0.80,0.85]  0.27
(0.25,0.30]  0.38      (0.55,0.60]  0.34      (0.85,0.90]  0.22
(0.30,0.35]  0.37      (0.60,0.65]  0.30      (0.90,0.95]  0.25
(0.35,0.40]  0.36      (0.65,0.70]  0.31      (0.95,1.00]  0.08
Minimum weighted confidence threshold  Accuracy  Minimum weighted confidence threshold  Accuracy
0.40                                   0.69      0.70                                   0.57
0.50                                   0.67      0.80                                   0.53
0.60                                   0.65      0.90                                   0.48