Reader Preference Analysis and Book Recommendation Model with Attention Mechanism of Catalogs

doi:10.11925/infotech.2096-3467.2021.1317

Data Analysis and Knowledge Discovery, 2022, Vol. 6, Issue 9: 138-152

Research Article


Wang Dailin (王代琳)¹, Liu Lina (刘丽娜)² (corresponding author), Liu Meiling (刘美玲)², Liu Yaqiu (刘亚秋)²

¹Northeast Forestry University Library, Harbin 150040, China
²College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China


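Abstract (translated from the Chinese original)

[Objective] Existing recommendation algorithms overlook readers' attention to book catalogs (tables of contents), which lowers recommendation accuracy. To address this, the paper proposes a reader preference analysis method based on a book-catalog attention mechanism, together with a personalized recommendation model, IABiLSTM. [Methods] Semantic features of each book are extracted from its title and catalog content: a BiLSTM network captures long-distance dependencies and word-order context in the text, and a two-layer Self-Attention mechanism strengthens the deeper semantic representation of the catalog features. Readers' historical browsing behavior is analyzed, and an interest function is fitted to quantify each reader's degree of interest. The books' semantic features and the reader's interest degrees are combined to generate a reader preference vector; the similarity between each candidate book's semantic feature vector and the reader preference vector is then computed to predict a score and complete the personalized book recommendation. [Results] The model was evaluated with three metrics: MSE, Precision, and Recall. With N=50, the results were 1.1%, 89.1%, and 85.2% respectively on the Douban dataset, and 1.2%, 75.2%, and 72.8% on the Amazon dataset, outperforming the comparison models. [Limitations] The model was validated only on the Douban Reading and Amazon datasets; its generalization to other datasets remains to be verified. [Conclusions] By paying closer attention to book catalogs and analyzing readers' historical browsing interactions, the approach effectively represents readers' interest preferences and plays an important role in improving book recommendation accuracy. The proposed model is suitable not only for recommendation tasks based on book content and reader browsing behavior but also offers a useful reference for other common natural language processing tasks.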
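The last two steps of the method (building a reader preference vector and scoring candidate books by similarity) can be illustrated with a minimal sketch. It is not the authors' implementation: the semantic vectors are hypothetical toy values standing in for the BiLSTM and Self-Attention output, the interest degrees are assumed to be already produced by the interest function, and combining them as an interest-weighted average with cosine similarity is an assumption made for illustration only.

```python
import math

def weighted_preference(book_vectors, interest):
    """Interest-weighted average of the semantic vectors of browsed books.

    Assumption: the preference vector is a weighted mean; the abstract does
    not specify how the two signals are actually combined.
    """
    total = sum(interest.values())
    if total <= 0:
        raise ValueError("at least one browsed book needs a positive interest degree")
    dim = len(next(iter(book_vectors.values())))
    pref = [0.0] * dim
    for book_id, weight in interest.items():
        vec = book_vectors[book_id]
        for i in range(dim):
            pref[i] += weight * vec[i]
    return [x / total for x in pref]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recommend(pref, candidates, n):
    """Rank candidate books by similarity to the preference vector, keep the top n."""
    scored = [(book_id, cosine(pref, vec)) for book_id, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]

# Hypothetical toy data: two browsed books and three candidates.
browsed = {"b1": [1.0, 0.0, 0.0], "b2": [0.0, 1.0, 0.0]}
interest = {"b1": 0.9, "b2": 0.1}  # assumed output of the interest function
candidates = {
    "c1": [1.0, 0.1, 0.0],
    "c2": [0.0, 1.0, 0.0],
    "c3": [0.0, 0.0, 1.0],
}

pref = weighted_preference(browsed, interest)
top = recommend(pref, candidates, n=2)
print(top)
```

Here the reader's interest is concentrated on `b1`, so the candidate closest to `b1` in the vector space is ranked first.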


Keywords: Browsing Behavior, Book Catalog Attention, Reader Preference, Personalized Recommendation, BiLSTM

Abstract:

[Objective] This paper proposes a new reader preference analysis method and a personalized book recommendation model (IABiLSTM), aiming to improve on the accuracy of existing algorithms. [Methods] First, we extracted the semantic features of books from their titles and catalog contents. We used a BiLSTM network to capture long-distance dependencies and word-order context in the text, and a two-layer Self-Attention mechanism to enhance the deeper semantic expression of the book catalog features. Second, we analyzed readers’ historical browsing behaviors and quantified them with an interest function. Third, we combined the semantic features of books with readers’ interests to generate a preference vector for each reader. Fourth, we calculated the similarity between the semantic feature vectors of candidate books and the readers’ preference vectors, and predicted scores for personalized book recommendation. [Results] We examined our model on the Douban Reading and Amazon datasets with N set to 50. The MSE, Precision, and Recall reached 1.1%, 89.1%, and 85.2% on the Douban data, and 1.2%, 75.2%, and 72.8% on the Amazon data. These results were better than those of the comparison models. [Limitations] More research is needed to examine our model on other datasets. [Conclusions] The proposed model improves the accuracy of book recommendation and may also benefit other common NLP tasks.
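For readers unfamiliar with the three reported metrics, the following minimal sketch shows their standard definitions. It assumes that N is the length of the recommendation list at which Precision and Recall are measured and that MSE is computed between predicted and observed scores; the exact evaluation protocol is not stated in the abstract, and all data below are hypothetical.

```python
def mse(predicted, actual):
    """Mean squared error between predicted and observed scores."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)

def precision_recall_at_n(ranked, relevant, n):
    """Precision and Recall over the first n items of a ranked list.

    ranked   -- recommended item ids, best first
    relevant -- set of item ids the reader actually interacted with
    """
    top_n = ranked[:n]
    hits = sum(1 for item in top_n if item in relevant)
    precision = hits / len(top_n) if top_n else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical toy data, not taken from the paper.
ranked = ["c1", "c4", "c2", "c5", "c3"]
relevant = {"c1", "c2", "c3"}

precision, recall = precision_recall_at_n(ranked, relevant, n=3)
error = mse([0.9, 0.2, 0.6], [1.0, 0.0, 0.5])
print(precision, recall, error)
```

With a cutoff of 3, two of the three recommended items are relevant, and two of the three relevant items are retrieved, so both Precision and Recall are 2/3 in this toy case.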


Received: 2021-11-18    Published: 2022-10-26

CLC number: G250

Funding: National Natural Science Foundation of China (61702091)

Corresponding author: Liu Lina, ORCID: 0000-0002-1601-3290, E-mail: lln@nefu.edu.cn

Cite this article:

Wang Dailin, Liu Lina, Liu Meiling, Liu Yaqiu. Reader Preference Analysis and Book Recommendation Model with Attention Mechanism of Catalogs[J]. Data Analysis and Knowledge Discovery, 2022, 6(9): 138-152.

Link to this article:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2021.1317 or https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2022/V6/I9/138

Fig.1 The IABiLSTM reader preference model

Fig.2 Encoding structure

Fig.3 Principle of LSTM

Fig.4 Self-Attention mechanism

Table 1 Dataset information

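BookId	Title	Catalog
30293801	Python 深度学习 (Python Deep Learning)	Chapter 1 What is deep learning; 1.1 Artificial intelligence, machine learning, and deep learning; 1.1.1 Artificial intelligence; 1.1.2 Machine learning; 1.1.3 Learning representations from data; 1.1.4 The "deep" in deep learning; 1.1.5 Understanding how deep learning works in three figures; 1.1.6 Progress deep learning has achieved so far; 1.1.7 Don't believe the short-term hype; 1.1.8 The future of artificial intelligence; 1.2 Before deep learning: a brief history of machine learning; 1.2.1 Probabilistic modeling; 1.2.2 Early neural networks; 1.2.3 Kernel methods; 1.2.4 Decision trees, random forests, and gradient boosting machines; 1.2.5 Back to neural networks; 1.2.6 What makes deep learning different; 1.2.7 The current state of machine learning; 1.3 Why deep learning, why now; 1.3.1 Hardware; 1.3.2 Data; 1.3.3 Algorithms; 1.3.4 A new wave of investment; 1.3.5 The democratization of deep learning; 1.3.6 Will this trend continue
26883982	Deep Learning	Chapter 1 Introduction; 1.1 Who should read this book; 1.2 Historical trends in deep learning; 1.2.1 The many names and changing fortunes of neural networks; 1.2.2 Increasing dataset sizes; 1.2.3 Increasing model sizes; 1.2.4 Increasing accuracy, complexity, and real-world impact; Part 1 Applied math and machine learning basics; Chapter 2 Linear algebra; 2.1 Scalars, vectors, matrices, and tensors; 2.2 Multiplying matrices and vectors; 2.3 Identity and inverse matrices; 2.4 Linear dependence and span; 2.5 Norms; 2.6 Special kinds of matrices and vectors; 2.7 Eigendecomposition; 2.8 Singular value decomposition; 2.9 The Moore-Penrose pseudoinverse; 2.10 The trace operator; 2.11 The determinant; 2.12 Example: principal component analysis
35013197	深度学习推荐系统 (Deep Learning Recommender Systems)	Chapter 1 The growth engine of the Internet: recommender systems; 1.1 Why recommender systems are the growth engine of the Internet; 1.1.1 The role and significance of recommender systems; 1.1.2 Recommender systems and the growth of YouTube watch time; 1.1.3 Recommender systems and the revenue growth of e-commerce sites; 1.2 Architecture of recommender systems; 1.2.1 Logical framework of recommender systems; 1.2.2 Technical architecture of recommender systems; 1.2.3 The data part of recommender systems; 1.2.4 The model part of recommender systems; 1.2.5 The revolutionary contribution of deep learning to recommender systems; 1.2.6 Grasp the whole, fill in the details; 1.3 Overall structure of this book
30385709	C程序设计（第五版） (C Programming, 5th Edition)	Chapter 1 Programming and the C language; 1.1 What is a computer program; 1.2 What is a computer language; 1.3 The development and characteristics of the C language; 1.4 Simple C programs; 1.4.1 Examples of simple C programs; 1.4.2 Structure of a C program; 1.5 Steps and methods for running a C program; 1.6 The tasks of programming
10546125	JavaScript 高级程序设计（第3版） (Advanced JavaScript Programming, 3rd Edition)	Chapter 1 Introduction to JavaScript; 1.1 A brief history of JavaScript; 1.2 JavaScript implementations; 1.2.1 ECMAScript; 1.2.2 The Document Object Model (DOM); 1.2.3 The Browser Object Model (BOM); 1.3 JavaScript versions; 1.4 Summary; Chapter 2 Using JavaScript in HTML; 2.1 The <script> element; 2.1.1 Tag placement; 2.1.2 Deferred scripts; 2.1.3 Asynchronous scripts; 2.1.4 Usage in XHTML; 2.1.5 Deprecated syntax; 2.2 Inline code versus external files
1444656	C++程序设计教程（第二版） (C++ Programming Tutorial, 2nd Edition)	Chapter 1 Getting started with C++; 1.1 From C to C++; 1.2 Programs and languages; 1.3 Structured programming; 1.4 Object-oriented programming; 1.5 The program development process; 1.6 The simplest program; 1.7 Functions; Summary; Chapter 2 Basic data types and input/output; 2.1 Character set and reserved words; 2.2 Basic data types; 2.3 Variable definitions; 2.4 Literals; 2.5 Constants; 2.6 I/O stream control; 2.7 printf and scanf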
…	…	…

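Table 2 Douban Reading: book data

Table 3 Douban Reading: reader browsing behavior data

Fig.5 Attention heat map

Table 4 Ablation experiments on the Douban Reading dataset

Table 5 Comparison with and without the self-attention mechanism

Table 6 MSE performance evaluation on catalog data

Fig.6 MSE performance evaluation

Fig.7 Precision in the comparison experiments

Fig.8 Recall in the comparison experiments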

