Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (9): 138-152    DOI: 10.11925/infotech.2096-3467.2021.1317
Reader Preference Analysis and Book Recommendation Model with Attention Mechanism of Catalogs
Wang Dailin1, Liu Lina2, Liu Meiling2, Liu Yaqiu2
1Northeast Forestry University Library, Harbin 150040, China
2College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China
[Objective] This paper proposes a new reader preference analysis method and a personalized book recommendation model (IABiLSTM), aiming to improve the accuracy of existing algorithms. [Methods] First, we extracted the semantic features of books from their titles and catalog contents. We used a BiLSTM network to capture long-distance dependencies and word-order context in the texts, and a two-layer Self-Attention mechanism to strengthen the deeper semantic representation of book catalog features. Second, we analyzed readers’ historical browsing behaviors and quantified them with an interest function. Third, we combined the semantic features of books with readers’ interests to generate a preference vector for each reader. Fourth, we calculated the similarity between the semantic feature vectors of candidate books and the readers’ preference vectors, and predicted scores for personalized book recommendation. [Results] We evaluated the model on the Douban Reading and Amazon datasets with N set to 50. The MSE, Precision, and Recall reached 1.1%, 89.1%, and 85.2% on the Douban data, and 1.2%, 75.2%, and 72.8% on the Amazon data. These results were better than those of the comparison models. [Limitations] More research is needed to examine the model with other datasets. [Conclusions] The proposed model improves the accuracy of book recommendation and benefits common NLP tasks.
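The third and fourth steps above (building a preference vector from book features and reader interest, then ranking candidates by similarity) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the interest function or the similarity measure, so the interest-weighted average, the cosine similarity, the function names, and the toy vectors are all hypothetical.

```python
import math

def preference_vector(book_vecs, interests):
    """Interest-weighted average of the feature vectors of browsed books (assumed form)."""
    total = sum(interests)
    dim = len(book_vecs[0])
    return [sum(w * v[i] for v, w in zip(book_vecs, interests)) / total
            for i in range(dim)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def recommend(pref, candidates, n):
    """Return the ids of the n candidates most similar to the preference vector."""
    ranked = sorted(candidates, key=lambda bid: cosine(pref, candidates[bid]),
                    reverse=True)
    return ranked[:n]

# Toy data: two browsed books with hypothetical interest scores 0.9 and 0.6.
history = [[1.0, 0.0], [0.8, 0.2]]
interests = [0.9, 0.6]
pref = preference_vector(history, interests)

candidates = {"book_a": [0.9, 0.1], "book_b": [0.0, 1.0], "book_c": [0.5, 0.5]}
top = recommend(pref, candidates, n=2)
print(top)
```

In the paper the book vectors come from the BiLSTM and attention encoder and N is 50; here they are hand-written two-dimensional vectors so the ranking is easy to check by eye.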

Key words: Browsing Behavior; Book Catalog Attention; Reader Preference; Personalized Recommendation; BiLSTM
Received: 18 November 2021      Published: 26 October 2022
CLC number: G250
Fund: National Natural Science Foundation of China (61702091)
Wang Dailin, Liu Lina, Liu Meiling, Liu Yaqiu. Reader Preference Analysis and Book Recommendation Model with Attention Mechanism of Catalogs. Data Analysis and Knowledge Discovery, 2022, 6(9): 138-152.

The IABiLSTM Model
Coding Structure
The Principle of LSTM
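For reference, the standard LSTM cell update is commonly written as below. The notation is an assumption (the figure in the paper may use different symbols): x_t is the input at step t, h_t the hidden state, c_t the cell state, \sigma the logistic sigmoid, and \odot element-wise multiplication. A BiLSTM runs one such cell forward and one backward over the sequence and concatenates the two hidden states.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t) \\
h_t^{\mathrm{bi}} &= [\overrightarrow{h}_t ; \overleftarrow{h}_t]
\end{aligned}
```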
Self-Attention Mechanism
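A minimal sketch of scaled dot-product self-attention, applied twice in sequence, is given below. It is an illustration only. It omits the learned query, key, and value projections that a trained model would have, and stacking two passes is just one possible reading of "two-layer"; the paper may define the mechanism differently. The function names and the toy catalog vectors are hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(seq):
    """Scaled dot-product self-attention without learned projections.

    seq is a list of equal-length vectors. Returns (outputs, weights),
    where weights[i][j] is how much position i attends to position j.
    """
    d = len(seq[0])
    outputs, weights = [], []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(wj * v[i] for wj, v in zip(w, seq)) for i in range(d)])
    return outputs, weights

# Three toy "catalog entry" vectors standing in for BiLSTM outputs.
catalog = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
layer1, w1 = self_attention(catalog)
layer2, w2 = self_attention(layer1)  # second pass over the first pass's output
```

The weight matrices w1 and w2 are the kind of quantity an attention heat map visualizes: each row is a probability distribution over catalog positions.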
Dataset          Users  Books  Rating records  Rating scale  Rating sparsity (%)
Douban Reading   408    610    3,210           1 to 5        98.71
Amazon           514    916    4,500           1 to 5        99.04
Data Set Information
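The sparsity column is consistent with the usual definition, one minus the share of user-book pairs that have a rating. The short check below recomputes it from the user, book, and record counts in the table; the function name is hypothetical.

```python
def sparsity_percent(n_users, n_books, n_ratings):
    """Percentage of user-book pairs that have no rating."""
    return 100.0 * (1.0 - n_ratings / (n_users * n_books))

douban = sparsity_percent(408, 610, 3210)
amazon = sparsity_percent(514, 916, 4500)
print(f"{douban:.2f} {amazon:.2f}")  # prints 98.71 99.04
```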
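BookId Title Catalog
30293801 Python
Chapter 1 What is deep learning; 1.1 Artificial intelligence, machine learning, and deep learning; 1.1.1 Artificial intelligence; 1.1.2 Machine learning; 1.1.3 Learning representations from data; 1.1.4 The "deep" in deep learning; 1.1.5 Understanding how deep learning works in three figures; 1.1.6 What deep learning has achieved so far; 1.1.7 Don't believe the short-term hype; 1.1.8 The future of artificial intelligence; 1.2 Before deep learning: a brief history of machine learning; 1.2.1 Probabilistic modeling; 1.2.2 Early neural networks; 1.2.3 Kernel methods; 1.2.4 Decision trees, random forests, and gradient boosting machines; 1.2.5 Back to neural networks; 1.2.6 What makes deep learning different; 1.2.7 The current state of machine learning; 1.3 Why deep learning, why now; 1.3.1 Hardware; 1.3.2 Data; 1.3.3 Algorithms; 1.3.4 A new wave of investment; 1.3.5 The democratization of deep learning; 1.3.6 Will this trend continue
26883982 Deep
Chapter 1 Introduction; 1.1 Who should read this book; 1.2 Historical trends in deep learning; 1.2.1 The many names and changing fortunes of neural networks; 1.2.2 Increasing dataset sizes; 1.2.3 Increasing model sizes; 1.2.4 Increasing accuracy, complexity, and real-world impact; Part 1 Applied mathematics and machine learning basics; Chapter 2 Linear algebra; 2.1 Scalars, vectors, matrices, and tensors; 2.2 Multiplying matrices and vectors; 2.3 Identity and inverse matrices; 2.4 Linear dependence and span; 2.5 Norms; 2.6 Special kinds of matrices and vectors; 2.7 Eigendecomposition; 2.8 Singular value decomposition; 2.9 The Moore-Penrose pseudoinverse; 2.10 The trace operator; 2.11 The determinant; 2.12 Example: principal component analysis
35013197 Deep Learning
Chapter 1 The growth engine of the Internet: recommender systems; 1.1 Why recommender systems are the growth engine of the Internet; 1.1.1 The role and significance of recommender systems; 1.1.2 Recommender systems and YouTube watch-time growth; 1.1.3 Recommender systems and e-commerce revenue growth; 1.2 The architecture of recommender systems; 1.2.1 The logical framework of a recommender system; 1.2.2 The technical architecture of a recommender system; 1.2.3 The data part of a recommender system; 1.2.4 The model part of a recommender system; 1.2.5 The revolutionary contribution of deep learning to recommender systems; 1.2.6 Grasp the whole, fill in the details; 1.3 The overall structure of this book
30385709 C Programming
Chapter 1 Programming and the C language; 1.1 What is a computer program; 1.2 What is a computer language; 1.3 The development and characteristics of the C language; 1.4 Simple C programs; 1.4.1 Examples of simple C programs; 1.4.2 The structure of a C program; 1.5 Steps and methods for running a C program; 1.6 The tasks of program design
10546125 JavaScript
Chapter 1 Introduction to JavaScript; 1.1 A brief history of JavaScript; 1.2 JavaScript implementations; 1.2.1 ECMAScript; 1.2.2 The Document Object Model (DOM); 1.2.3 The Browser Object Model (BOM); 1.3 JavaScript versions; 1.4 Summary; Chapter 2 Using JavaScript in HTML; 2.1 The <script> element; 2.1.1 Tag placement; 2.1.2 Deferred scripts; 2.1.3 Asynchronous scripts; 2.1.4 Usage in XHTML; 2.1.5 Deprecated syntax; 2.2 Inline code versus external files
1444656 C++ Program
Chapter 1 Getting started with C++; 1.1 From C to C++; 1.2 Programs and languages; 1.3 Structured programming; 1.4 Object-oriented programming; 1.5 The program development process; 1.6 The simplest program; 1.7 Functions; Summary; Chapter 2 Basic data types and input/output; 2.1 Character set and reserved words; 2.2 Basic data types; 2.3 Variable definitions; 2.4 Literals; 2.5 Constants; 2.6 I/O stream control; 2.7 printf and scanf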
Douban Reading-Book Data
UserName BookId Review date Rating Browsing time (min) Browsing behavior Other books read
FLCL 30293801 2018/10/27 5 4.8 K/S 26883982/26852315/25658468
lisa 26883982 2017/8/4 5 5.1 K/S 26840215/26727997
飞林沙 35013197 2020/8/14 5 6.1 S/C 26883982/24703171/27087503
似水流年 30385709 2020/5/28 5 5.5 K/C/S 1139336/1767741/16254569
lovevfp 10546125 2016/7/22 5 4.6 K 26351021/25458965/26358954/25653214
云天 10546125 2019/8/21 5 4.9 S/K/C 12051836/12564875/25653214
这么近,那么远 1444656 2016/4/2 3 2.9 N 1101524/12124514/45256325/36254785
喜欢鱼的小肉汪 4831448 2017/1/19 4 4.1 K/S 4843567/2287506/1610233
悟空 4831448 2015/2/13 4 5.3 K/S 2287506/23546521/58412632
TimeMaste 26663605 2020/6/7 5 3.8 C 15233695/10426640
yangong 26663605 2018/6/28 4 3.6 N 26767354/26586554
贫道 26340543 2015/10/11 3 2 N 4746407/47584125
无名 20390374 2015/5/20 3 4.8 N 26772632/26663605
夏夜寂寞轻注销 19952400 2020/2/20 5 4.3 K/S 35391618/26681685
Martin 19952400 2017/8/30 5 6 K/S 20432061/20390374
幸运小魔头 1885170 2016/3/5 4 4.6 N 27096665
李星云 1139426 2020/7/12 5 5.1 K/C/P 25859528/19952400
昊天 35126508 2020/10/11 4 2.8 N 35218199/35652145/65124532
Douban Reading-Readers’ Browsing Behavior Data
Heat Map of Attention
Title (book name) Catalog Browsing behavior Precision@50 (%)
Ablation Experiment on Douban Reading Data Set
Douban Reading Amazon
89.1 85.2 75.2 72.8
86.0 80.7 70.8 69.2
Comparison of Self-Attention Mechanism
1.7 1.5 1.2 1.2 1.2 1.5
MSE Performance Evaluation of Catalog Data
MSE Performance Evaluation
Precision of Comparative Experiments
Recall of Comparative Experiments
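The three evaluation metrics named in the abstract (MSE, Precision, and Recall at a cutoff N) can be computed as in the sketch below. These are the common textbook definitions, assumed here because the extracted text does not give the paper's exact formulas; the function names and toy data are hypothetical.

```python
def precision_recall_at_n(recommended, relevant, n):
    """Precision@N and Recall@N for one reader.

    recommended: ranked list of book ids; relevant: set of ids the reader liked.
    """
    top_n = recommended[:n]
    hits = len(set(top_n) & set(relevant))
    precision = hits / len(top_n) if top_n else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def mse(predicted, actual):
    """Mean squared error between predicted and observed ratings."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

# Toy example with N = 3 (the paper uses N = 50).
p, r = precision_recall_at_n(["b1", "b2", "b3", "b4"], {"b2", "b4", "b9"}, n=3)
err = mse([4.5, 3.0, 5.0], [5, 3, 4])
```

Per-reader values would normally be averaged over all test readers to obtain dataset-level figures.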
