Data Analysis and Knowledge Discovery  2019, Vol. 3 Issue (12): 30-40    DOI: 10.11925/infotech.2096-3467.2019.0494
Modeling Users with Word Vector and Term-Graph Algorithm
Hui Nie
School of Information Management, Sun Yat-Sen University, Guangzhou 510006, China

[Objective] This paper proposes a review-based user modeling method to improve personalized information push services. [Methods] First, we identified product feature terms in reviews with the help of a pre-trained word embedding model. Then, we built a term graph based on the semantic correlation among the feature terms. Finally, we used the TextRank algorithm to compute each user's interest in product features and to model their product preferences. [Results] The user models generated by the proposed algorithm were consistent with manually created ones (nearly 90% semantic correlation). The method achieved an F1-score of 0.55, higher than those of the classic term-frequency-based bag-of-words models. [Limitations] More manually labeled data and further research are needed to improve domain-specific analysis. [Conclusions] The proposed model helps analyze online reviews and supports new applications for recommender systems.
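
The three steps in the Methods summary can be illustrated with a minimal sketch. It is not the author's implementation: the toy 3-dimensional vectors stand in for pre-trained word embeddings, the function names are hypothetical, and treating `epsilon` as a cosine-similarity cutoff for adding an edge is an assumption.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_term_graph(vectors, epsilon=0.5):
    """Link two terms when their similarity is at least epsilon (assumed rule)."""
    graph = {term: {} for term in vectors}
    for a, b in combinations(vectors, 2):
        sim = cosine(vectors[a], vectors[b])
        if sim >= epsilon:
            graph[a][b] = sim
            graph[b][a] = sim
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """Weighted TextRank: PageRank-style iteration on an undirected weighted graph."""
    scores = {term: 1.0 for term in graph}
    for _ in range(iterations):
        updated = {}
        for term in graph:
            rank = 0.0
            for neighbour, weight in graph[term].items():
                out_weight = sum(graph[neighbour].values())
                if out_weight > 0:
                    rank += weight / out_weight * scores[neighbour]
            updated[term] = (1 - damping) + damping * rank
        scores = updated
    return scores


# Toy vectors (hypothetical); real ones would come from a pre-trained model.
vectors = {
    "screen": [1.0, 0.1, 0.0],
    "display": [0.9, 0.2, 0.0],
    "pixel": [0.8, 0.3, 0.1],
    "price": [0.0, 0.1, 1.0],
}
graph = build_term_graph(vectors, epsilon=0.5)
scores = textrank(graph)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy input, "price" has no neighbour above the cutoff, so it keeps only the baseline score of `1 - damping`, while the three display-related terms reinforce one another.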

Keywords: User Modeling; Personal Recommendation; Review Mining
Received: 10 May 2019      Published: 25 December 2019
ZTFLH:  TP393 O212  
Corresponding Authors: Hui Nie     E-mail:

Cite this article:

Hui Nie. Modeling Users with Word Vector and Term-Graph Algorithm. Data Analysis and Knowledge Discovery, 2019, 3(12): 30-40.


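Feature-opinion extraction rule template | Coverage | Example
a(opinion)←SBV←n(feature) | 73% | 像素(n)挺高(a)的 ("the pixel count is quite high")
a(opinion)→VOB→v←SBV←n(feature) | 13.8% | 就是价钱(n)有(v)点小贵(a) ("it's just that the price is a bit high")
a(opinion)→COO→a(opinion)←SBV←n(feature) | 5.6% | 屏幕(n)精致(a)漂亮(a) ("the screen is refined and attractive")
a(opinion)←SBV←v(feature) | 4.2% | 运行(v)挺流畅(a)的 ("it runs quite smoothly")
a(opinion)←SBV←v←ATT←n(feature) | 1.9% | 电池(n)续航(v)很给力(a) ("the battery life is excellent")
Notes: SBV: subject-verb relation; VOB: verb-object relation; ATT: attributive relation; COO: coordinate relation; a: adjective; v: verb; n: noun.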
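
As an illustration of how such templates could be applied, the sketch below matches two of them (a←SBV←n and a→COO→a←SBV←n) against a dependency parse. The `Token` structure and the hand-written parse are assumptions made for this example; they do not reproduce the output format of any particular parser.

```python
from typing import NamedTuple


class Token(NamedTuple):
    idx: int   # 1-based position in the sentence
    word: str
    pos: str   # part of speech: 'n', 'v', 'a', ...
    head: int  # idx of the head token, 0 for the root
    rel: str   # dependency relation to the head


def extract_pairs(tokens):
    """Return (feature, opinion) pairs matched by two of the templates."""
    by_idx = {t.idx: t for t in tokens}
    pairs = []
    for t in tokens:
        head = by_idx.get(t.head)
        # Template a <- SBV <- n: a noun subject whose head is an adjective.
        if t.pos == "n" and t.rel == "SBV" and head is not None and head.pos == "a":
            pairs.append((t.word, head.word))
            # Template a -> COO -> a <- SBV <- n: adjectives coordinated with that head.
            for c in tokens:
                if c.pos == "a" and c.rel == "COO" and c.head == head.idx:
                    pairs.append((t.word, c.word))
    return pairs


# Hand-written parse (assumed) of 屏幕精致漂亮.
sentence = [
    Token(1, "屏幕", "n", 2, "SBV"),
    Token(2, "精致", "a", 0, "HED"),
    Token(3, "漂亮", "a", 2, "COO"),
]
pairs = extract_pairs(sentence)
```

The remaining templates follow the same pattern: walk from the feature token along the listed relations and collect the adjective at the end of the path.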
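
Out-of-vocabulary word | Semantically related feature words / similarity | Mean semantic relatedness to feature words | Merged into feature lexicon?
菜单 (menu) | 按钮 (button)/0.625, 闪屏 (screen flicker)/0.619, 截屏 (screenshot)/0.591, 图标 (icon)/0.565, 屏保 (screensaver)/0.552 | 0.591 |
人脸 (face) | 人脸识别 (face recognition)/0.607, 图像 (image)/0.563, 截屏 (screenshot)/0.535, 照片 (photo)/0.488, 成像 (imaging)/0.485 | 0.536 |
物美价廉 (good quality at a low price) | 性价比 (value for money)/0.586, 国产货 (domestic goods)/0.550, 回头率 (head-turn rate)/0.504, 价钱 (price)/0.502, 正品 (genuine product)/0.493 | 0.527 |
水货 (gray-market goods) | 行货 (licensed goods)/0.741, 国产货 (domestic goods)/0.603, 换货 (exchange of goods)/0.586, 正品 (genuine product)/0.581, 国产机 (domestic phone)/0.577 | 0.618 |
京东 (JD.com) | 商城 (mall)/0.348, 物流 (logistics)/0.247, android/0.239, 新品 (new product)/0.238, 国产 (domestic)/0.236 | 0.261 |
华为 (Huawei) | 手机 (phone)/0.393, 网络 (network)/0.330, 电信 (telecom)/0.329, 三星 (Samsung)/0.328, IOS/0.324 | 0.341 |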
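
The mean relatedness column can be recomputed from the listed neighbours, as in the sketch below. The merge cutoff of 0.5 is purely hypothetical: the source text here does not state the threshold, and the merge decisions themselves did not survive in the table. Because the similarities shown are rounded, a recomputed mean may differ from the printed one in the third decimal place.

```python
from statistics import mean

# Top-5 neighbours and similarities copied from two rows of the table above.
NEIGHBOURS = {
    "菜单": [("按钮", 0.625), ("闪屏", 0.619), ("截屏", 0.591), ("图标", 0.565), ("屏保", 0.552)],
    "京东": [("商城", 0.348), ("物流", 0.247), ("android", 0.239), ("新品", 0.238), ("国产", 0.236)],
}

MERGE_THRESHOLD = 0.5  # hypothetical cutoff, not stated in the source


def mean_relatedness(neighbours):
    """Average similarity between an unseen word and its nearest feature words."""
    return mean(sim for _, sim in neighbours)


def should_merge(neighbours, threshold=MERGE_THRESHOLD):
    """Decide whether to add the unseen word to the feature lexicon (assumed rule)."""
    return mean_relatedness(neighbours) >= threshold


decisions = {word: should_merge(nbrs) for word, nbrs in NEIGHBOURS.items()}
```

Under this assumed cutoff, a product-related word such as 菜单 would be added to the lexicon, while a platform name such as 京东 would not.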
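
User interest model | Description | Precision P (mean) | Recall R (mean) | F1 (mean)
Semantic_Model | Word2Vec-based term-graph model, $\varepsilon$ = 0.5 | 0.4564 | 0.7582 | 0.5505
Feature_Model | User interest model built from the term frequency of feature words in the review content | 0.4336 | 0.7339 | 0.5269
Term_Model | User interest model built from the term frequency of terms (nouns, verbal nouns, verbs) in the review content | 0.2278 | 0.7327 | 0.3322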
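
The reported values are means, so the F1 column need not equal the harmonic mean of the mean precision and mean recall. The sketch below shows this with two toy users; it assumes the averages are taken over per-user scores, which the table does not state explicitly, and the interest terms are invented for illustration.

```python
def prf(predicted, gold):
    """Precision, recall and F1 of predicted interest terms against gold terms."""
    predicted, gold = set(predicted), set(gold)
    hits = len(predicted & gold)
    p = hits / len(predicted) if predicted else 0.0
    r = hits / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


def mean_prf(users):
    """Average P, R and F1 over (predicted, gold) pairs, one pair per user."""
    rows = [prf(pred, gold) for pred, gold in users]
    return tuple(sum(col) / len(rows) for col in zip(*rows))


# Toy data (hypothetical): (model output, manually labeled interests) per user.
users = [
    ({"screen", "battery", "price", "camera"}, {"screen", "battery"}),
    ({"price", "logistics"}, {"price", "screen", "camera", "battery"}),
]
mean_p, mean_r, mean_f1 = mean_prf(users)
f1_of_means = 2 * mean_p * mean_r / (mean_p + mean_r)
```

Here the mean of the per-user F1 scores is lower than the F1 computed from the mean precision and recall, the same direction of gap seen in the Semantic_Model row.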