New Technology of Library and Information Service  2015, Vol. 31 Issue (7-8): 104-112    DOI: 10.11925/infotech.1003-3513.2015.07.14
Research on Subject-Oriented High Quality Reviews Mining Model
Tang Xiaobo1, Qiu Xin2
1 Center for the Studies of Information Resources, Wuhan University, Wuhan 430072, China;
2 School of Information Management, Wuhan University, Wuhan 430072, China
Abstract  

[Objective] To help consumers identify high-quality reviews within very large review sets. [Methods] Using the LDA topic model to classify review themes and drawing on an improved automatic summarization approach, this paper proposes a Subject-Oriented High Quality Reviews Mining Model. [Results] The model automatically extracts high-quality reviews under each topic. In the experiment, its precision, recall and F1 score reach 80.73%, 64.90% and 71.95% respectively, indicating that the model is effective and outperforms the models it was compared with. [Limitations] The model was compared only with some typical models; other existing methods have not been verified. [Conclusions] The model can effectively mine high-quality reviews under different themes from review sets, thereby helping customers make more effective purchase decisions.
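
The three reported figures can be cross-checked against each other, since the F1 score is conventionally the harmonic mean of precision and recall. The short sketch below assumes that standard definition (the abstract does not state which formula the paper uses); the function name `f1_score` is illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition, assumed here)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, expressed as fractions.
reported_precision = 0.8073
reported_recall = 0.6490

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1 * 100:.2f}%")
```

Under this assumption, the computed value agrees with the reported F1 of 71.95% to two decimal places.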

Received: 13 January 2015      Published: 25 August 2015
CLC Number: G203

Cite this article:

Tang Xiaobo, Qiu Xin. Research on Subject-Oriented High Quality Reviews Mining Model. New Technology of Library and Information Service, 2015, 31(7-8): 104-112.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2015.07.14     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2015/V31/I7-8/104

