New Technology of Library and Information Service  2015, Vol. 31 Issue (7-8): 104-112    DOI: 10.11925/infotech.1003-3513.2015.07.14
Research on Subject-Oriented High Quality Reviews Mining Model
Tang Xiaobo1, Qiu Xin2
1 Center for the Studies of Information Resources, Wuhan University, Wuhan 430072, China;
2 School of Information Management, Wuhan University, Wuhan 430072, China

[Objective] To help consumers distinguish high-quality reviews within very large review sets. [Methods] Using the LDA topic model to classify reviews by theme and drawing on ideas from improved automatic summarization, this paper proposes the Subject-Oriented High Quality Reviews Mining Model. [Results] The model automatically extracts high-quality reviews under each topic. Experimental results show that its precision, recall and F1 score reach 80.73%, 64.90% and 71.95% respectively, demonstrating the model's effectiveness and superiority. [Limitations] The model was compared only with some typical models; other existing methods have not been verified. [Conclusions] The model can effectively mine high-quality reviews under different themes from review sets, thus helping customers make more effective purchase decisions.
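
The reported F1 score can be checked against the reported precision and recall, since F1 is conventionally their harmonic mean. The following is a minimal sketch of that arithmetic only; the function name `f1_score` is a hypothetical helper introduced here for illustration and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1].

    Hypothetical helper for illustration; not part of the paper's model.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, converted from percentages to fractions.
precision = 0.8073
recall = 0.6490

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2%}")
```

Under the standard definition, these inputs give an F1 of about 71.95%, which agrees with the figure stated in the abstract.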

Received: 13 January 2015      Published: 25 August 2015
:  G203  

Cite this article:

Tang Xiaobo, Qiu Xin. Research on Subject-Oriented High Quality Reviews Mining Model. New Technology of Library and Information Service, 2015, 31(7-8): 104-112.


