现代图书情报技术 (New Technology of Library and Information Service), 2016, Vol. 32, Issue 4: 40-47. DOI: 10.11925/infotech.1003-3513.2016.04.05
Research Paper
Evaluating Online Reviews Based on Text Content Features*
Meng Yuan (孟园), Wang Hongwei (王洪伟)
School of Economics and Management, Tongji University, Shanghai 210000, China
Abstract

[Objective] This paper examines how review content features affect review quality detection, building on the effective extraction of multi-dimensional features. [Methods] We quantify and extract content features from two sources: information-feature measures of the review text and the mixed nature of its sentiment orientation. A GBDT model evaluates the classification performance of each feature set, and a greedy feature selection algorithm identifies the effective content features, whose influence on review quality detection we then analyze. [Results] Applying review content features to the review quality detection task yields good results and clearly improves precision and recall in the experiments. [Limitations] The experiments mainly use review data for search products and do not validate or compare the method on hedonic products such as movies and music. [Conclusions] The information gain of the review content, the information gain of product feature words, the degree of objective sentiment orientation of the review, and content difference all have a clear effect on review quality detection.

Key words: Review quality; Information feature; Sentiment orientation; Content features; Greedy feature selection
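The abstract pairs a GBDT evaluator with greedy feature selection but does not spell out the procedure. The following is a minimal sketch of one common variant, greedy forward selection, and is not the authors' implementation. The scoring function is an assumption: `toy_score` and the numbers in `TOY_GAIN` are made up for illustration and stand in for a cross-validated GBDT score. The first four feature names paraphrase the features listed in the conclusions, and `review_length` is a hypothetical uninformative feature.

```python
from typing import Callable, List, Sequence, Tuple


def greedy_forward_selection(
    candidates: Sequence[str],
    score: Callable[[List[str]], float],
    min_gain: float = 1e-9,
) -> Tuple[List[str], float]:
    """Repeatedly add the single feature that most improves `score`.

    Stops when no remaining feature improves the score by more than `min_gain`.
    """
    selected: List[str] = []
    best_score = score(selected)
    remaining = list(candidates)
    while remaining:
        # Score every one-feature extension of the current subset.
        trials = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(trials, key=lambda pair: pair[0])
        if top_score - best_score <= min_gain:
            break  # no candidate helps enough; keep the current subset
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score


# Hypothetical per-feature contributions (illustrative values only).
TOY_GAIN = {
    "content_info_gain": 0.10,
    "feature_word_info_gain": 0.06,
    "objectivity": 0.04,
    "content_difference": 0.03,
    "review_length": 0.0,  # assumed uninformative feature
}


def toy_score(features: List[str]) -> float:
    """Stand-in for a cross-validated GBDT score on the chosen features."""
    return 0.60 + sum(TOY_GAIN[f] for f in features)


selected, best = greedy_forward_selection(list(TOY_GAIN), toy_score)
print(selected, round(best, 2))
```

In a real experiment, `score` could instead train scikit-learn's `GradientBoostingClassifier` on the selected columns and return the mean of `cross_val_score`. That substitution is a suggestion and is not taken from the paper.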
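Received: 2015-12-09
Funding: *This work is one of the research outcomes of two National Natural Science Foundation of China projects: "Sentiment Analysis of Online User Reviews Based on Fuzzy Ontology in the Chinese Context" (Grant No. 70971099) and "The Impact of Online Reviews on Merchant Performance: A Sentiment Analysis Perspective" (Grant No. 71371144).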
Cite this article:
Meng Yuan, Wang Hongwei. Evaluating Online Reviews Based on Text Content Features[J]. New Technology of Library and Information Service, 2016, 32(4): 40-47. DOI: 10.11925/infotech.1003-3513.2016.04.05.
Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2016.04.05