New Technology of Library and Information Service  2015, Vol. 31 Issue (7-8): 113-121    DOI: 10.11925/infotech.1003-3513.2015.07.15
Research Paper
Review Helpfulness Prediction Research Based on Review Sentiment Feature Sets
(Original Chinese title: 面向评论效用评估的文本情感特征提取, literally "Text Sentiment Feature Extraction for Review Helpfulness Evaluation")
Nie Hui (聂卉)1, Rong Zhe (容哲)2
1 School of Information Management, Sun Yat-Sen University, Guangzhou 510006, China;
2 Business School, Sun Yat-Sen University, Guangzhou 510275, China
Abstract

[Objective] To examine how well sentiment features extracted by sentiment dictionary matching and by machine learning predict the helpfulness of reviews. [Methods] Review sentiment features are extracted with both a sentiment dictionary matching method and a machine learning classification method. A sentiment dictionary is built for the corpus, a suitable matching algorithm is designed, and the best-performing sentiment classification model is identified. A random forest algorithm is then applied to different combinations of sentiment features to predict review helpfulness. [Results] Combining the two sentiment analysis methods gives the best helpfulness prediction. The average sentiment score and the sentiment fluctuation (deviation) score of a review, both obtained by dictionary matching, identify helpful reviews effectively and outperform the features obtained by machine learning. [Limitations] Only reviews of search goods are studied; reviews of experience goods are not analyzed, so the coverage of the dataset is limited. [Conclusions] Sentiment dictionary matching combined with machine learning can effectively identify review helpfulness.
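
The two dictionary-based features named in the abstract, a review's average sentiment score and its sentiment deviation, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy English lexicon, the negation list, the sentence splitting rule, and the function names `sentence_score` and `review_sentiment_features` are all hypothetical. The paper builds its own corpus-specific dictionary and matching algorithm, and Chinese reviews would also need word segmentation first.

```python
import re
from statistics import mean, pstdev

# Hypothetical toy lexicon (word -> polarity score). The paper's
# corpus-specific sentiment dictionary is not reproduced here.
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
# Hypothetical negation words that flip the next matched sentiment word.
NEGATIONS = {"not", "never", "no"}


def sentence_score(sentence, lexicon=LEXICON, negations=NEGATIONS):
    """Sum lexicon scores in one sentence, flipping the sign of the
    first sentiment word that follows a negation word."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    score = 0.0
    negate = False
    for token in tokens:
        if token in negations:
            negate = True
            continue
        if token in lexicon:
            score += -lexicon[token] if negate else lexicon[token]
            negate = False  # a negation applies to one sentiment word only
    return score


def review_sentiment_features(review):
    """Return (mean, population standard deviation) of the per-sentence
    sentiment scores of a review; (0.0, 0.0) for an empty review."""
    sentences = [s for s in re.split(r"[.!?]+", review) if s.strip()]
    if not sentences:
        return 0.0, 0.0
    scores = [sentence_score(s) for s in sentences]
    return mean(scores), pstdev(scores)


if __name__ == "__main__":
    text = "The screen is great. The battery is not good. Shipping was bad."
    avg, deviation = review_sentiment_features(text)
    print(f"average sentiment: {avg:.3f}, sentiment deviation: {deviation:.3f}")
```

Under these assumptions, each review is reduced to two numbers that could be combined with classifier-derived sentiment labels and passed to a random forest, as the abstract describes. The standard deviation is used here as one plausible reading of "sentiment fluctuation"; the paper may define it differently.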

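Received: 2015-01-20
CLC Number: TP391
Funding:

This paper is one of the research outcomes of the 2013 project "Research on a Knowledge Recommendation Mechanism Based on Context and User Perception" (Project No. CD13CTS01) of the Guangdong Province Philosophy and Social Sciences "12th Five-Year" Plan.

Corresponding author: Rong Zhe, ORCID: 0000-0002-0995-8990, E-mail: rongzhe@mail2.sysu.edu.cn
Author contributions: Nie Hui proposed the research idea, designed the research plan, and revised the paper; Rong Zhe collected the literature, collected, cleaned, and analyzed the data, conducted the experiments, and drafted the paper.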
Cite this article:
Nie Hui, Rong Zhe. Review Helpfulness Prediction Research Based on Review Sentiment Feature Sets [J]. New Technology of Library and Information Service, 2015, 31(7-8): 113-121. DOI: 10.11925/infotech.1003-3513.2015.07.15.
Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2015.07.15

