Text Sentiment Feature Extraction for Review Helpfulness Evaluation

doi:10.11925/infotech.1003-3513.2015.07.15

New Technology of Library and Information Service

2015, Vol. 31, Issue (7-8): 113-121

Research Paper

Review Helpfulness Prediction Research Based on Review Sentiment Feature Sets

Nie Hui¹, Rong Zhe²

1 School of Information Management, Sun Yat-Sen University, Guangzhou 510006, China;
2 Business School, Sun Yat-Sen University, Guangzhou 510275, China

Abstract

[Objective] To examine how well sentiment features extracted by sentiment dictionary matching and by machine learning predict the helpfulness of reviews. [Methods] Review sentiment features are extracted with both a sentiment dictionary matching method and a machine learning classification method. A sentiment dictionary is built from the corpus, a suitable matching algorithm is designed, and the best-performing sentiment classification model is selected. A random forest algorithm then predicts review helpfulness from different combinations of sentiment features. [Results] Combining the two sentiment analysis methods gives the best prediction of review helpfulness. The average sentiment score and the sentiment fluctuation (deviation) score obtained by dictionary matching identify helpful reviews effectively and outperform the features obtained by the machine learning method. [Limitations] The study covers only reviews of search goods and does not analyze reviews of experience goods, so the coverage of the dataset is limited. [Conclusions] Combining sentiment dictionary matching with machine learning can effectively identify review helpfulness.
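The two dictionary-based features named in the abstract, the average sentiment score and the sentiment fluctuation of a review, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the tiny English lexicon, the one-token negation rule, the sentence splitting, and the choice of population standard deviation as the fluctuation measure are hypothetical and are not the authors' dictionary or matching algorithm.

```python
import re
from statistics import mean, pstdev

# Hypothetical toy lexicon (assumption); the paper builds its own
# corpus-specific sentiment dictionary, which is not reproduced here.
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
NEGATORS = {"not", "never", "no"}


def sentence_score(sentence: str) -> float:
    """Sum lexicon polarities in one sentence, flipping a word's sign
    when the immediately preceding token is a negator."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    score = 0.0
    for i, tok in enumerate(tokens):
        polarity = LEXICON.get(tok, 0.0)
        if i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score


def review_sentiment_features(review: str) -> dict:
    """Return the mean and the spread of sentence-level scores.
    Fluctuation is assumed here to be the population standard deviation."""
    sentences = [s for s in re.split(r"[.!?]+", review) if s.strip()]
    scores = [sentence_score(s) for s in sentences]
    if not scores:
        return {"sentiment_mean": 0.0, "sentiment_fluctuation": 0.0}
    return {
        "sentiment_mean": mean(scores),
        "sentiment_fluctuation": pstdev(scores),
    }


features = review_sentiment_features(
    "The screen is great. The battery is not good. Shipping was terrible."
)
```

In a fuller pipeline, feature dictionaries like this one would be combined with the output of a trained sentiment classifier and passed to a random forest model, which is the combination the abstract reports as performing best.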

Received: 2015-01-20; Published: 2015-08-25

CLC number: TP391

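Funding:

This work is one of the research outcomes of the 2013 project of the Guangdong Province Philosophy and Social Sciences "12th Five-Year" Plan, "Research on Knowledge Recommendation Mechanisms Based on Context and User Perception" (Project No. CD13CTS01).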

Corresponding author: Rong Zhe, ORCID: 0000-0002-0995-8990, E-mail: rongzhe@mail2.sysu.edu.cn

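Author contributions: Nie Hui proposed the research idea, designed the research plan, and revised the paper. Rong Zhe collected the literature, collected, cleaned, and analyzed the data, conducted the experiments, and drafted the paper.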

Cite this article:

Nie Hui, Rong Zhe. Review Helpfulness Prediction Research Based on Review Sentiment Feature Sets. New Technology of Library and Information Service, 2015, 31(7-8): 113-121.

Link to this article:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2015.07.15 or https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2015/V31/I7-8/113
