Data Analysis and Knowledge Discovery  2017, Vol. 1 Issue (12): 74-83    DOI: 10.11925/infotech.2096-3467.2017.0866
Identifying Useful Online Reviews with Semantic Feature Extraction
Zhang Yanfeng, Li He, Peng Lihui, Hou Litie
School of Management, Jilin University, Changchun 130022, China
Abstract  

[Objective] This paper proposes a model for identifying useful online Chinese reviews to help consumers make purchasing decisions. [Methods] First, we calculated six attributes that affect the usefulness of online reviews, based on their form and content characteristics. Then, we constructed a usefulness evaluation system using weighted grey relational degree analysis. Finally, we built a model that retrieves useful online reviews with k-means clustering. [Results] We evaluated the model on online reviews from Amazon.com. The precision, recall, and F values showed that the method could effectively identify useful online reviews and classify the polarized ones. [Limitations] The samples, metrics, and e-commerce platforms covered could be further expanded. [Conclusions] The proposed method could rank and classify online reviews accurately and reliably.

Key words: Weighted Grey Relational Degree; Online Reviews; Classification Model; Usefulness
Received: 28 August 2017      Published: 29 December 2017
ZTFLH:  G253 G202  

Cite this article:

Zhang Yanfeng, Li He, Peng Lihui, Hou Litie. Identifying Useful Online Reviews with Semantic Feature Extraction. Data Analysis and Knowledge Discovery, 2017, 1(12): 74-83.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2017.0866     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2017/V1/I12/74

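Attribute | Weight | Product vocabulary
Special terms | 0.10 | Product references (手机 "phone", 宝贝 "item", etc.), brand names (Apple, Samsung, etc.), proper nouns (refurbished phone, brand-name phone, etc.)
Product functions | 0.25 | Business functions, camera, data applications, music, games, GPS navigation
Product performance | 0.30 | Response speed, expandability, call quality, battery life, signal transmission and reception
Appearance | 0.15 | Materials, keyboard type, input method, body size, design, color
Technical specifications | 0.20 | CPU configuration, network standard, touchscreen type, resolution, size, memory, storage expansion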
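As a rough illustration of how the category weights above could feed a per-review score, the sketch below counts occurrences of a few product terms and multiplies each count by its category weight. The mini-lexicon, the function name `product_attribute_score`, and the count-times-weight rule are assumptions made for illustration; the table gives the weights but not the exact scoring formula.

```python
# Category weights taken from the table above.
CATEGORY_WEIGHTS = {
    "special_terms": 0.10,
    "functions": 0.25,
    "performance": 0.30,
    "appearance": 0.15,
    "specifications": 0.20,
}

# Hypothetical mini-lexicon: a few terms per category, not the full vocabulary.
LEXICON = {
    "special_terms": ["手机", "苹果", "三星"],      # phone, Apple, Samsung
    "functions": ["拍照", "音乐", "游戏"],          # camera, music, games
    "performance": ["电池", "信号", "反应速度"],    # battery, signal, response speed
    "appearance": ["颜色", "设计", "键盘"],         # color, design, keyboard
    "specifications": ["CPU", "分辨率", "内存"],    # CPU, resolution, memory
}


def product_attribute_score(review: str) -> float:
    """Sum of (category weight x number of lexicon hits) over all categories."""
    score = 0.0
    for category, terms in LEXICON.items():
        hits = sum(review.count(term) for term in terms)
        score += CATEGORY_WEIGHTS[category] * hits
    return score


# One hit each for special terms, functions, performance, and appearance.
review = "手机拍照清晰,电池耐用,颜色也好看"
print(round(product_attribute_score(review), 2))  # 0.8
```

Plain substring counting is the simplest possible matcher; a real pipeline would more likely segment the Chinese text into words first.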
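Attribute | Weight | Examples
Sentiment words | ω1 | 完美 (perfect), 爱 (love), 喜欢 (like), 还行 (okay), 无感 (indifferent), 还好 (fine), 不好 (bad), 信赖 (trust), 多 (many), 满意 (satisfied), 推荐 (recommend), 愉快 (pleasant), 一般 (average), ... (1,800 words)
Special keywords | ω2 | 啊, 呢, 吗, 嘛, 吖, ... (16 items)
Special symbols | ω2 | :), :-), ^O^, O(∩_∩)O, \(^o^)/, ... (84 items)
Special sentence patterns | ω2 | Rhetorical questions and interrogative sentences (count of "?" marks), exclamatory sentences (count of "!" marks)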
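Attribute | Degree | Modifiers
Review modifiers | Extreme (E) | 极、过、至、顶、最、最为、无比、极度、极其、极为、极端、万分、过分、分外、过于、至为
Review modifiers | High (H) | 太、挺、满、越、更、忒、特、多、够、蛮、殊、愈、甚、更、尤其、大为、越发、多么、深为、备加、十分、相当、特别、非常、格外、越加
Review modifiers | Medium (M) | 较、很、比较、较为、不太、不大、不很
Review modifiers | Low (L) | 稍、略、多少、有点、略为、稍许、稍微、些许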
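Importance | Description | Scale f(x, y)
x and y are equally important | x and y contribute equally to the overall goal | 1
x is slightly more important than y | The contribution of x is slightly greater than that of y | 3
x is clearly more important than y | The contribution of x is clearly greater than that of y | 5
x is much more important than y | The contribution of x is very clearly greater than that of y | 7
x is extremely more important than y | The contribution of x overwhelmingly exceeds that of y | 9
Importance of x over y falls between two levels | Compromise between two adjacent judgments | 2, 4, 6, 8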
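To show how judgments on this 1-9 scale turn into attribute weights, here is a minimal sketch using the standard row geometric mean method of the analytic hierarchy process, together with Saaty's consistency ratio. This is the textbook procedure, not necessarily the variant used in the study. The 3x3 judgment matrix is hypothetical, and the function names `ahp_weights` and `consistency_ratio` are illustrative.

```python
import numpy as np


def ahp_weights(pairwise):
    """Weights from a reciprocal pairwise matrix via normalised row geometric means."""
    A = np.asarray(pairwise, dtype=float)
    gm = np.prod(A, axis=1) ** (1.0 / A.shape[0])
    return gm / gm.sum()


def consistency_ratio(pairwise):
    """Saaty's consistency ratio CR = CI / RI, for matrices of order 3 to 9."""
    A = np.asarray(pairwise, dtype=float)
    n = A.shape[0]
    lam_max = np.max(np.linalg.eigvals(A).real)  # principal eigenvalue
    ci = (lam_max - n) / (n - 1)
    # Commonly tabulated random index values for n = 3..9.
    ri = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}[n]
    return ci / ri


# Hypothetical judgments for three attributes:
# x1 is slightly more important than x2 (3) and clearly more important than x3 (5);
# x2 falls between "equal" and "slightly more important" relative to x3 (2).
A = [[1,   3,   5],
     [1/3, 1,   2],
     [1/5, 1/2, 1]]

w = ahp_weights(A)
print(w.round(3), round(consistency_ratio(A), 3))
```

A consistency ratio below 0.1 is the usual rule of thumb for accepting a judgment matrix; above that, the pairwise judgments are normally revised.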
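Sample | Helpful votes | Number of images | Timeliness | Product attribute words | Sentiment intensity | Modifiers
1 | 0.72 | 0 | 15 | 4.56 | 6.50 | 3.90
2 | 1.14 | 2 | 19 | 6.62 | 15.30 | 4.50
3 | 0.96 | 2 | 21 | 6.86 | 14.30 | 5.20
... | ... | ... | ... | ... | ... | ...
48 | 1.64 | 6 | 25 | 7.35 | 15.90 | 6.40
49 | 0.66 | 0 | 11 | 4.87 | 7.90 | 4.20
50 | 1.18 | 4 | 26 | 6.25 | 12.70 | 5.40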
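Sample | Helpful votes | Number of images | Timeliness | Product attribute words | Sentiment intensity | Modifiers | Relational degree
1 | 0.667 | 0.512 | 0.333 | 0.388 | 1.000 | 0.887 | 0.691
2 | 0.610 | 0.333 | 1.000 | 0.667 | 0.890 | 0.667 | 0.725
3 | 1.000 | 0.750 | 0.889 | 0.780 | 0.500 | 0.667 | 0.723
... | ... | ... | ... | ... | ... | ... | ...
48 | 0.667 | 0.889 | 0.623 | 0.650 | 1.000 | 0.650 | 0.755
49 | 0.649 | 1.000 | 0.645 | 0.667 | 0.333 | 1.000 | 0.671
50 | 0.466 | 0.510 | 1.000 | 0.667 | 0.733 | 0.889 | 0.725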
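The relational coefficients and weighted degree in a table like this can be computed with Deng's grey relational formula, sketched below under stated assumptions: all six indicators are treated as larger-is-better, each column is min-max normalised, the reference sequence is the best value per indicator, and the weights are equal. None of these choices is confirmed by the source. The smallest and largest coefficients in the table, 0.333 and 1.000, are what this formula yields with a distinguishing coefficient of 0.5 when the smallest deviation is 0, which suggests, but does not confirm, that setting. Because only six raw rows are available here, the output will not reproduce the published coefficients.

```python
import numpy as np


def grey_relational(X, weights, rho=0.5):
    """Return (coefficient matrix, weighted relational degree per sample)."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)    # guard against constant columns
    Z = (X - lo) / span                       # min-max normalisation per indicator
    ref = Z.max(axis=0)                       # assumed ideal reference sequence
    delta = np.abs(ref - Z)                   # deviation from the reference
    dmin, dmax = delta.min(), delta.max()
    if dmax == 0:                             # every sample equals the reference
        xi = np.ones_like(delta)
    else:
        xi = (dmin + rho * dmax) / (delta + rho * dmax)
    return xi, xi @ np.asarray(weights, dtype=float)


# Raw indicator rows for samples 1, 2, 3, 48, 49, 50 from the raw-value table.
raw = [
    [0.72, 0, 15, 4.56, 6.50, 3.90],
    [1.14, 2, 19, 6.62, 15.30, 4.50],
    [0.96, 2, 21, 6.86, 14.30, 5.20],
    [1.64, 6, 25, 7.35, 15.90, 6.40],
    [0.66, 0, 11, 4.87, 7.90, 4.20],
    [1.18, 4, 26, 6.25, 12.70, 5.40],
]
weights = np.full(6, 1 / 6)                   # hypothetical equal weights

xi, degree = grey_relational(raw, weights)
print(degree.round(3))
```

With these six rows, sample 48 (the fourth row) attains the highest degree, since it holds the column maximum on five of the six indicators.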
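Cluster | Grade | Centroid | Sample IDs (ordered by usefulness)
Cluster 1 (15 samples) | 1 | 0.74783 | 48, 32, 24, 37, 18, 12, 42, 28, 44, 7, 4, 15, 10, 22, 34
Cluster 2 (14 samples) | 2 | 0.72618 | 25, 47, 39, 20, 30, 43, 14, 2, 50, 23, 19, 3, 35, 9
Cluster 3 (11 samples) | 3 | 0.69391 | 13, 45, 16, 6, 21, 31, 1, 40, 27, 29, 38
Cluster 4 (10 samples) | 4 | 0.66844 | 11, 26, 49, 8, 17, 33, 36, 41, 5, 46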
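The grading step can be sketched as one-dimensional k-means over the relational degrees, followed by ranking the clusters by centroid. The code below is a minimal NumPy implementation with a deterministic initialisation, not the authors' tooling. The function names `kmeans_1d` and `grades` are illustrative, and the input is limited to the six relational degrees listed for samples 1, 2, 3, 48, 49, and 50.

```python
import numpy as np


def kmeans_1d(values, k, iters=100):
    """Lloyd's algorithm on scalars; needs at least k distinct values."""
    x = np.asarray(values, dtype=float)
    uniq = np.unique(x)                        # sorted distinct values
    # Deterministic start: k distinct values spread evenly over the sorted range.
    idx = np.round(np.linspace(0, len(uniq) - 1, k)).astype(int)
    centroids = uniq[idx]
    for _ in range(iters):
        labels = np.abs(x[:, None] - centroids[None, :]).argmin(axis=1)
        new = np.array([x[labels == j].mean() if np.any(labels == j) else centroids[j]
                        for j in range(k)])    # keep old centroid if a cluster empties
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids


def grades(labels, centroids):
    """Grade 1 goes to the cluster with the highest centroid."""
    order = np.argsort(-centroids)
    grade_of_cluster = np.empty_like(order)
    grade_of_cluster[order] = np.arange(1, len(centroids) + 1)
    return grade_of_cluster[labels]


sample_ids = [1, 2, 3, 48, 49, 50]
degree = [0.691, 0.725, 0.723, 0.755, 0.671, 0.725]

labels, centroids = kmeans_1d(degree, k=4)
for sid, g in zip(sample_ids, grades(labels, centroids)):
    print(sid, int(g))
```

On these six values the sketch assigns sample 48 to grade 1, samples 2, 3, and 50 to grade 2, sample 1 to grade 3, and sample 49 to grade 4, which agrees with the cluster table above. The centroids differ from the published ones because only six of the fifty samples are used.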
Cluster | Precision | Recall | F value
Cluster 1 | 93.33% (14/15) | 87.50% (14/16) | 90.31%
Cluster 2 | 78.57% (11/14) | 84.62% (11/13) | 81.48%
Cluster 3 | 72.73% (9/12) | 81.82% (9/11) | 77.01%
Cluster 4 | 100.00% (9/9) | 90.00% (9/10) | 94.74%
Average | 86.16% | 85.99% | 86.07%
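For reference, the three metrics can be computed from counts as in the short sketch below. It reads each fraction in the table as correct assignments over samples placed in the cluster (precision) and correct assignments over samples that truly belong to it (recall); that reading is an assumption, and the function names are illustrative. The example uses the Cluster 2 row and the Average row.

```python
def f_value(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def precision_recall_f(correct, assigned, actual):
    """Precision, recall, and F value from raw counts."""
    precision = correct / assigned
    recall = correct / actual
    return precision, recall, f_value(precision, recall)


# Cluster 2: 11 correct, 14 placed in the cluster, 13 truly belonging to it.
p, r, f = precision_recall_f(11, 14, 13)
print(f"{p:.2%} {r:.2%} {f:.2%}")          # 78.57% 84.62% 81.48%

# The Average row's F value matches the harmonic mean of the averaged P and R.
print(f"{f_value(0.8616, 0.8599):.2%}")    # 86.07%
```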