Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (1): 41-50    DOI: 10.11925/infotech.2096-3467.2017.0717
Original Article
Building Product Feature Dictionary with Large-scale Review Data
Weiqing Li1,2, Weijun Wang2
1(School of Information Management, Central China Normal University, Wuhan 430079, China)
2(Key Laboratory of Adolescent Cyberpsychology and Behavior, Ministry of Education, Central China Normal University, Wuhan 430079, China)
Abstract  

[Objective] This paper proposes a method for building a product feature dictionary from large-scale review data, aiming to improve the dictionary's precision and recall. [Methods] First, we constructed a seed dictionary by manually labeling product features and expanding them with the synonym forest (Tongyici Cilin). Then, we trained word vectors on large-scale product reviews to calculate the semantic similarity and relevance of words. Finally, we identified and categorized product features to construct the dictionary. [Results] We examined the proposed method on product reviews of mobile phones, cameras, and books; it achieved an average precision of 0.774 and an average recall of 0.855. [Limitations] The proposed method requires considerable manual effort at the labeling and verification stages, and it does not consider implicit features in product reviews. [Conclusions] The proposed method can effectively build a feature dictionary with better recall.
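
The categorization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy three-dimensional vectors, the category names, the `categorize` and `build_dictionary` helpers, and the 0.8 similarity threshold are all hypothetical, and plain cosine similarity stands in for whatever similarity and relevance measures the paper actually uses. The sketch assigns each candidate word to the seed category whose seed word it most resembles, and discards candidates that fall below the threshold.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical toy vectors standing in for word vectors trained on reviews.
vectors = {
    "screen": [0.9, 0.1, 0.0],
    "display": [0.85, 0.15, 0.05],
    "battery": [0.1, 0.9, 0.0],
    "endurance": [0.2, 0.8, 0.1],
    "weather": [0.0, 0.1, 0.9],
}

# Hypothetical seed dictionary: feature category -> manually labeled seed words.
seed_dictionary = {"screen": ["screen"], "battery": ["battery"]}


def categorize(word, seeds, vectors, threshold=0.8):
    """Return the seed category most similar to `word`, or None if no
    seed word reaches the (assumed) similarity threshold."""
    best_category, best_similarity = None, threshold
    for category, seed_words in seeds.items():
        for seed in seed_words:
            similarity = cosine(vectors[word], vectors[seed])
            if similarity >= best_similarity:
                best_category, best_similarity = category, similarity
    return best_category


def build_dictionary(candidates, seeds, vectors, threshold=0.8):
    """Extend a copy of the seed dictionary with the candidates that
    can be assigned to a category."""
    result = {category: list(words) for category, words in seeds.items()}
    for word in candidates:
        category = categorize(word, seeds, vectors, threshold)
        if category is not None and word not in result[category]:
            result[category].append(word)
    return result


feature_dictionary = build_dictionary(
    ["display", "endurance", "weather"], seed_dictionary, vectors
)
print(feature_dictionary)
```

With these toy vectors, "display" and "endurance" join the "screen" and "battery" categories respectively, while "weather" is rejected as a non-feature. In practice the vectors would come from a model trained on the review corpus, and the threshold would need tuning against manually verified data.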

Keywords: Product Review; Feature Dictionary; Feature Extraction; Opinion Mining
Received: 21 July 2017      Published: 05 February 2018

Cite this article:

Weiqing Li,Weijun Wang. Building Product Feature Dictionary with Large-scale Review Data. Data Analysis and Knowledge Discovery, 2018, 2(1): 41-50.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2017.0717     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2018/V2/I1/41
