Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (4): 90-98    DOI: 10.11925/infotech.2096-3467.2017.1252
Original Article
Extracting Product Features with NodeRank Algorithm
Zhou Lixin, Lin Jie
School of Economics and Management, Tongji University, Shanghai 200092, China
Abstract  

[Objective] This paper presents a novel algorithm based on natural language processing (NLP) techniques and complex network theory, aiming to extract product features more effectively. [Methods] First, we constructed a weighted bipartite graph of product features and sentiment words, which describes their relationships clearly and intuitively from a network perspective. Then, we proposed the NodeRank algorithm to rank the importance of product features, which improved the precision of feature extraction. [Results] We evaluated the proposed algorithm on data from jd.com, a popular online shopping site in China. The precision, recall, and F-score of the NodeRank algorithm were higher than those of the HAC, TF-IDF, and TextRank methods. [Limitations] The computational complexity of the new algorithm still needs to be optimized. [Conclusions] The NodeRank algorithm can effectively extract product features, which supports marketing and other business activities.
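The abstract does not define NodeRank itself, so the following is only a minimal sketch of the general idea: build a weighted bipartite graph from (feature, sentiment word) pairs, then rank the feature nodes. The review pairs are hypothetical toy data, the edge weight is assumed to be a simple co-occurrence count, and the ranking step is a generic HITS-style mutual-reinforcement iteration used as a stand-in, not the authors' NodeRank formula.

```python
from collections import defaultdict


def build_bipartite(pairs):
    """Build {feature: {sentiment_word: weight}}.

    Assumption: the edge weight is the number of times a feature and a
    sentiment word were extracted together; the paper may weight differently.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for feature, sentiment in pairs:
        graph[feature][sentiment] += 1
    return {f: dict(nbrs) for f, nbrs in graph.items()}


def rank_features(graph, iterations=50):
    """Rank feature nodes by HITS-style mutual reinforcement.

    This is a stand-in for illustration only, NOT the NodeRank algorithm.
    Scores are normalised to sum to 1 after every iteration.
    """
    if not graph:
        return []
    feature_score = {f: 1.0 / len(graph) for f in graph}
    sentiments = {s for nbrs in graph.values() for s in nbrs}
    for _ in range(iterations):
        # Push feature scores to sentiment words along weighted edges.
        sentiment_score = {s: 0.0 for s in sentiments}
        for f, nbrs in graph.items():
            for s, w in nbrs.items():
                sentiment_score[s] += w * feature_score[f]
        # Pull sentiment scores back to features and normalise.
        raw = {
            f: sum(w * sentiment_score[s] for s, w in nbrs.items())
            for f, nbrs in graph.items()
        }
        total = sum(raw.values())
        feature_score = {f: v / total for f, v in raw.items()}
    return sorted(feature_score.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical (feature, sentiment word) pairs, as if extracted from reviews.
pairs = [
    ("screen", "clear"), ("screen", "clear"), ("screen", "large"),
    ("appearance", "large"), ("appearance", "beautiful"),
    ("battery", "durable"),
]
graph = build_bipartite(pairs)
ranking = rank_features(graph)
for feature, score in ranking:
    print(f"{feature}: {score:.4f}")
```

A dictionary of dictionaries keeps the sketch dependency-free. In this toy graph, a feature connected to no shared sentiment words (here "battery") is dominated by the larger connected component, which is one reason a real ranking scheme needs a more careful design than this stand-in.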

Keywords: Feature Extraction; Bipartite Graph; NodeRank Algorithm; Importance Ranking
Received: 11 December 2017      Published: 11 May 2018
CLC Number (ZTFLH): TP393

Cite this article:

Zhou Lixin, Lin Jie. Extracting Product Features with NodeRank Algorithm. Data Analysis and Knowledge Discovery, 2018, 2(4): 90-98.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2017.1252     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2018/V2/I4/90

Product name                                      Category       Reviews   Reviews after cleaning
Huawei G9 Plus (Platinum Gold) 4G mobile phone    Mobile phone   1,888     1,366
Rank   Feature word        NR        Word frequency   RFF
1      hand feel (手感)    0.02622   65               0.08541
2      appearance (外观)   0.02141   61               0.08016
3      screen (屏幕)       0.01678   39               0.05125
4      battery (电池)      0.01614   23               0.03022
5      price (价格)        0.01518   13               0.01708
6      speed (速度)        0.01446   59               0.07753
7      quality (质量)      0.01238   23               0.03022
8      feeling (感觉)      0.01191   7                0.02365
9      body (机身)         0.01098   4                0.0092
10     interface (界面)    0.01081   6                0.00526
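This section does not define RFF. One plausible reading, offered here only as an assumption, is that RFF is a relative frequency: the word frequency divided by a total count N. N is not stated either; the value 761 used below is inferred from rows 1 to 7 and is hypothetical. The sketch checks each printed row against that reading.

```python
# Rows copied from the ranking table: (rank, word_frequency, rff_as_printed).
ROWS = [
    (1, 65, 0.08541),
    (2, 61, 0.08016),
    (3, 39, 0.05125),
    (4, 23, 0.03022),
    (5, 13, 0.01708),
    (6, 59, 0.07753),
    (7, 23, 0.03022),
    (8, 7, 0.02365),
    (9, 4, 0.0092),
    (10, 6, 0.00526),
]

# Assumption: total count inferred from rows 1-7; not given in the source.
ASSUMED_TOTAL = 761


def check_rff(rows, total, tol=1e-5):
    """Split ranks by whether frequency / total matches the printed RFF."""
    consistent, inconsistent = [], []
    for rank, freq, rff in rows:
        if abs(freq / total - rff) < tol:
            consistent.append(rank)
        else:
            inconsistent.append(rank)
    return consistent, inconsistent


consistent, inconsistent = check_rff(ROWS, ASSUMED_TOTAL)
print("rows matching freq / N:", consistent)
print("rows not matching:", inconsistent)
```

Under this assumed definition, rows 1 to 7 match to five decimal places, while rows 8 to 10 do not match with the values as printed, so those three rows are worth checking against the original article.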