Extracting Product Features with Weight-based Apriori Algorithm

Li Changbing, Pang Chongpeng, Li Meiping
School of Economics and Management, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

Abstract [Objective] This paper aims to reduce noise when extracting product features from customer comments. [Methods] First, we used TF-IDF and variance selection to extract the needed data. Second, we set thresholds to filter the extracted words and obtain the product feature set. Third, we generated frequent itemsets with the Apriori algorithm. Finally, we defined various thresholds to obtain the optimal sets, so that product features are extracted automatically from user comments. [Results] We examined the effectiveness of the proposed method on comment texts about mobile phone products. Comparing the automatically extracted features with the manually identified ones, we found a precision (P) of 72.44%, a recall (R) of 77.59%, and a comprehensive F value of 74.93%. [Limitations] The precision needs to be improved, and the manually identified terms may contain human errors. [Conclusions] The Apriori algorithm can help extract product features effectively.
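
The frequent-itemset step and the reported F value can be illustrated with a minimal sketch. It shows plain (unweighted) Apriori on a toy set of comments, not the weight-based variant or the thresholds used in the paper, which the abstract does not specify. The names `apriori`, `f_measure`, `comments`, and the 0.5 support threshold are assumptions made for illustration, and the F value is assumed to be the harmonic mean of P and R.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for itemsets meeting min_support.

    transactions: list of sets of words (one set per comment).
    min_support: minimum fraction of transactions containing the itemset.
    """
    n = len(transactions)
    # Level-1 candidates: every single word seen in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Keep the candidates whose support reaches the threshold.
        level = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-itemsets, then prune any
        # candidate that has an infrequent k-subset.
        k += 1
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in level for s in combinations(union, k - 1)
            ):
                candidates.add(union)
    return frequent


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Toy comments (hypothetical data): each set holds candidate feature words.
comments = [
    {"screen", "battery", "price"},
    {"screen", "battery"},
    {"screen", "camera"},
    {"battery", "price"},
]
frequent_sets = apriori(comments, min_support=0.5)

# Consistency check of the reported figures, in percent.
f_value = round(f_measure(72.44, 77.59), 2)
```

On the toy data, "camera" falls below the support threshold, and the three-word set is pruned because one of its pairs is infrequent. With P = 72.44 and R = 77.59, the harmonic mean rounds to 74.93, which agrees with the reported F value.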

Received: 24 April 2017
Published: 18 October 2017