[Objective] This paper aims to reduce noise when extracting product features from customer comments. [Methods] First, we used TF-IDF and variance-based selection to extract candidate words from the comments. Second, we applied thresholds to filter the extracted words and obtain a candidate product feature set. Third, we generated frequent itemsets with the Apriori algorithm. Finally, we tuned the thresholds to obtain the optimal itemsets, so that product features were extracted automatically from user comments. [Results] We evaluated the proposed method on comment texts about mobile phone products. Compared with manually identified features, the automatically extracted features achieved a precision (P) of 72.44%, a recall (R) of 77.59%, and an F-measure of 74.93%. [Limitations] The precision needs to be improved, and the manually identified features may contain human errors. [Conclusions] The Apriori algorithm can help extract product features effectively.
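The frequent-itemset and evaluation steps summarized above can be illustrated with a minimal sketch. It assumes each comment has already been tokenized and filtered down to candidate feature words; the function names, the toy comments, and the support threshold of 0.5 are hypothetical and are not taken from the paper, which does not state its actual threshold values or implementation.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Apriori-style search: return {itemset: support} for frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    # Level 1: count how many comments contain each single word.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c / n for s, c in counts.items() if c / n >= min_support}
    result = dict(current)

    size = 2
    while current and size <= max_size:
        items = sorted({i for s in current for i in s})
        # Apriori pruning: keep a candidate only if all of its
        # (size - 1)-subsets were frequent at the previous level.
        candidates = [
            frozenset(c)
            for c in combinations(items, size)
            if all(frozenset(sub) in current for sub in combinations(c, size - 1))
        ]
        current = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t) / n
            if support >= min_support:
                current[cand] = support
        result.update(current)
        size += 1
    return result


def f_measure(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0


def precision_recall_f(extracted, reference):
    """Compare automatically extracted features with a manual reference set."""
    extracted, reference = set(extracted), set(reference)
    hits = len(extracted & reference)
    p = hits / len(extracted) if extracted else 0.0
    r = hits / len(reference) if reference else 0.0
    return p, r, f_measure(p, r)


# Toy comments (hypothetical), each reduced to its candidate feature words.
comments = [
    ["screen", "battery"],
    ["screen", "price"],
    ["battery", "screen"],
    ["camera"],
]
itemsets = frequent_itemsets(comments, min_support=0.5)
```

With the harmonic-mean definition used in `f_measure`, the reported precision of 72.44% and recall of 77.59% give an F-measure of about 74.93%, which matches the value stated in the results.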