Data Analysis and Knowledge Discovery, 2018, Vol. 2, Issue 3: 1-8. DOI: 10.11925/infotech.2096-3467.2017.0849
Identifying Potential Customers Based on User-Generated Contents
Cuiqing Jiang, Kailun Song, Yong Ding, Yao Liu
School of Management, Hefei University of Technology, Hefei 230009, China

[Objective] This paper aims to identify potential customers by analyzing user-generated content from product-specific online forums. [Methods] First, we converted the imbalanced dataset into multiple balanced subsets. Then, we used the Stacking classification algorithm to build the identification model. Finally, we compared the results of the proposed method with those of five baseline algorithms. [Results] Compared with BayesNet, Logistic, C4.5, SMO, and Naive Bayes, the proposed method improved the F-measure by 17.4%, 26.5%, 24.1%, 29.3%, and 40.9%, respectively. Compared with the Stacking, Bagging, and Boosting methods, it improved the F-measure by 10.1%, 5.9%, and 13.1%, respectively. [Limitations] We evaluated the proposed method only on data from the automotive industry. [Conclusions] The proposed method can effectively identify potential customers from user-generated content.
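
The first step of the method and the evaluation metric can be illustrated with a minimal sketch. The abstract does not say how the balanced subsets are formed, so the partitioning rule below (split the majority class into chunks roughly the size of the minority class and pair each chunk with all minority samples) is an assumption, and the names `balanced_subsets` and `f_measure` are hypothetical. The Stacking step itself is not shown.

```python
import random


def balanced_subsets(samples, labels, minority_label, seed=0):
    """Build balanced subsets from an imbalanced dataset.

    Assumed rule (not taken from the paper): shuffle the majority class,
    split it into chunks of about the minority-class size, and combine
    each chunk with every minority sample.
    """
    pairs = list(zip(samples, labels))
    minority = [p for p in pairs if p[1] == minority_label]
    majority = [p for p in pairs if p[1] != minority_label]
    if not minority or not majority:
        raise ValueError("both classes must be present")

    rng = random.Random(seed)
    rng.shuffle(majority)

    # Number of subsets is the majority/minority size ratio, at least 1.
    n_subsets = max(1, len(majority) // len(minority))
    subsets = []
    for i in range(n_subsets):
        chunk = majority[i::n_subsets]  # every n-th majority sample
        subset = minority + chunk
        rng.shuffle(subset)
        subsets.append(subset)
    return subsets


def f_measure(y_true, y_pred, positive=1):
    """F-measure (harmonic mean of precision and recall) for one class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy data: 10 "potential customer" posts (label 1) among 100 posts.
    samples = list(range(100))
    labels = [1] * 10 + [0] * 90
    subsets = balanced_subsets(samples, labels, minority_label=1)
    print(len(subsets), [len(s) for s in subsets])
```

In a full pipeline, one classifier would typically be trained per subset and the predictions combined; how the paper combines them with Stacking is not described in the abstract.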

Key words: User-Generated Content; Potential Customer Identification; Stacking Classification Algorithm; Imbalanced Datasets
Received: 22 August 2017      Published: 03 April 2018

Cite this article:

Cuiqing Jiang, Kailun Song, Yong Ding, Yao Liu. Identifying Potential Customers Based on User-Generated Contents. Data Analysis and Knowledge Discovery, 2018, 2(3): 1-8.

