Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (11): 102-111    DOI: 10.11925/infotech.2096-3467.2020.0059
Predicting Usefulness of Crowd Testing Reports with Deep Learning
Cai Jingxuan1, Wu Jiang1,2, Wang Chengkun1
1School of Information Management, Wuhan University, Wuhan 430072, China
2Center for E-commerce Research and Development, Wuhan University, Wuhan 430072, China
Abstract  

[Objective] This paper predicts the usefulness of crowd testing reports from author attributes, text features, and image features. [Methods] First, we used deep learning techniques to extract text and image features from crowd testing reports. Second, we constructed a prediction model based on a fully connected neural network. Third, we trained the model on 80% of the samples with different input combinations. Finally, we evaluated the model's performance on the remaining 20% of the samples. [Results] Adding text features or image features increased the model's prediction accuracy by 4.24 and 5.21 percentage points, respectively. Using both text and image features increased prediction accuracy by 6.96 percentage points. [Limitations] The extracted text and image features are not directly interpretable, so we cannot identify the specific features represented by each layer of the neural network. [Conclusions] The proposed model with text and image features can effectively predict the usefulness of crowd testing reports.

Keywords: Crowd Testing; Signal Theory; Deep Learning; Feature Extraction; Predictive Analysis
Received: 15 January 2020      Published: 27 September 2020
CLC Number (ZTFLH): G203
Corresponding Authors: Wu Jiang     E-mail: jiangw@whu.edu.cn

Cite this article:

Cai Jingxuan,Wu Jiang,Wang Chengkun. Predicting Usefulness of Crowd Testing Reports with Deep Learning. Data Analysis and Knowledge Discovery, 2020, 4(11): 102-111.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2020.0059     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2020/V4/I11/102

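Dimension | Variable | Description | Obs | Mean | S.D. | Max | Min
Dependent variable | Like | Number of likes the crowd testing report received | 1,550 | 27.28 | 33.56 | 385 | 0
Author attributes | Au_level | Author's level on the platform | 1,550 | 9.55 | 3.72 | 26 | 1
Author attributes | Au_article | Total number of articles the author has published on the platform | 1,550 | 24.57 | 31.18 | 246 | 1
Author attributes | Au_test | Number of crowd tests the author has taken part in | 1,550 | 5.10 | 4.82 | 114 | 1
Author attributes | Au_followed | Number of accounts the author follows | 1,550 | 911.74 | 1,486.37 | 10,620 | 0
Author attributes | Au_follower | Number of the author's followers | 1,550 | 57.02 | 66.68 | 1,225 | 0
Product attributes | Prod_type | Type of trial product (beauty and skincare, food, lifestyle, apparel, electronics, mother and baby) | 1,550 | - | - | - | -
Product attributes | Prod_price | Price of the trial product | 1,550 | 177.32 | 215.30 | 1,700 | 11.99
Product attributes | Prod_num | Number of people testing the same product at the same time | 1,550 | 7.98 | 7.35 | 32 | 1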
Variables Definitions and Descriptive Statistics
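Category | Content | Source
User dictionary | (1) Product and merchant names: 美妆蛋 (makeup sponge), 小白鞋 (white sneakers), 良品铺子 (a snack retailer), etc.; (2) Experience descriptions: 沙漠皮 (very dry "desert" skin), 开箱 (unboxing), 神器 (must-have gadget), etc.; (3) Internet slang and abbreviations: 鸡冻 (excited), 踩雷 (buying a dud), 轻奢 (affordable luxury), etc. | (1) Descriptions of the crowd-tested products on the website; (2) High-frequency strings that were not segmented as words
Stop-word dictionary | (1) Punctuation; (2) Function words; (3) Emoticons; (4) Other meaningless words: 哈哈 (haha), etc. | (1) Harbin Institute of Technology stop-word list; (2) High-frequency meaningless words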
User Dictionary and Stop Words
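As a minimal sketch of how a stop-word dictionary like the one above might be applied, the snippet below filters a list of already segmented tokens. The token list and the small stop-word set are made-up illustrations (two function words, some punctuation, and 哈哈), not the authors' actual dictionaries, and word segmentation itself is assumed to have happened earlier.

```python
# Illustrative stop-word filter. STOP_WORDS and the token list are
# hypothetical examples, not the dictionaries used in the paper.
STOP_WORDS = {"，", "。", "！", "的", "了", "哈哈"}


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Return the tokens that are not stop words, preserving their order."""
    return [tok for tok in tokens if tok not in stop_words]


# Tokens as they might look after segmentation with a user dictionary,
# so that terms such as 美妆蛋 and 神器 stay whole.
segmented = ["开箱", "，", "哈哈", "这", "款", "美妆蛋", "真", "的", "是", "神器", "！"]
filtered = remove_stop_words(segmented)
print(filtered)
```

Keeping the stop words in a set makes each membership check constant time, which matters when filtering every token of every report.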
Topic Perplexity of Crowd Testing Report
Extraction Process of Image Feature
Feature Integration (Model 4)
Model Construction
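The input combinations and the 80/20 split described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the feature vectors are random placeholders, the text and image dimensions (8 each) are hypothetical, and the author and product groups simply use one value per variable in the descriptive statistics table (5 and 3). The helper names `make_sample`, `build_input`, and `train_test_split` are invented for this sketch.

```python
import random

# Sizes of each feature group. "text" and "image" are hypothetical;
# "author" and "product" assume one value per listed variable.
FEATURE_DIMS = {"author": 5, "product": 3, "text": 8, "image": 8}

# Input combinations of Models 1-4 as listed in the performance table.
MODEL_INPUTS = {
    "Model 1": ["author", "product"],
    "Model 2": ["author", "product", "text"],
    "Model 3": ["author", "product", "image"],
    "Model 4": ["author", "product", "text", "image"],
}


def make_sample(rng):
    """Create one synthetic report: a random vector per feature group."""
    return {name: [rng.random() for _ in range(dim)]
            for name, dim in FEATURE_DIMS.items()}


def build_input(sample, groups):
    """Concatenate the selected feature groups into one input vector."""
    vector = []
    for name in groups:
        vector.extend(sample[name])
    return vector


def train_test_split(samples, train_ratio=0.8, seed=0):
    """Shuffle a copy of the samples and split it into train and test parts."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


rng = random.Random(42)
reports = [make_sample(rng) for _ in range(1550)]  # 1,550 observations
train, test = train_test_split(reports)

for model, groups in MODEL_INPUTS.items():
    print(model, "input dimension:", len(build_input(train[0], groups)))
print("train:", len(train), "test:", len(test))
```

Each concatenated vector would then feed the fully connected network, so the four models differ only in the width of their input layer.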
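Model | Feature combination | Precision | Recall | F1 | Accuracy
Model 1 | Author attributes + product attributes | 0.7676 | 0.8783 | 0.7917 | 0.7821
Model 2 | Author attributes + product attributes + report text | 0.8624 | 0.8403 | 0.8099 | 0.8245
Model 3 | Author attributes + product attributes + report images | 0.8700 | 0.8571 | 0.8359 | 0.8342
Model 4 | Author attributes + product attributes + report text + report images | 0.8854 | 0.8870 | 0.8658 | 0.8517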
Model Performance
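The accuracy gains quoted in the abstract can be checked against the table above with a few lines of arithmetic. The sketch below copies the four accuracy values from the table and treats Model 1 as the baseline; the variable names are chosen only for this illustration.

```python
# Accuracy values copied from the Model Performance table.
accuracy = {
    "Model 1": 0.7821,  # author + product attributes (baseline)
    "Model 2": 0.8245,  # baseline + report text
    "Model 3": 0.8342,  # baseline + report images
    "Model 4": 0.8517,  # baseline + report text + report images
}

baseline = accuracy["Model 1"]

# Gain over the baseline, expressed in percentage points.
gains = {
    model: round((value - baseline) * 100, 2)
    for model, value in accuracy.items()
    if model != "Model 1"
}

for model, gain in gains.items():
    print(f"{model}: +{gain:.2f} percentage points")
```

The results are differences in accuracy (percentage points), not relative improvements, and they agree with the 4.24, 5.21, and 6.96 figures reported in the abstract.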
Model | Pre-trained image model | Precision | Recall | F1 | Accuracy
Model 3 | VGG19 | 0.8700 | 0.8571 | 0.8359 | 0.8342
Model 3 | ResNet50 | 0.8430 | 0.8031 | 0.8226 | 0.8167
Model 3 | InceptionV3 | 0.5041 | 0.9419 | 0.6701 | 0.5123
Model 3 | NASNetMobile | 0.7952 | 0.7923 | 0.7790 | 0.7688
Model 4 | VGG19 | 0.8854 | 0.8870 | 0.8658 | 0.8517
Model 4 | ResNet50 | 0.7860 | 0.8381 | 0.8036 | 0.7938
Model 4 | InceptionV3 | 0.8598 | 0.8067 | 0.8316 | 0.8313
Model 4 | NASNetMobile | 0.8850 | 0.7455 | 0.8011 | 0.8105
Performance with Different Pre-trained Image Models
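For readers less familiar with the four reported metrics, the sketch below computes them from a binary confusion matrix using the standard definitions. The counts are hypothetical and chosen only to show how a classifier that labels almost every report as useful can reach very high recall with precision and accuracy near 0.5, a pattern similar to the InceptionV3 row for Model 3. Whether the paper averaged its metrics over classes or runs is not stated here, so the function is an illustration rather than the authors' evaluation code.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1, and accuracy from confusion counts.

    Assumes tp + fp > 0 and tp + fn > 0 so that no division by zero occurs.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts for a 310-report test set in which nearly every
# report is predicted as useful.
metrics = classification_metrics(tp=150, fp=140, fn=10, tn=10)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```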