Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (11): 102-111     https://doi.org/10.11925/infotech.2096-3467.2020.0059
Research Paper
Predicting Usefulness of Crowd Testing Reports with Deep Learning
Cai Jingxuan1, Wu Jiang1,2, Wang Chengkun1
1School of Information Management, Wuhan University, Wuhan 430072, China
2Center for E-commerce Research and Development, Wuhan University, Wuhan 430072, China
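Abstract (translated from the Chinese)

[Objective] Taking crowd testing reports as the research object, this paper explores how author attributes, product attributes, text, and images contribute to predicting the usefulness of crowd testing reports. [Methods] Text and image features are extracted from crowd testing reports with deep learning, and a fully connected neural network is used to build a usefulness prediction model. The model is trained on a random 80% of the samples under different input combinations, and the remaining samples serve as the test set for evaluating predictive performance. [Results] Adding text features alone improves the model's predictive performance by 4.24 percentage points; adding image features alone improves it by 5.21 percentage points; adding both improves it by 6.96 percentage points. [Limitations] The text and image features extracted by deep learning are hard to understand and interpret. Even when the final model predicts accurately, it remains difficult to tell which specific features each network layer represents or to summarize the rules the model relies on for its final decision. [Conclusions] Both the textual description features and the image features of crowd testing reports effectively predict how useful the reports are to consumers, and the two play mutually validating and mutually substitutable roles in that prediction.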

Abstract

[Objective] This paper predicts the usefulness of crowd testing reports from author attributes, product attributes, text features, and image features. [Methods] First, we used deep learning techniques to extract text and image features from crowd testing reports. Second, we constructed a prediction model with a fully connected neural network. Third, we trained the model on a random 80% of the samples under different input combinations. Finally, we evaluated the model's performance on the remaining samples. [Results] Adding text features alone or image features alone increased the model's prediction accuracy by 4.24 and 5.21 percentage points, respectively. Adding both text and image features increased prediction accuracy by 6.96 percentage points. [Limitations] The extracted text and image features are difficult to understand and interpret. Therefore, we cannot identify the specific features represented by each neural network layer in the model. [Conclusions] The proposed model with text and image features can effectively predict the usefulness of crowd testing reports.

Key words: Crowd Testing; Signal Theory; Deep Learning; Feature Extraction; Predictive Analysis
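Received: 2020-01-15      Published: 2020-09-27
Chinese Library Classification (ZTFLH):  G203
Funding: This work is one of the research outcomes of the National Natural Science Foundation of China project "Research on the Complexity of Decentralization Mechanisms and Risks of the Sharing Economy Driven by Information Asymmetry" (71874131).
Corresponding author: Wu Jiang     E-mail: jiangw@whu.edu.cn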
Cite this article:
Cai Jingxuan, Wu Jiang, Wang Chengkun. Predicting Usefulness of Crowd Testing Reports with Deep Learning[J]. Data Analysis and Knowledge Discovery, 2020, 4(11): 102-111.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2020.0059      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2020/V4/I11/102
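| Dimension | Variable | Description | Obs | Mean | S.D. | Max | Min |
| Dependent variable | Like | Number of likes the crowd testing report received | 1550 | 27.28 | 33.56 | 385 | 0 |
| Author attributes | Au_level | Author's level on the platform | 1550 | 9.55 | 3.72 | 26 | 1 |
| | Au_article | Total number of articles the author has published on the platform | 1550 | 24.57 | 31.18 | 246 | 1 |
| | Au_test | Number of crowd tests the author has taken part in | 1550 | 5.10 | 4.82 | 114 | 1 |
| | Au_followed | Number of accounts the author follows | 1550 | 911.74 | 1486.37 | 10620 | 0 |
| | Au_follower | Number of followers the author has | 1550 | 57.02 | 66.68 | 1225 | 0 |
| Product attributes | Prod_type | Type of trial product (beauty and skincare, food, lifestyle, apparel, electronics, mother and baby) | 1550 | - | - | - | - |
| | Prod_price | Price of the trial product | 1550 | 177.32 | 215.30 | 1700 | 11.99 |
| | Prod_num | Number of people taking part in the crowd test of the same product | 1550 | 7.98 | 7.35 | 32 | 1 |
Table 1  Definitions and descriptive statistics of the basic attributes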
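| Category | Content | Source |
| User dictionary | (1) Product and merchant names: 美妆蛋 (beauty blender), 小白鞋 (white sneakers), 良品铺子 (Bestore), etc.; (2) Experience descriptions: 沙漠皮 ("desert skin", very dry skin), 开箱 (unboxing), 神器 (must-have gadget), etc.; (3) Internet slang and abbreviations: 鸡冻 (slang for "excited"), 踩雷 ("stepping on a landmine", a bad purchase), 轻奢 (affordable luxury), etc. | (1) Descriptions of crowd-tested products on the website; (2) High-frequency words that were not segmented correctly |
| Stop-word dictionary | (1) Punctuation; (2) Function words; (3) Emoticons; (4) Other meaningless words: 哈哈 ("haha"), etc. | (1) Harbin Institute of Technology stop-word list; (2) High-frequency meaningless words |
Table 2  User dictionary and stop-word list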
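The role of the two dictionaries in Table 2 can be illustrated with a small, self-contained sketch. The page does not say which Chinese word segmenter the authors used, so forward maximum matching stands in for it here; `USER_DICT` holds the example entries from Table 2, while `STOP_WORDS`, `segment`, and `preprocess` are hypothetical names and samples chosen for illustration only.

```python
# Illustrative sketch only: the source does not name its segmenter.
# Forward maximum matching stands in for a real Chinese tokenizer.

# Example user-dictionary entries taken from Table 2.
USER_DICT = {"美妆蛋", "小白鞋", "良品铺子", "沙漠皮", "开箱", "神器", "鸡冻", "踩雷", "轻奢"}
# Tiny hypothetical stop-word sample (punctuation, function words, filler).
STOP_WORDS = {"哈哈", "的", "了", ",", "。", "!"}


def segment(text, dictionary, max_len=4):
    """Greedy forward maximum matching; unknown characters become single tokens."""
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first and fall back to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


def preprocess(text):
    """Segment with user words and stop words known, then drop the stop words."""
    tokens = segment(text, USER_DICT | STOP_WORDS)
    return [tok for tok in tokens if tok not in STOP_WORDS]


print(preprocess("哈哈,小白鞋开箱了"))
```

Without the user dictionary, domain terms such as 小白鞋 would be split into single characters by this matcher, which is the kind of error the "high-frequency words that were not segmented correctly" source in Table 2 appears intended to catch.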
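Fig.1  Topic perplexity of the crowd testing report text
Fig.2  Image feature extraction workflow
Fig.3  Feature fusion flowchart (Model 4 as an example)
Fig.4  Model construction flowchart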
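As a rough illustration of the fusion and model-construction steps named in the figure captions, the sketch below joins the four feature groups into one input vector and performs the random 80/20 split described in the abstract. Fusion by simple concatenation is an assumption, since the page does not show the figures; the function names, the seed, and the toy feature values are all hypothetical.

```python
import random


def fuse(author, product, text_vec, image_vec):
    """Concatenate the four feature groups into one flat input vector.

    Concatenation is an assumption; the source does not specify the fusion rule.
    """
    return list(author) + list(product) + list(text_vec) + list(image_vec)


def train_test_split(samples, train_ratio=0.8, seed=42):
    """Shuffle a copy of the samples and cut it at train_ratio."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical toy data: 1550 reports, matching the Obs column of Table 1.
reports = [
    fuse([i % 26, i % 7], [i % 6, 19.9], [0.1, 0.2, 0.3], [0.4, 0.5])
    for i in range(1550)
]
train, test = train_test_split(reports)
print(len(train), len(test), len(reports[0]))
```

Fixing the seed keeps the split reproducible across the different input combinations, so that each model variant is compared on the same held-out reports.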
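| Model | Feature combination | Precision | Recall | F1 | Accuracy |
| Model 1 | Author attributes + product attributes | 0.7676 | 0.8783 | 0.7917 | 0.7821 |
| Model 2 | Author attributes + product attributes + report text | 0.8624 | 0.8403 | 0.8099 | 0.8245 |
| Model 3 | Author attributes + product attributes + report images | 0.8700 | 0.8571 | 0.8359 | 0.8342 |
| Model 4 | Author attributes + product attributes + report text + report images | 0.8854 | 0.8870 | 0.8658 | 0.8517 |
Table 3  Prediction performance of the models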
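The improvement figures quoted in the abstract can be checked against the accuracy column of Table 3. The short sketch below takes Model 1 as the baseline and expresses each gain in percentage points; the variable names are illustrative.

```python
# Accuracy values copied from Table 3.
accuracy = {
    "Model 1": 0.7821,
    "Model 2": 0.8245,
    "Model 3": 0.8342,
    "Model 4": 0.8517,
}

baseline = accuracy["Model 1"]
# Gain over the baseline, in percentage points, rounded to two decimals.
gains = {
    name: round((acc - baseline) * 100, 2)
    for name, acc in accuracy.items()
    if name != "Model 1"
}
print(gains)
```

The resulting gains of 4.24, 5.21, and 6.96 match the abstract, which indicates that the reported improvements are absolute differences in accuracy rather than relative changes.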
| Model | Pretrained image model | Precision | Recall | F1 | Accuracy |
| Model 3 | VGG19 | 0.8700 | 0.8571 | 0.8359 | 0.8342 |
| | ResNet50 | 0.8430 | 0.8031 | 0.8226 | 0.8167 |
| | InceptionV3 | 0.5041 | 0.9419 | 0.6701 | 0.5123 |
| | NASNetMobile | 0.7952 | 0.7923 | 0.7790 | 0.7688 |
| Model 4 | VGG19 | 0.8854 | 0.8870 | 0.8658 | 0.8517 |
| | ResNet50 | 0.7860 | 0.8381 | 0.8036 | 0.7938 |
| | InceptionV3 | 0.8598 | 0.8067 | 0.8316 | 0.8313 |
| | NASNetMobile | 0.8850 | 0.7455 | 0.8011 | 0.8105 |
Table 4  Prediction performance with different pretrained image models
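For readers unfamiliar with the four metric columns, the following sketch computes them from scratch for a binary task. It assumes usefulness is treated as a 0/1 label, which the page does not state explicitly; the function name and the toy labels are hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Return precision, recall, F1, and accuracy for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical labels: 1 = useful report, 0 = not useful.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Note that the F1 values in the tables do not always equal the harmonic mean of the reported precision and recall (for Model 1 that mean would be about 0.819, against a reported 0.7917), so the published figures were probably aggregated in some way that this page does not describe.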