Data Analysis and Knowledge Discovery  2023, Vol. 7 Issue (10): 63-73     https://doi.org/10.11925/infotech.2096-3467.2022.0872
Research Paper
Extracting Product Features and Analyzing Customer Needs from Chinese Online Reviews with Hybrid Neural Network*
Shi Lili1,2, Lin Jun1,2, Zhu Guiyang3
1 School of Management, Xi'an Jiaotong University, Xi'an 710049, China
2 The Key Lab of the Ministry of Education for Process Management & Efficiency Engineering, Xi'an 710049, China
3 School of Management, Hangzhou Dianzi University, Hangzhou 310018, China
Abstract

[Objective] This study extracts product features from Chinese online reviews and analyzes customer needs based on the review content. [Methods] We first propose a hybrid neural network (HNN) model to extract product features from Chinese online reviews. We then apply the critical incident technique (CIT) and the analysis of complaints and compliments (ACC) to the Kano model to classify and prioritize the product features. [Results] The HNN model reached an F1 value of 94.85%, on average 10.52 percentage points higher than the variant baseline models and 9.47 percentage points higher than other models in the field. [Limitations] The proposed method is supervised, and its need for labeled data restricts its application. [Conclusions] By addressing the problem of extracting product features from Chinese text, the proposed method improves the accuracy of product feature extraction. Combining the extracted features with customer needs analysis to classify and prioritize product features lays a foundation for product managers to develop product improvement strategies.

Key words: Chinese Online Reviews; Product Feature Extraction; Customer Requirements Analysis; Deep Learning
Received: 2022-08-18      Published: 2023-03-28
CLC number (ZTFLH): TP391; F274
Funding: *General Program of the National Natural Science Foundation of China (72071154); General Program of the National Natural Science Foundation of China (71672140)
Corresponding author: Lin Jun, ORCID: 0000-0002-2635-1816, E-mail: ljun@mail.xjtu.edu.cn
Cite this article:
Shi Lili, Lin Jun, Zhu Guiyang. Extracting Product Features and Analyzing Customer Needs from Chinese Online Reviews with Hybrid Neural Network[J]. Data Analysis and Knowledge Discovery, 2023, 7(10): 63-73.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2022.0872      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2023/V7/I10/63
Fig.1  Research framework
Fig.2  Architecture of the HNN algorithm
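Review attribute | Value | Review length | Value
Product | Huawei Mate 10 | Mean | 96.14
Total number of reviews | 139 684 | Minimum | 3
Number after preprocessing | 134 125 | Maximum | 645
Source website | JD.com | 25th percentile | 27
Review rating | 1~5 | 50th percentile | 53
Collection period | 2017.11.1-2018.11.12 | 75th percentile | 118
Table 1  Dataset description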
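Module | Parameter | Value
CNW | Word vector dimension | 300
CNW | Learning rate | 0.001
CNW | Dropout | 0.5
CNW | Convolution kernel sizes | 1-3
CNW | Number of convolution kernels | 100 per size
CNW | Activation function | ReLU
BLC | Character vector dimension | 300
BLC | Learning rate | 0.001
BLC | Dropout | 0.3
BLC | Number of BiLSTMs | 10
BLC | BiLSTM hidden units | 32
BLC | Dropout | 0.3
BLC | Activation function | ReLU
FC | Number of hidden layers | 1
FC | Hidden units | 64
FC | Dropout | 0.3
FC | Activation function | Softmax
FC | Output dimension | 13
Table 2  Hyperparameter settings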
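Variant | Definition
CNW | Word vectors are fed into a CNN; the CNN output is fed into FC
CNC | Character vectors are fed into a CNN; the CNN output is fed into FC
BLW | Word vectors are fed into a BiLSTM; the BiLSTM output is fed into FC
BLC | Character vectors are fed into a BiLSTM; the BiLSTM output is fed into FC
CNW+CNC | Word vectors and character vectors are fed into two separate CNNs; the two CNN outputs are concatenated and fed into FC
BLW+BLC | Word vectors and character vectors are fed into two separate BiLSTMs; the two BiLSTM outputs are concatenated and fed into FC
BLW+CNC | Word vectors and character vectors are fed into a BiLSTM and a CNN, respectively; the two outputs are concatenated and fed into FC
Table 3  Variant definitions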
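The naming scheme in Table 3 can be read mechanically: the first two letters name the encoder (CN for CNN, BL for BiLSTM) and the third letter names the input granularity (W for word vectors, C for character vectors). The following is a minimal sketch of that reading, not the authors' code; the helper names `parse_variant` and `describe` are hypothetical.

```python
from typing import List, Tuple

# Code fragments used in the variant names of Table 3.
ENCODERS = {"CN": "CNN", "BL": "BiLSTM"}
INPUTS = {"W": "word", "C": "char"}


def parse_variant(name: str) -> List[Tuple[str, str]]:
    """Split a variant name such as 'CNW+BLC' into (input, encoder) branches."""
    branches = []
    for code in name.split("+"):
        encoder = ENCODERS[code[:2]]   # first two letters: encoder type
        granularity = INPUTS[code[2]]  # third letter: word or character vectors
        branches.append((granularity, encoder))
    return branches


def describe(name: str) -> str:
    """Render the data flow of a variant as a single line of text."""
    branches = parse_variant(name)
    flow = " + ".join(f"[{g} vectors -> {e}]" for g, e in branches)
    # Multi-branch variants concatenate the branch outputs before FC.
    tail = "concat -> FC" if len(branches) > 1 else "FC"
    return f"{flow} -> {tail}"


if __name__ == "__main__":
    for variant in ["CNW", "BLC", "BLW+CNC", "CNW+BLC"]:
        print(variant, ":", describe(variant))
```

Under this reading, the HNN row of Table 4 (CNW+BLC) pairs a word-level CNN branch with a character-level BiLSTM branch.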
No. | Variant | P/% | R/% | F1/%
1 | CNW | 92.79 | 94.21 | 93.49
2 | CNC | 86.89 | 94.52 | 90.54
3 | BLW | 67.05 | 72.76 | 69.79
4 | BLC | 71.06 | 76.09 | 73.49
5 | CNW+CNC | 90.44 | 90.09 | 90.27
6 | BLW+BLC | 79.83 | 81.84 | 80.82
7 | BLW+CNC | 89.69 | 94.22 | 91.90
8 | CNW+BLC (HNN) | 93.94 | 95.78 | 94.85
Table 4  Performance of the variants
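The F1 column of Table 4 appears consistent with the usual harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below recomputes it from the reported P and R values. It assumes that definition, which this page does not state explicitly, and the helper names are illustrative.

```python
# (variant, P, R, reported F1), copied from Table 4, all in percent.
TABLE_4 = [
    ("CNW", 92.79, 94.21, 93.49),
    ("CNC", 86.89, 94.52, 90.54),
    ("BLW", 67.05, 72.76, 69.79),
    ("BLC", 71.06, 76.09, 73.49),
    ("CNW+CNC", 90.44, 90.09, 90.27),
    ("BLW+BLC", 79.83, 81.84, 80.82),
    ("BLW+CNC", 89.69, 94.22, 91.90),
    ("CNW+BLC (HNN)", 93.94, 95.78, 94.85),
]


def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of F1)."""
    return 2 * p * r / (p + r)


def max_f1_gap(rows) -> float:
    """Largest absolute difference between recomputed and reported F1."""
    return max(abs(f1(p, r) - reported) for _, p, r, reported in rows)


if __name__ == "__main__":
    for name, p, r, reported in TABLE_4:
        print(f"{name:15s} recomputed={f1(p, r):.2f} reported={reported:.2f}")
```

Because P and R are themselves rounded to two decimals, the recomputed values can differ from the reported ones in the last digit.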
Fig.3  Performance comparison of the variants
Model | P/% | R/% | F1/%
SVM | 62.58 | 95.70 | 75.67
C-CNN | 92.16 | 95.75 | 93.92
D-RNN | 85.02 | 88.11 | 86.54
HNN | 93.94 | 95.78 | 94.85
Table 5  Performance comparison of different models
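The two average improvements quoted in the abstract can be reproduced from the F1 columns of Tables 4 and 5. The sketch below assumes that "on average" means the HNN F1 minus the arithmetic mean of the baseline F1 values; the function name `mean_gain` is hypothetical.

```python
from statistics import mean

HNN_F1 = 94.85  # F1 of CNW+BLC (HNN), in percent

# F1 of the seven variant baselines (Table 4, rows 1-7).
VARIANT_F1 = {
    "CNW": 93.49, "CNC": 90.54, "BLW": 69.79, "BLC": 73.49,
    "CNW+CNC": 90.27, "BLW+BLC": 80.82, "BLW+CNC": 91.90,
}

# F1 of the other models compared in Table 5.
OTHER_F1 = {"SVM": 75.67, "C-CNN": 93.92, "D-RNN": 86.54}


def mean_gain(baselines: dict, target: float = HNN_F1) -> float:
    """Percentage-point gap between the target F1 and the mean baseline F1."""
    return target - mean(baselines.values())


if __name__ == "__main__":
    print(f"vs variants:     {mean_gain(VARIANT_F1):.2f} percentage points")
    print(f"vs other models: {mean_gain(OTHER_F1):.2f} percentage points")
```

Rounded to two decimals, the two gaps come out as 10.52 and 9.47 percentage points, matching the abstract.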
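Product feature | P/% | R/% | F1/%
Hardware | 93.13 | 92.91 | 93.02
Battery | 99.26 | 94.15 | 96.64
Functions | 85.03 | 96.79 | 90.53
Communication | 95.54 | 96.57 | 96.05
Audio and video | 97.08 | 99.11 | 98.09
Cost performance | 85.63 | 93.80 | 89.53
Screen | 96.35 | 95.19 | 95.77
Quality | 82.53 | 93.49 | 87.66
Camera | 99.54 | 96.93 | 98.22
System | 96.46 | 98.01 | 97.23
Appearance and experience | 94.65 | 97.68 | 96.14
Packaging and accessories | 96.83 | 95.05 | 95.93
Logistics and after-sales service | 99.15 | 95.48 | 97.28
Table 6  Classification results for each product feature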
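The overall HNN scores in Tables 4 and 5 appear to be macro averages of the per-feature values in Table 6: the unweighted mean of the 13 precision values, the unweighted mean of the 13 recall values, and an F1 computed from those two means. The page does not state the averaging scheme, so the sketch below treats it as an assumption and simply checks that the numbers line up. Feature names are given in English translation.

```python
# (feature, P, R), copied from Table 6, in percent.
TABLE_6 = [
    ("Hardware", 93.13, 92.91),
    ("Battery", 99.26, 94.15),
    ("Functions", 85.03, 96.79),
    ("Communication", 95.54, 96.57),
    ("Audio and video", 97.08, 99.11),
    ("Cost performance", 85.63, 93.80),
    ("Screen", 96.35, 95.19),
    ("Quality", 82.53, 93.49),
    ("Camera", 99.54, 96.93),
    ("System", 96.46, 98.01),
    ("Appearance and experience", 94.65, 97.68),
    ("Packaging and accessories", 96.83, 95.05),
    ("Logistics and after-sales service", 99.15, 95.48),
]


def macro_scores(rows):
    """Return (macro P, macro R, F1 computed from the two macro values)."""
    macro_p = sum(p for _, p, _ in rows) / len(rows)
    macro_r = sum(r for _, _, r in rows) / len(rows)
    f1 = 2 * macro_p * macro_r / (macro_p + macro_r)
    return macro_p, macro_r, f1


if __name__ == "__main__":
    p, r, f = macro_scores(TABLE_6)
    print(f"macro P={p:.2f}  macro R={r:.2f}  F1={f:.2f}")
```

Under this assumption the result rounds to 93.94, 95.78 and 94.85, the same values reported for HNN.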
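Sentence | Predicted | Actual
1. Battery life is quite impressive (续航时间相当可观) | Battery | Battery
2. The display looks very good at the default resolution (默认分辨率显示效果很好) | Screen | Screen
3. Reading and writing backups from the previous phone is convenient (读取和写入原有手机备份很方便) | System | System
4. The phone I received keeps dropping its WiFi connection (我收到的手机WiFi断流) | Communication | Communication
5. It lags a little and is not as smooth as an Apple phone (稍微有点卡不如苹果流畅) | System | System
6. The seller shipped very quickly (店家发货非常快) | Logistics and after-sales service | Logistics and after-sales service
Table 7  Examples of implicit product features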
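Product feature | u_i/% | v_i/% | Category | u_i/v_i
Communication | 0.508 | 1.585 | Must-be feature | 0.321
Hardware | 0.729 | 1.574 | Must-be feature | 0.463
Audio and video | 0.859 | 1.574 | Must-be feature | 0.546
Cost performance | 4.207 | 6.749 | Must-be feature | 0.623
Logistics and after-sales service | 25.490 | 38.434 | Must-be feature | 0.663
Packaging and accessories | 4.941 | 7.053 | One-dimensional feature | 0.701
Battery | 7.121 | 9.425 | One-dimensional feature | 0.756
Screen | 2.488 | 3.243 | One-dimensional feature | 0.767
Appearance and experience | 17.734 | 13.791 | One-dimensional feature | 1.286
Functions | 2.225 | 1.627 | Attractive feature | 1.368
Quality | 0.885 | 0.514 | Attractive feature | 1.722
Camera | 9.968 | 4.775 | Attractive feature | 2.088
System | 22.844 | 9.656 | Attractive feature | 2.366
Table 8  Feature classification based on the Kano model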
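The last column of Table 8 is the ratio u_i / v_i, and the rows are ordered by that ratio, with the three Kano categories occupying consecutive, non-overlapping ranges. The sketch below recomputes the ratio and reports the range covered by each category. The cut-off values between categories are not given on this page, so none are assumed here; the helper names are hypothetical and feature names are given in English translation.

```python
# (feature, u_i in %, v_i in %, category, reported u_i/v_i), from Table 8.
TABLE_8 = [
    ("Communication", 0.508, 1.585, "Must-be", 0.321),
    ("Hardware", 0.729, 1.574, "Must-be", 0.463),
    ("Audio and video", 0.859, 1.574, "Must-be", 0.546),
    ("Cost performance", 4.207, 6.749, "Must-be", 0.623),
    ("Logistics and after-sales service", 25.490, 38.434, "Must-be", 0.663),
    ("Packaging and accessories", 4.941, 7.053, "One-dimensional", 0.701),
    ("Battery", 7.121, 9.425, "One-dimensional", 0.756),
    ("Screen", 2.488, 3.243, "One-dimensional", 0.767),
    ("Appearance and experience", 17.734, 13.791, "One-dimensional", 1.286),
    ("Functions", 2.225, 1.627, "Attractive", 1.368),
    ("Quality", 0.885, 0.514, "Attractive", 1.722),
    ("Camera", 9.968, 4.775, "Attractive", 2.088),
    ("System", 22.844, 9.656, "Attractive", 2.366),
]


def max_ratio_gap(rows) -> float:
    """Largest absolute difference between u_i / v_i and the reported ratio."""
    return max(abs(u / v - reported) for _, u, v, _, reported in rows)


def ratio_range_by_category(rows) -> dict:
    """Map each category to the (min, max) of u_i / v_i among its features."""
    ranges = {}
    for _, u, v, category, _ in rows:
        ratio = u / v
        low, high = ranges.get(category, (ratio, ratio))
        ranges[category] = (min(low, ratio), max(high, ratio))
    return ranges


if __name__ == "__main__":
    print(f"max gap to reported ratios: {max_ratio_gap(TABLE_8):.4f}")
    for category, (low, high) in ratio_range_by_category(TABLE_8).items():
        print(f"{category:16s} {low:.3f} to {high:.3f}")
```

A ratio below 1 means v_i exceeds u_i for that feature. Every must-be feature in the table falls in that region and every attractive feature lies above it.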