Mining Differentiated Demands with Aspect Word Extraction: Case Study of Smartphone Reviews
Xiao Yuhan, Lin Huiping
School of Software & Microelectronics, Peking University, Beijing 102600, China

Abstract [Objective] This paper proposes a deep learning algorithm for aspect word extraction to support differentiated, fine-grained analysis of user demands. [Methods] We designed a Context Window Self-Attention (CWSA) model to extract aspect words. Building on the overall information of the full text, the model focuses on the semantics of the context window and adjacent text. We then extracted fine-grained product features from the reviews and conducted aspect-level sentiment analysis to examine user demands further. [Results] We constructed a Chinese dataset for aspect word extraction and aspect-level sentiment analysis from nearly 900,000 reviews of smartphones sold on JD.com. On this dataset, the CWSA model reached an F1 score of 89.65%, outperforming the baseline models. [Limitations] Few publicly accessible Chinese datasets exist for aspect word extraction and aspect-level sentiment analysis. More Chinese and English datasets covering multiple products need to be constructed to improve the model's cross-language adaptability. [Conclusions] The proposed model improves differentiated and fine-grained data mining.
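
The abstract reports an F1 score but does not state how it was computed. As an illustration only, the sketch below shows exact-match span-level F1, a common convention for aspect word extraction with BIO tags. The tag scheme, the function names `bio_to_spans` and `span_f1`, and the toy sentence are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Optional, Set, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect aspect spans from a BIO sequence (B = begin, I = inside, O = outside)."""
    spans: Set[Span] = set()
    start: Optional[int] = None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:      # close the span that was open
                spans.add((start, i))
            start = i
        elif tag == "I":
            if start is None:          # stray I without B: treat it as a span start
                start = i
        else:                          # "O" closes any open span
            if start is not None:
                spans.add((start, i))
                start = None
    if start is not None:              # span running to the end of the sentence
        spans.add((start, len(tags)))
    return spans


def span_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Exact-match span F1 over a corpus of tagged sentences (assumed metric)."""
    tp = n_gold = n_pred = 0
    for gold_tags, pred_tags in zip(gold, pred):
        g, p = bio_to_spans(gold_tags), bio_to_spans(pred_tags)
        tp += len(g & p)               # a span counts only if both ends match
        n_gold += len(g)
        n_pred += len(p)
    if tp == 0:
        return 0.0
    precision = tp / n_pred
    recall = tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy review: "battery life is short , screen fine" (hypothetical example)
gold = [["B", "I", "O", "O", "O", "B", "O"]]
pred = [["B", "I", "O", "O", "O", "O", "O"]]  # the model misses "screen"
score = span_f1(gold, pred)
```

Under this convention a predicted aspect word counts as correct only when both of its boundaries match the annotation, so a partially overlapping prediction contributes to neither precision nor recall.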
Received: 13 March 2022
Published: 16 February 2023
Fund: National Key R&D Program of China (2018YFB1702900)
Corresponding author: Lin Huiping, ORCID: 0000-0002-0500-1163, E-mail: linhp@ss.pku.edu.cn