Data Analysis and Knowledge Discovery  2023, Vol. 7 Issue (1): 63-75    DOI: 10.11925/infotech.2096-3467.2022.0207
Mining Differentiated Demands with Aspect Word Extraction: Case Study of Smartphone Reviews
Xiao Yuhan, Lin Huiping
School of Software & Microelectronics, Peking University, Beijing 102600, China
[Objective] This paper proposes a deep learning algorithm for aspect word extraction, aiming to support differentiated and fine-grained analysis of user demands. [Methods] We designed a Context Window Self-Attention (CWSA) model to extract aspect words. Building on the overall information of the full text, the model focuses on the semantics of the context window and adjacent text. We then extracted fine-grained product features from user reviews and conducted aspect-level sentiment analysis to further examine user demands. [Results] We constructed a Chinese dataset for aspect word extraction and aspect-level sentiment analysis from nearly 900,000 smartphone reviews. On this dataset the proposed CWSA model reached an F1 score of 89.65%, higher than those of the baseline models. [Limitations] Few publicly accessible Chinese datasets exist for aspect word extraction and aspect-level sentiment analysis. More Chinese and English datasets covering multiple products need to be constructed to improve the model's cross-language adaptability. [Conclusions] The proposed model improves differentiated and fine-grained mining of user demands.

Key words: Deep Learning; Aspect Word Extraction; Sentiment Analysis; Differentiated Demand Mining
Received: 13 March 2022      Published: 16 February 2023
ZTFLH: TP391
Fund: National Key R&D Program of China (2018YFB1702900)
Corresponding Author: Lin Huiping, ORCID: 0000-0002-0500-1163

Cite this article:

Xiao Yuhan, Lin Huiping. Mining Differentiated Demands with Aspect Word Extraction: Case Study of Smartphone Reviews. Data Analysis and Knowledge Discovery, 2023, 7(1): 63-75.

Flow Chart of Demand Mining Method Based on CWSA Aspect Word Extraction Model
Network Structure of CWSA
Label                O  O  O  O  B  I  O  O  O  B  I  I  I  O  O  O
Sentiment polarity  -2 -2 -2 -2 -1 -1 -2 -2 -2  1  1  1  1 -2 -2 -2
An Example of the Dataset Labeling
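The labeling example pairs a B/I/O tag with a sentiment value for every token. A minimal sketch of how such rows could be decoded into aspect spans follows. It assumes that `-2` marks tokens outside any aspect word and that each span takes the polarity of its first token. The source does not state either convention, and `decode_aspects` is a hypothetical helper name, not part of the paper.

```python
from typing import List, Tuple

# Tag and polarity rows copied from the labeling example above.
TAGS = "O O O O B I O O O B I I I O O O".split()
POLARITY = [-2, -2, -2, -2, -1, -1, -2, -2, -2, 1, 1, 1, 1, -2, -2, -2]

NOT_ASPECT = -2  # assumption: -2 marks tokens outside any aspect word


def decode_aspects(tags: List[str], polarity: List[int]) -> List[Tuple[int, int, int]]:
    """Return (start, end_exclusive, polarity) for each B/I-tagged span."""
    if len(tags) != len(polarity):
        raise ValueError("tags and polarity must have the same length")
    spans = []
    start = None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        if start is not None and tag != "I":
            # The open span ends here; take the polarity of its first token.
            spans.append((start, i, polarity[start]))
            start = None
        if tag == "B":
            start = i
        # A stray "I" with no preceding "B" is ignored in this sketch.
    return spans


ASPECT_SPANS = decode_aspects(TAGS, POLARITY)
print(ASPECT_SPANS)
```

On the example rows this yields two aspect words: a two-token span labeled -1 and a four-token span labeled 1.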
Model               F1 (%)
BiLSTM              76.46
CNN-CRF             80.59
BiLSTM-CRF          81.60
CGN                 82.09
Seq2Seq4ATE         83.44
BERT-base           87.99
CWSA (this paper)   89.65
Comparison of Experimental Results
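The comparison is reported as F1 only. As a reminder of how such a score is commonly computed for aspect word extraction, here is a minimal sketch of exact-match, span-level precision, recall and F1. Whether the paper scores at span level or token level is not stated in this excerpt, so the matching rule is an assumption. The spans in the example are made up for illustration and are not data from the paper.

```python
from typing import Set, Tuple

Span = Tuple[int, int]  # (start, end_exclusive) token offsets


def span_f1(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F1 over aspect-word spans."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example (made-up spans, not data from the paper).
gold = {(4, 6), (9, 13)}
predicted = {(4, 6), (9, 12), (14, 15)}
precision, recall, f1 = span_f1(gold, predicted)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Under exact matching, a prediction that is off by one token, such as `(9, 12)` against `(9, 13)`, counts as both a false positive and a missed gold span.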
User rating   Share (%)
1             13.80
2             3.20
3             8.10
4             0.80
5             74.10
Distribution of User Ratings
Distribution of User Comment Length
Length Distribution of Jieba Segmentation Results and that of Words Extracted by CWSA
Proportion of the Sum of Aspect Word Frequency in Each Interval
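Aspect word    Total frequency    Positive    Neutral    Negative
屏幕 (screen)    250,088    176,901    47,378    25,809
运行速度 (running speed)    248,014    203,120    13,429    21,465
拍照效果 (photo quality)    215,501    175,628    17,103    22,770
音效 (sound quality)    200,430    154,316    32,675    13,439
充电速度 (charging speed)    7,503    6,967    173    363
发货速度 (shipping speed)    5,764    5,223    83    458
电池容量 (battery capacity)    5,162    4,256    340    566
屏幕分辨率 (screen resolution)    3,138    2,466    287    385
价保 (price protection)    1,613    35    16    1,562
客服服务态度 (customer service attitude)    1,439    1,133    45    261
后置摄像头 (rear camera)    915    504    89    322
相素 (pixels; misspelling of 像素)    432    343    12    77
120Hz刷新率 (120Hz refresh rate)    370    347    4    19
双立体声扬声器 (dual stereo speakers)    40    38    2    0
諾基亚 (Nokia; nonstandard spelling of 诺基亚)    6    2    0    4
超级快冲 (super fast charging; misspelling of 超级快充)    5    5    0    0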
Examples of Aspect Words and Corresponding Emotional Attitudes from JD Mobile Phone Reviews
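To illustrate how per-aspect sentiment counts can point to differentiated demands, the sketch below ranks a few rows from the table above by their share of negative mentions and checks whether the three sentiment counts add up to the listed total. The choice of rows, the helper names `negative_share` and `inconsistent_rows`, and the use of negative share as a ranking signal are illustrative assumptions, not the authors' procedure.

```python
# (aspect word, listed total, positive, neutral, negative), copied from the table above.
ROWS = [
    ("屏幕", 250088, 176901, 47378, 25809),       # screen
    ("运行速度", 248014, 203120, 13429, 21465),   # running speed
    ("价保", 1613, 35, 16, 1562),                 # price protection
    ("后置摄像头", 915, 504, 89, 322),            # rear camera
    ("双立体声扬声器", 40, 38, 2, 0),             # dual stereo speakers
]


def negative_share(positive: int, neutral: int, negative: int) -> float:
    """Fraction of labeled mentions that are negative."""
    counted = positive + neutral + negative
    return negative / counted if counted else 0.0


def inconsistent_rows(rows):
    """Names of rows whose sentiment counts do not sum to the listed total."""
    return [name for name, total, pos, neu, neg in rows if pos + neu + neg != total]


# Rank aspects from most to least negative.
RANKED = sorted(ROWS, key=lambda row: negative_share(*row[2:]), reverse=True)

for name, total, pos, neu, neg in RANKED:
    print(f"{name}: {negative_share(pos, neu, neg):.1%} negative of {pos + neu + neg}")
print("Rows to double-check:", inconsistent_rows(ROWS))
```

In this subset, 价保 (price protection) ranks first even though it is mentioned far less often than 屏幕 (screen), which suggests that a low-frequency aspect can still carry concentrated dissatisfaction. The check also flags 运行速度 (running speed): its three sentiment counts as they appear here sum to 238,014 rather than the listed 248,014, so shares for that row should be read with caution.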
