1. College of Information Management, Nanjing Agricultural University, Nanjing 210031, China
2. College of Artificial Intelligence, Nanjing Agricultural University, Nanjing 210031, China
[Objective] This paper proposes a sentiment analysis method for user reviews that integrates margin sampling and tri-training. It addresses three difficulties: the large volume of user reviews, their ambiguous sentiment tendencies, and their short length. [Methods] First, we constructed a multi-class support vector machine using a one-vs-all decomposition strategy. Then, we applied a margin sampling strategy that takes cosine similarity into account to create an initial set. Finally, we proposed a Tri-training algorithm that incorporates a soft voting mechanism. [Results] The improved voting mechanism in the Tri-training algorithm further reduced the probability that multiple classifiers misjudge a sample during classification. Precision exceeded 79% for every category. [Limitations] The proposed method does not extract information from multimedia data. [Conclusions] Compared with traditional and recently improved semi-supervised learning algorithms, the proposed algorithm achieves higher classification accuracy and efficiency.
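The sampling and voting steps named in [Methods] can be illustrated with a minimal NumPy sketch. This is one plausible reading and not the paper's implementation: the function names (`margin_scores`, `select_by_margin`, `soft_vote`), the similarity cutoff `max_sim`, and the confidence `threshold` are all assumptions introduced here. The class probabilities are assumed to come from classifiers such as the one-vs-all SVM.

```python
import numpy as np

def margin_scores(proba):
    """Margin per sample: gap between the two highest class probabilities."""
    top2 = np.sort(proba, axis=1)[:, -2:]
    return top2[:, 1] - top2[:, 0]

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def select_by_margin(features, proba, k, max_sim=0.9):
    """Pick up to k small-margin (most ambiguous) samples.

    A candidate is skipped when its cosine similarity to an already
    chosen sample reaches max_sim (hypothetical cutoff), so the selected
    set is not filled with near-duplicates.
    """
    order = np.argsort(margin_scores(proba), kind="stable")
    chosen = []
    for i in order:
        if all(cosine_similarity(features[i], features[j]) < max_sim
               for j in chosen):
            chosen.append(int(i))
        if len(chosen) == k:
            break
    return chosen

def soft_vote(probas, threshold=0.6):
    """Average the class probabilities of several classifiers.

    probas has shape (n_classifiers, n_samples, n_classes). Returns the
    voted labels and a mask marking samples whose averaged top
    probability reaches threshold (hypothetical value).
    """
    mean = np.mean(probas, axis=0)
    return mean.argmax(axis=1), mean.max(axis=1) >= threshold

# Toy data: 4 samples, 2 features, 3 sentiment classes.
features = np.array([[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.5, 0.5]])
proba = np.array([[0.40, 0.35, 0.25],
                  [0.45, 0.35, 0.20],
                  [0.90, 0.05, 0.05],
                  [0.34, 0.33, 0.33]])
picked = select_by_margin(features, proba, k=3)

# Two peer classifiers voting on two unlabeled samples.
peer_probas = np.array([[[0.7, 0.2, 0.1], [0.4, 0.4, 0.2]],
                        [[0.8, 0.1, 0.1], [0.3, 0.5, 0.2]]])
labels, confident = soft_vote(peer_probas)
```

In this toy run, sample 1 is skipped because it is nearly parallel to the already selected sample 0, and only the first unlabeled sample passes the confidence mask. In a Tri-training loop, such a mask could decide which pseudo-labeled samples are passed to the third classifier, in place of the hard agreement rule of the original algorithm.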