Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (9): 65-76    DOI: 10.11925/infotech.2096-3467.2021.1303
Classifying Reasons of Hotel Reviews with Domain ERNIE and BiLSTM Model
Zhang Zhipeng, Mao Yusheng, Zhang Liyi
School of Information Management, Wuhan University, Wuhan 430072, China
Abstract  

[Objective] This paper proposes a classification model that identifies opinion reason sentences in hotel reviews from online booking platforms. [Methods] First, we constructed a pre-training corpus from millions of online reviews and manually annotated the ORSC dataset for the proposed model. Then, we added the constructed corpus to the pre-training of the ERNIE model to obtain the domain ERNIE (DERNIE) model, and used it to extract text features from the ORSC dataset. Finally, we used a BiLSTM model to merge the features and identify reviews that contain reasons. [Results] On the ORSC dataset, the DERNIE model reached an accuracy of 91.33% and an F1 value of 91.20%. After adding the BiLSTM, accuracy rose to 94.57% and the F1 value to 94.62%. [Limitations] Pre-trained language models require a large amount of data from the additional corpus, which may reduce computing speed and efficiency. [Conclusions] The proposed model can effectively identify reason sentences in online reviews.

Key words: Online Review; Opinion Reason Sentence Classification; ERNIE Model; BiLSTM Model
Received: 16 November 2021      Published: 26 October 2022
ZTFLH: TP391; G250
Fund: National Natural Science Foundation of China (71874126)
Corresponding Author: Zhang Liyi, ORCID: 0000-0001-8634-9227, E-mail: lyzhang@whu.edu.cn

Cite this article:

Zhang Zhipeng, Mao Yusheng, Zhang Liyi. Classifying Reasons of Hotel Reviews with Domain ERNIE and BiLSTM Model. Data Analysis and Knowledge Discovery, 2022, 6(9): 65-76.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2021.1303     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2022/V6/I9/65

The Opinion Reason Classification Model
The Different Mask Strategies of BERT and ERNIE
Prediction Model for the Next Sentence
The Structure of DERNIE Model
The Structure of BiLSTM Model
The Structure of DERNIE-BiLSTM Model
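A minimal sketch of how a DERNIE-BiLSTM classifier of the kind named in these captions could be wired, assuming PyTorch. The class name `EncoderBiLSTMClassifier`, the vocabulary size, and the embedding layer standing in for the domain ERNIE encoder are illustrative assumptions. This page does not say how the BiLSTM output is pooled, so concatenating the final forward and backward hidden states is one common choice and not necessarily the authors'. The hidden size (768), maximum sequence length (64), and dropout (0.1) follow the DERNIE-BiLSTM hyperparameter settings reported on this page.

```python
import torch
import torch.nn as nn


class EncoderBiLSTMClassifier(nn.Module):
    """Encoder features -> BiLSTM -> two-class classifier (illustrative only)."""

    def __init__(self, vocab_size=1000, hidden_size=768, num_classes=2, dropout=0.1):
        super().__init__()
        # Assumption: a plain embedding layer stands in for the domain ERNIE
        # encoder, producing one 768-dimensional vector per character.
        self.encoder = nn.Embedding(vocab_size, hidden_size, padding_idx=0)
        self.bilstm = nn.LSTM(hidden_size, hidden_size,
                              batch_first=True, bidirectional=True)
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(2 * hidden_size, num_classes)

    def forward(self, input_ids):
        features = self.encoder(input_ids)        # (batch, seq_len, hidden)
        _, (h_n, _) = self.bilstm(features)       # h_n: (2, batch, hidden)
        # Assumption: pool by concatenating the last forward and backward states.
        sentence = torch.cat([h_n[0], h_n[1]], dim=-1)   # (batch, 2 * hidden)
        return self.classifier(self.dropout(sentence))   # (batch, num_classes)


model = EncoderBiLSTMClassifier()
model.eval()
batch = torch.randint(1, 1000, (4, 64))  # 4 reviews, 64 token ids each
with torch.no_grad():
    logits = model(batch)
print(logits.shape)
```

In a real setup the embedding layer would be replaced by the pre-trained encoder's per-token output; the rest of the head would stay the same.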
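| Category | Review |
| --- | --- |
| Opinion reason sentence | 1. The staff entered the room without permission. |
| Opinion reason sentence | 2. The room is really too small; two people cannot even walk side by side. |
| Opinion reason sentence | 3. No window, very small, very damp and stuffy; the water from the air conditioner was collected in a large mineral-water bottle, and the toilet had no complete partition, which made the room even damper. But overall, we stayed one night without delaying our trip, which is already OK. |
| Non-opinion-reason sentence | 1. The overall conditions are too poor. |
| Non-opinion-reason sentence | 2. I booked it for a friend and do not know how it was. |
| Non-opinion-reason sentence | 3. There is a bathhouse downstairs, and I do not know what is upstairs; at two or three in the morning there were many footsteps going up and down the stairs, which seriously disturbed our rest. A very poor experience! |

Examples of ORSC Dataset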
| Hyperparameter | TextCNN | DERNIE | BERT-BiLSTM | ERNIE-BiLSTM | DERNIE-BiLSTM |
| --- | --- | --- | --- | --- | --- |
| Character embedding dimensions | 100 | 768 | 768 | 768 | 768 |
| Hidden dimensions | 100 | 768 | 768 | 768 | 768 |
| Max sequence length | 64 | 64 | 64 | 64 | 64 |
| Batch size | 32 | 16 | 32 | 32 | 32 |
| Learning rate | 1e-3 | 3e-5 | 5e-5 | 3e-5 | 5e-5 |
| Epochs | 6 | 11 | 7 | 13 | 20 |
| Dropout | 0.5 | 0.1 | 0.1 | 0.1 | 0.1 |

Hyperparameter Settings of the ORSC Experiment
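For readers who want to reuse these settings, the table can be captured as a small configuration object. This is only a sketch: the `TrainConfig` dataclass and its field names are hypothetical, while the values are copied from the hyperparameter table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """One column of the hyperparameter table (field names are illustrative)."""
    embedding_dim: int
    hidden_dim: int
    max_seq_len: int
    batch_size: int
    learning_rate: float
    epochs: int
    dropout: float


# Values copied from the table; keys are the model names used on this page.
CONFIGS = {
    "TextCNN": TrainConfig(100, 100, 64, 32, 1e-3, 6, 0.5),
    "DERNIE": TrainConfig(768, 768, 64, 16, 3e-5, 11, 0.1),
    "BERT-BiLSTM": TrainConfig(768, 768, 64, 32, 5e-5, 7, 0.1),
    "ERNIE-BiLSTM": TrainConfig(768, 768, 64, 32, 3e-5, 13, 0.1),
    "DERNIE-BiLSTM": TrainConfig(768, 768, 64, 32, 5e-5, 20, 0.1),
}

for name, cfg in CONFIGS.items():
    print(name, cfg)
```

Freezing the dataclass keeps a run's settings from being changed by accident after they are logged.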
Changes of Loss L in the Pre-training Process
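| Example | Sample | BERT prediction | ERNIE prediction | DERNIE prediction |
| --- | --- | --- | --- | --- |
| 1 | 很好,主动给我们介绍附近的景点。 (Very good; they proactively told us about nearby attractions.) | 服台人务 (not a word) | 朋友关系 (friendship) | 服务态度 (service attitude) |
| 2 | 卫生差, ___有小虫子咬得却都是疱 (Poor hygiene; ___ there were small bugs, and the bites left blisters all over.) | 虽然 (although) | 虽使 (not a common word) | 床上 (on the bed) |
| 3 | 极差,住的人三六九等,半夜被吵醒多次 (Extremely poor; guests of all kinds stay here, and I was woken up many times during the night.) | 睡眠 (sleep) | 环境 (environment) | 隔音 (sound insulation) |
| 4 | 硬件设施,和其他酒店差距有点大! (Hardware facilities: quite a gap compared with other hotels!) | 不般 (not a word) | 方面 (aspect) | 一般 (mediocre) |
| 5 | 位置就是离___近,卫生很差 (The location is just close to ___; hygiene is very poor.) | 酒店很 (hotel very) | 学校很 (school very) | 火车站 (railway station) |

Results of Cloze Experiment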
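The contrast between the mask strategies of BERT and ERNIE can be illustrated with a small sketch. It assumes the commonly described behaviour: for Chinese text BERT masks individual characters, whereas ERNIE can mask a whole phrase or entity as a unit. The helper functions and the word segmentation below are hypothetical, and sample 5 of the cloze experiment is used with DERNIE's prediction (火车站) filled into the blank purely for illustration.

```python
MASK = "[MASK]"


def char_level_mask(chars, positions):
    """Mask single characters at the given indices (character-level masking)."""
    return [MASK if i in positions else c for i, c in enumerate(chars)]


def span_level_mask(segments, segment_index):
    """Mask every character of one whole segment (phrase/entity-level masking)."""
    out = []
    for i, seg in enumerate(segments):
        out.extend([MASK] * len(seg) if i == segment_index else list(seg))
    return out


# Hypothetical segmentation of "位置就是离火车站近" (the location is close to the railway station).
segments = ["位置", "就是", "离", "火车站", "近"]
chars = list("".join(segments))

char_masked = char_level_mask(chars, {6})    # hides only the middle character of 火车站
span_masked = span_level_mask(segments, 3)   # hides the whole entity 火车站

print("".join(char_masked))
print("".join(span_masked))
```

With character-level masking the neighbouring characters of the entity remain visible, so the model only has to complete one character; with span-level masking it has to recover the whole entity from the surrounding context.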
| Method | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%) |
| --- | --- | --- | --- | --- |
| TextCNN | 90.81 | 90.64 | 91.07 | 90.86 |
| DERNIE | 91.33 | 92.91 | 89.55 | 91.20 |
| BERT-BiLSTM | 92.57 | 92.27 | 92.97 | 92.62 |
| ERNIE-BiLSTM | 94.10 | 93.86 | 94.40 | 94.13 |
| DERNIE-BiLSTM | 94.57 | 94.00 | 95.25 | 94.62 |

Results of ORSC Experiment
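As a quick consistency check, the F1-score column can be recomputed from the precision and recall columns using the standard harmonic-mean definition, F1 = 2PR / (P + R). The sketch below assumes that definition is the one used; the `REPORTED` dictionary simply copies the table values.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) copied from the results table.
REPORTED = {
    "TextCNN": (90.64, 91.07, 90.86),
    "DERNIE": (92.91, 89.55, 91.20),
    "BERT-BiLSTM": (92.27, 92.97, 92.62),
    "ERNIE-BiLSTM": (93.86, 94.40, 94.13),
    "DERNIE-BiLSTM": (94.00, 95.25, 94.62),
}

gaps = {}
for name, (p, r, reported_f1) in REPORTED.items():
    recomputed = f1_score(p, r)
    gaps[name] = abs(recomputed - reported_f1)
    print(f"{name}: reported {reported_f1:.2f}, recomputed {recomputed:.2f}")

max_gap = max(gaps.values())
```

Small differences in the second decimal place are expected, because the published precision and recall values are themselves rounded.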