Predicting Prices and Analyzing Features of Online Short-Term Rentals Based on XGBoost
Cao Rui¹, Liao Bin¹, Li Min¹,², Sun Ruina¹,³,⁴
¹ College of Statistics and Data Science, Xinjiang University of Finance & Economics, Urumqi 830012, China
² School of Information Science and Engineering, Xinjiang University, Urumqi 830008, China
³ Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China
⁴ School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049, China
[Objective] This paper proposes an XGBoost-based model that predicts prices and analyzes the features of online short-term rentals, addressing the lack of a reasonable pricing-suggestion mechanism for housing with different characteristics. [Methods] We collected data from the Airbnb platform and used Lasso to select features from the raw data and reduce its dimensionality. We then fed the selected features to XGBoost and trained the prediction model iteratively. Finally, we used SHAP values to interpret the contribution of each feature to the model's predictions. [Results] After hyperparameter tuning, the proposed model achieved an RMSE of 0.091, an MAE of 0.065 and an R-squared of 0.798, which were better than the results of the four existing models it was compared with. [Limitations] The model does not incorporate features from real-time online business data, which limits its prediction accuracy. [Conclusions] The proposed model has good interpretability and can identify the key factors affecting housing prices, which helps landlords improve their services.
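For readers less familiar with the three metrics reported in the results, the following is a minimal sketch of how RMSE, MAE and R-squared are conventionally computed. It is not the authors' evaluation code; the function name `regression_metrics` and the toy values are hypothetical, and the abstract does not state the scale on which prices were measured.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (rmse, mae, r2) for two equal-length sequences of numbers.

    Hypothetical helper for illustration; uses the standard definitions.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    # Residual sum of squares, shared by RMSE and R-squared.
    ss_res = sum(r * r for r in residuals)

    rmse = math.sqrt(ss_res / n)               # root mean squared error
    mae = sum(abs(r) for r in residuals) / n   # mean absolute error

    # Total sum of squares around the mean of the observed values.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    r2 = 1.0 - ss_res / ss_tot

    return rmse, mae, r2


# Toy values invented for this example (not data from the paper).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 2.0]
rmse, mae, r2 = regression_metrics(y_true, y_pred)
print(f"RMSE={rmse:.3f}  MAE={mae:.3f}  R2={r2:.3f}")
```

Lower RMSE and MAE indicate smaller prediction errors, while an R-squared closer to 1 indicates that the model explains more of the variance in the observed prices. RMSE penalizes large errors more heavily than MAE because residuals are squared before averaging.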
Cao Rui, Liao Bin, Li Min, Sun Ruina. Predicting Prices and Analyzing Features of Online Short-Term Rentals Based on XGBoost[J]. Data Analysis and Knowledge Discovery, 2021, 5(6): 51-65.