Can machine learning algorithms associated with text mining from internet data improve housing price prediction performance?
Abstract
Housing frenzies in China have attracted widespread global attention over the past few years, yet the key question is how to forecast housing prices more accurately so that effective real estate policy can be established. Exploiting the ubiquity and immediacy of Internet data, this research adopts a broader version of text mining to search for keywords related to housing prices and then evaluates their predictive ability using machine learning algorithms. Our findings indicate that this new method, especially random forest, not only detects turning points but also offers predictive ability that clearly outperforms traditional regression analysis. Overall, prediction based on online search data through a machine learning mechanism helps us better understand house price trends in China.
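To make the two evaluation ideas in the abstract concrete (forecast accuracy and turning-point detection), the following is a minimal sketch of how competing forecasts might be scored. It is not the procedure used in the paper: the function names `rmse`, `turning_points`, and `turning_point_hit_rate`, the definition of a turning point as a sign change in consecutive price differences, and all numbers are assumptions chosen purely for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("series must have equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def turning_points(series):
    """Indices where the series switches direction (local peak or trough).

    Assumption: a turning point is any interior index at which the
    preceding and following first differences have opposite signs.
    """
    points = []
    for i in range(1, len(series) - 1):
        before = series[i] - series[i - 1]
        after = series[i + 1] - series[i]
        if before * after < 0:
            points.append(i)
    return points


def turning_point_hit_rate(actual, predicted):
    """Share of actual turning points that the forecast marks at the same index."""
    actual_tp = turning_points(actual)
    if not actual_tp:
        return float("nan")
    predicted_tp = set(turning_points(predicted))
    hits = sum(1 for i in actual_tp if i in predicted_tp)
    return hits / len(actual_tp)


# Hypothetical price index and two hypothetical forecasts (made-up numbers).
actual = [100.0, 102.0, 105.0, 103.0, 101.0, 104.0]
forecast_a = [100.5, 102.5, 104.0, 103.5, 101.5, 103.0]  # follows the reversals
forecast_b = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0]  # smooth upward trend

for name, forecast in [("A", forecast_a), ("B", forecast_b)]:
    print(name, round(rmse(actual, forecast), 3),
          turning_point_hit_rate(actual, forecast))
```

In this toy setup, forecast A has both the lower error and the higher turning-point hit rate, which is the kind of joint comparison the abstract describes between machine learning methods and traditional regression.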
First published online 10 June 2020
Keywords: housing frenzies, Internet search, text mining, machine learning
This work is licensed under a Creative Commons Attribution 4.0 International License.