Air pollution modelling is a key tool that helps researchers, scientists, and urban planners support the sustainable development of the urban environment. In an age of rapid urbanization, it is critical for understanding how pollution is distributed across the modelled area. Recent updates to air quality regulations challenge state-of-the-art air pollution modelling techniques by requiring accurate predictions at high temporal resolution, i.e. at the hourly rather than the annual level. Current state-of-the-art models are designed to predict accurately at low temporal resolution by assuming that pollution is in a steady state. Predicting at higher temporal resolution violates this assumption and causes inaccurate predictions. Statistical modelling approaches to air pollution modelling exist; however, these also struggle to make accurate predictions at higher temporal resolution.

This work develops a statistical regression-based air pollution model that produces accurate predictions at high temporal resolution by using advanced regression algorithms to exploit the hidden knowledge in high-temporal-resolution data. The predictions of multiple advanced statistical regression algorithms are analysed to determine the most accurate approach, and on that basis the Random Forest Regression method is proposed for the regression task. A novel model ensemble method is then developed that combines multiple Random Forest Regression models trained on different subsets of the available input data. Motivated by the high computational requirements of the developed methods, this thesis also investigates their scalability and robustness. Based on the experience gained from this investigation, this work proposes further model ensemble methods to improve the accuracy of the statistical regression approach to air pollution modelling.
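As a rough illustration of the ensemble idea described above, the following Python sketch averages several small random-forest-style regressors, each trained on a different subset of the data. It is not the thesis's implementation: the random row-sampling scheme, the simplified tree-growing code, every function name and parameter, and the synthetic toy data are assumptions made purely for illustration.

```python
import random
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(X, y, rng, depth=0, max_depth=4, min_leaf=2):
    """Grow a small regression tree; each split tries one randomly chosen feature."""
    if depth >= max_depth or len(y) < 2 * min_leaf or len(set(y)) == 1:
        return mean(y)  # leaf: predict the mean target
    f = rng.randrange(len(X[0]))
    best = None
    for t in sorted({row[f] for row in X})[1:]:  # candidate thresholds
        left = [i for i, row in enumerate(X) if row[f] < t]
        right = [i for i, row in enumerate(X) if row[f] >= t]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        score = sse([y[i] for i in left]) + sse([y[i] for i in right])
        if best is None or score < best[0]:
            best = (score, t, left, right)
    if best is None:
        return mean(y)  # no valid split on this feature: make a leaf
    _, t, left, right = best
    return (
        f,
        t,
        build_tree([X[i] for i in left], [y[i] for i in left],
                   rng, depth + 1, max_depth, min_leaf),
        build_tree([X[i] for i in right], [y[i] for i in right],
                   rng, depth + 1, max_depth, min_leaf),
    )


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] < t else right
    return node


def fit_forest(X, y, rng, n_trees=10):
    """Fit trees on bootstrap samples (rows drawn with replacement)."""
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        trees.append(build_tree([X[i] for i in idx], [y[i] for i in idx], rng))
    return trees


def predict_forest(trees, row):
    return mean([predict_tree(t, row) for t in trees])


def fit_ensemble(X, y, n_models=3, subset_fraction=0.7, seed=0):
    """Train one forest per data subset.

    Assumption: subsets are random row samples; the abstract does not
    say how the subsets are actually chosen.
    """
    rng = random.Random(seed)
    k = max(1, int(subset_fraction * len(X)))
    models = []
    for _ in range(n_models):
        idx = rng.sample(range(len(X)), k)
        models.append(fit_forest([X[i] for i in idx], [y[i] for i in idx], rng))
    return models


def predict_ensemble(models, row):
    """Combine the member forests by averaging their predictions."""
    return mean([predict_forest(m, row) for m in models])


# Synthetic toy data (not measurements): features are [hour of day, wind speed].
X = [[h, w] for h in range(24) for w in (1.0, 2.0, 4.0)]
y = [50.0 + (20.0 if 7 <= h <= 9 else 0.0) - 5.0 * w for h, w in X]

models = fit_ensemble(X, y)
pred = predict_ensemble(models, [8, 1.0])  # hourly prediction for 08:00, low wind
```

Averaging is only one way to combine the member models; a weighted or stacked combination would fit the same structure by changing `predict_ensemble`.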

The air pollution model presented in this thesis produces more accurate hourly concentration predictions than the current state-of-the-art method and therefore offers an opportunity to better understand pollution in urban areas.

BibTeX Entry

 author = {Gabor Makrai},
 day = {1},
 month = {1},
 publisher = {University of York},
 school = {University of York},
 title = {Reducing the Errors in High Resolution Environmental Modelling},
 url = {},
 year = {2018}