Predictive data analytics for air pollutant data

Bibliographic Details
Main Author: Fu, Danli
Other Authors: Wong Kin Shun, Terence
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/163583
Institution: Nanyang Technological University
Description
Summary: Government measures and public health alerts on air pollution have created demand for air pollutant forecasting. However, predicting a single air pollutant is not comprehensive, because air pollution is caused by multiple pollutants. This project therefore uses the Air Quality Index (AQI) to identify the level of air quality. We use data provided by the Environmental Protection Department (EPD) in Hong Kong and the Hong Kong Observatory (HKO) to predict the AQI level from FSP, RSP, NOx, SO2, pressure, air temperature and dew point. Past AQI values are calculated from the major pollutants FSP, RSP, SO2 and NOx and then used to forecast the AQI level on the following day. We use both regression and classification strategies to predict the next day's air quality level. For regression, we study the autoregressive integrated moving average (ARIMA) model and the multilayer perceptron (MLP) model. For classification, we study decision tree (DT), random forest (RF) and XGBoost. The experimental results show that, in our project, identifying the level of air pollution by predicting the specific AQI value for the next day still incurs considerable error. For binary prediction, our experiments show that an imbalanced class distribution affects the accuracy on the minority class. This study also investigates feature importance in the RF and XGBoost models, which suggests that the AQI value is strongly associated with FSP, RSP, SO2 and its own value on the previous day.
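
The remark on imbalanced classes can be illustrated with a minimal sketch. It is not the project's code or data: the labels below are made up for illustration (0 for an acceptable day, 1 for a polluted day), the helper names `accuracy` and `per_class_recall` are hypothetical, and the always-predict-majority baseline stands in for any classifier that neglects the minority class. It shows how overall accuracy can look high while no minority-class day is identified.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of all samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Map each true label to the fraction of its samples predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Made-up labels for illustration only: 0 = acceptable day, 1 = polluted day.
y_true = [0] * 90 + [1] * 10

# Assumed baseline: always predict the majority class.
y_pred = [0] * 100

overall = accuracy(y_true, y_pred)
recall = per_class_recall(y_true, y_pred)

print("overall accuracy:", overall)
print("per-class recall:", recall)
```

On these made-up labels the baseline is right on 90 of 100 days overall, yet it recovers none of the 10 polluted days, which is why per-class figures are worth reporting alongside overall accuracy when the classes are imbalanced.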