Air pollution, a critical issue stemming from economic growth and industrial development, leads to the release of harmful gases and particles into the air. These pollutants are key contributors to smog, acid rain, and the greenhouse effect, which in turn cause adverse weather conditions, global temperature changes, and detrimental effects on ecosystems [1]. Furthermore, air pollution poses substantial health risks, including respiratory disorders, lung cancer, and exacerbation of conditions like asthma and allergies [2]. Urbanization amplifies these challenges in densely populated areas.
The role of AIoT (Artificial Intelligence of Things) has become increasingly significant in understanding and managing air pollution, particularly in monitoring and analyzing carbon emissions. AIoT technologies enable more sophisticated and comprehensive monitoring of air quality by integrating automated monitoring stations with advanced data analytics [3]. This integration facilitates real-time tracking of pollutant emissions, offering granular insights into their sources and dispersion patterns. Air quality forecasting, crucial for effective pollution control, has evolved with the application of AIoT. This approach enhances traditional statistical models, including linear and nonlinear machine learning algorithms [4], with deep learning techniques. These AI-driven models excel in extracting relevant features and capturing temporal dependencies in air quality data [5, 6]. The explosion of time series data, fueled by technological advancements, industrialization, and the proliferation of sensors, has transformed this field. Time series data, a chronological sequence of data points, is instrumental in various aspects of life, including environmental monitoring [7, 8]. AIoT technologies significantly enhance the utility of time series data by enabling more efficient processing and analysis, leading to better predictions and understanding of air quality trends.
Advanced forecasting techniques, augmented by AIoT, provide critical insights for researchers and policymakers. These insights are essential for designing effective pollution management and mitigation strategies, improving the understanding of the complex interplay between human activities, atmospheric conditions, and the environment. The proposed EEMD-CEEMDAN-GCN model, enhanced by AIoT, addresses the challenges in accurately capturing the dynamics of air quality data. It represents a breakthrough in the field of air quality forecasting, offering a sophisticated approach to dealing with the intricacies of atmospheric data, with a particular focus on carbon emission monitoring. This model’s integration with AIoT technologies marks a significant advancement in emissions monitoring, setting a new standard for accuracy and efficiency in environmental management and sustainability efforts.
Literature review
Time series data prediction has become a crucial aspect of extensive data analysis, profoundly impacting the transformation and optimization of various living and industrial activities, especially in the context of AIoT (Artificial Intelligence of Things) and emissions management [9, 10]. The integration of AIoT in these analyses significantly enhances the ability to monitor and predict environmental factors, particularly emissions, with greater accuracy and efficiency.
In the realm of emissions forecasting, Patra [11] utilized multi-layer perceptron (MLP), support vector regression (SVR), and autoregressive integrated moving average (ARIMA) algorithms for one-month carbon dioxide and nitrogen dioxide predictions. This study used the public Air Quality database from the UCI Machine Learning Repository [12], combined with data from 5 metal oxide chemical detectors in an Air Quality Chemical Multisensory Device, producing 390 instances of daily mean responses. The AIoT framework plays a crucial role here, enabling the integration and analysis of diverse data sources for more precise emissions predictions.
Bekkar et al. [13] proposed a GCN-based artificial intelligence framework, leveraging AIoT in smart city contexts for enhanced air pollution forecasting. This approach utilizes IoT data to provide real-time, accurate predictions of air quality, focusing on emissions monitoring and control. Feng H et al. [14] introduced an encoder-decoder model based on deep learning to address data gaps in air quality and meteorological series, with a focus on South Korea. This model, enhanced by AIoT capabilities, significantly improves the prediction of air quality, particularly emissions data, by handling missing values more effectively.
Waseem K H et al. [15] compared the performance of RNN, GCN, and gated recurrent unit (GRU) [16] networks for air pollution prediction using AirNet data, with an emphasis on emissions. The integration of these models with AIoT technologies facilitates more sophisticated analysis and forecasting of pollution levels. Qi Z et al. [17] developed a deep air adaptable technique that combines feature selection with a spatiotemporal semi-supervised neural network, an approach that can greatly benefit from AIoT in capturing the dynamic nature of emissions data across different geographies. Ahmed M et al. [18] estimated the Air Quality Index (AQI) using an RNN-GCN model, an approach that can be significantly enhanced by incorporating AIoT for real-time emissions monitoring and prediction. Lastly, Masih [19] reviewed machine learning techniques in environmental science and engineering research, highlighting the importance of ensemble learning, linear regression, neural networks, and SVM for pollution estimation and forecasting tasks. The integration of these techniques with AIoT technologies presents a significant advancement in emissions monitoring, offering more accurate, efficient, and timely predictions, which is crucial for effective environmental management and policy-making.
These studies evaluate model performance using metrics such as the root mean square error (RMSE), the mean absolute error (MAE), and the coefficient of determination (R²), demonstrating the effectiveness of deep learning approaches in air pollution estimation and forecasting.
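For reference, the three metrics can be computed as in the minimal sketch below. It uses the standard textbook definitions of RMSE, MAE, and the coefficient of determination; the function names and the sample values are illustrative assumptions and are not taken from the cited studies.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted AQI values, for illustration only.
observed = [50.0, 60.0, 70.0, 80.0]
predicted = [52.0, 58.0, 73.0, 79.0]

print(f"RMSE={rmse(observed, predicted):.3f}")  # RMSE=2.121
print(f"MAE={mae(observed, predicted):.3f}")    # MAE=2.000
print(f"R2={r2(observed, predicted):.3f}")      # R2=0.964
```

Lower RMSE and MAE indicate smaller errors, with RMSE penalizing large deviations more heavily, while an R² closer to 1 indicates that the predictions explain more of the variance in the observations.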
According to the relevant literature, developing a numerical model is challenging because meteorological systems are complex and uncertain, which results in low forecast accuracy. The ability of statistical models to predict irregular sequence data depends on the consistency of the data. When approximating nonlinear sequential data, machine-learning and deep-learning techniques offer adaptive skills and advantages. However, they still struggle to learn from nonstationary data and to achieve high prediction accuracy on it. Furthermore, when neural networks are applied directly to the Air Quality Index (AQI), a composite measure that inherits the fluctuation and variability of the meteorological system, these characteristics degrade the prediction models and result in low accuracy.
To overcome these limitations, researchers have explored integrated prediction models that enhance forecast stability and accuracy. An empirical mode decomposition approach, specifically complementary ensemble empirical mode decomposition combined with SVR (CEEMD-SVR) [20], has shown promising results in predicting PM2.5 mass concentration. The CEEMD-Elman model [21], which relies on empirical mode decomposition (EMD), has provided a foundation for successful AQI trend prediction. Techniques that combine complementary ensemble EMD [22] with GCN neural networks have been proposed for enhancing short-term power load prediction accuracy. A combined air velocity estimation process [23] based on ensemble empirical mode decomposition (EEMD) has been developed to mitigate the mode mixing of EMD. Additionally, combining the GCN model with signal decomposition techniques, such as EMD, has proven effective in improving prediction accuracy for various applications, including hourly concentration prediction [24].
The study underscores the importance of meticulously analyzing and preprocessing the original data before developing predictive models, a process significantly enhanced by AIoT (Artificial Intelligence of Things) technologies. EEMD (Ensemble Empirical Mode Decomposition) and its advanced iteration, CEEMDAN (Complete Ensemble Empirical Mode Decomposition with Adaptive Noise) [25, 26], are identified as algorithms adept at tackling these preprocessing challenges. When combined with AIoT, these methods can handle large-scale data from various IoT sensors, ensuring more refined and accurate data preparation for subsequent analysis. Hybrid models that merge EEMD or CEEMDAN with deep learning techniques have demonstrated increased accuracy in applications like financial time series forecasting and short-term stock price trend prediction [26]. The integration of these models with AIoT platforms enables the handling of vast and complex data sets, typical in financial markets, enhancing prediction accuracy and reliability.
GCN (Graph Convolutional Network) is recognized as an effective strategy for predicting chaotic time series [27–29], and its application within an AIoT framework allows for more sophisticated analysis of data characterized by high volatility and unpredictability. This is particularly useful in industries where data is influenced by a multitude of interconnected factors, such as energy or traffic management. The “decomposition before reconstruction” paradigm, primarily utilizing EEMD and CEEMDAN [30], has proven successful in various forecasting domains, including PM2.5 prediction and long-term stream-flow forecasting. When these decomposition methods are applied in conjunction with AIoT, they bring additional benefits such as the ability to process large volumes of environmental data, effectively overcoming mode mixing issues and achieving low reconstruction errors. This makes them particularly suitable for time series decomposition in studies where IoT devices are used for environmental monitoring and data collection. The integration of these methods with AIoT technologies thus enhances the overall efficiency and accuracy of time series analysis, especially in applications requiring high precision, such as environmental monitoring and resource management.
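The “decomposition before reconstruction” idea can be illustrated with a short sketch. It rests on loudly stated assumptions: a simple moving-average split stands in for EEMD/CEEMDAN (it is not an EMD variant and extracts no intrinsic mode functions), and a persistence forecast stands in for the per-component learner. All function names and the sample series are hypothetical.

```python
def moving_average(series, window):
    """Centered moving average; the window is truncated at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


def decompose(series, windows=(3, 7)):
    """Stand-in for EEMD/CEEMDAN (assumption): additive split, fast to slow.

    Returns len(windows) detail components plus a final residual trend.
    By construction the components sum back to the original series.
    """
    components, remainder = [], list(series)
    for window in windows:
        smooth = moving_average(remainder, window)
        components.append([r - s for r, s in zip(remainder, smooth)])
        remainder = smooth
    components.append(remainder)
    return components


def forecast_component(component):
    """Placeholder per-component model (assumption): persistence forecast."""
    return component[-1]


def forecast_next(series):
    """Decompose, forecast each component separately, then sum the forecasts."""
    return sum(forecast_component(c) for c in decompose(series))


# Hypothetical hourly PM2.5 readings, for illustration only.
pm25 = [35.0, 38.0, 42.0, 40.0, 37.0, 45.0, 50.0, 48.0, 44.0, 46.0]
components = decompose(pm25)
reconstructed = [sum(values) for values in zip(*components)]
next_value = forecast_next(pm25)
```

With the persistence placeholder, the summed forecast simply equals the last observation. Any benefit of the paradigm would come from replacing the placeholder with a model trained separately on each component, which is smoother and easier to fit than the raw series.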
Additionally, a number of investigators have recently highlighted the benefits of graph neural networks in fields such as traffic flow estimation [31–34], parking availability prediction [35], and pedestrian trajectory prediction [36, 37]. These benefits have also been applied to other domains, such as air quality, where several researchers have employed graph neural networks for forecasting. Using records for the Beijing-Tianjin-Hebei and Pearl River Delta urban areas, Han et al. [38] put forward a Self-Supervised Hierarchical Graph Neural Network, based on a hierarchical graph of cities, functional zones, and regions, to perform fine-grained air quality prediction. To predict the Air Quality Index (AQI), Ram et al. [39] proposed a Dual GCN (DGCN) and LSTM network combined with a wireless sensor network and the Internet of Things (IoT). The DGCN processes the sensor data, which is then processed by the graph LSTM [40–43].
The existing literature on air quality prediction models reveals a substantial research gap in effectively addressing the unpredictable and nonlinear nature of atmospheric conditions, leading to challenges in achieving accurate predictions. While statistical, machine learning, and deep learning models have been extensively employed, limitations persist. To bridge this gap, a novel approach is introduced that integrates empirical mode decomposition (EMD) techniques, specifically ensemble empirical mode decomposition (EEMD) and complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), with the Graph Convolutional Network (GCN). GCN is used to capture the topological structure of the whole monitoring network. This hybrid model, EEMD-CEEMDAN-GCN, aims to overcome the shortcomings of traditional models and enhance prediction accuracy in the dynamic field of air pollution forecasting. The proposed model’s application to real-world datasets, specifically the Air Quality dataset, substantiates its superiority over existing methods such as GCN, EMD-GCN, EEMD-GCN, CEEMDAN-GCN, and EMD-CEEMDAN-GCN.
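To make concrete how a GCN can encode the topology of a monitoring network, the sketch below performs one graph-convolution step in the widely used form ReLU(D^{-1/2}(A + I)D^{-1/2} H W), where A is the station adjacency matrix, D is the degree matrix of A + I, H holds the station features, and W is a weight matrix. Whether the proposed model uses exactly this propagation rule is an assumption, since this section does not specify it; the three-station graph, the feature values, and the weights are hypothetical.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a symmetric adjacency matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(adj, features, weights):
    """One propagation step: ReLU(A_norm @ H @ W)."""
    propagated = matmul(matmul(normalize_adjacency(adj), features), weights)
    return [[max(0.0, value) for value in row] for row in propagated]


# Hypothetical network: three stations in a chain (0 - 1 - 2).
adjacency = [[0.0, 1.0, 0.0],
             [1.0, 0.0, 1.0],
             [0.0, 1.0, 0.0]]
station_features = [[10.0], [20.0], [30.0]]  # one pollutant reading per station
layer_weights = [[1.0]]                      # identity weight, for readability

hidden = gcn_layer(adjacency, station_features, layer_weights)
```

Each output row mixes a station's own reading with those of its direct neighbors, so stacking such layers lets information spread across the monitoring network along its edges.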