5.1 Traffic Flow Forecasting
Traffic flow typically refers to the number of vehicles, crowds, or passengers passing through a particular space, such as a road segment, a sensor point deployed on the road, or a bus/subway station, in an observed time interval. Accurate traffic flow prediction helps reveal real-time traffic demands and is critical for traffic management, public safety, route planning, line scheduling, and staff preallocation.
For spatial representation of traffic flow forecasting, researchers usually utilize grid-based [33], multi-graph [35], and dynamic graph methods [55]. For instance, Zhang et al. [33] partitioned cities into regular grid maps based on geographical coordinates and organized the collected traffic data as Euclidean 2D or 3D tensors, so that CNNs can be applied to extract spatial topologies. Du et al. [85] designed a hybrid multi-modal learning method to jointly learn spatio-temporal dependencies for short-term traffic flow forecasting.
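The grid partitioning described above can be sketched as follows. This is a minimal illustration, assuming records of the form (time interval index, latitude, longitude) and a rectangular study area; the function names `build_grid_map` and `build_flow_tensor` and the toy coordinates are hypothetical and not taken from the cited works.

```python
def build_grid_map(points, lat_range, lon_range, n_rows, n_cols):
    """Count points falling into each cell of a regular n_rows x n_cols grid."""
    lat_min, lat_max = lat_range
    lon_min, lon_max = lon_range
    grid = [[0] * n_cols for _ in range(n_rows)]
    for lat, lon in points:
        if not (lat_min <= lat <= lat_max and lon_min <= lon <= lon_max):
            continue  # ignore points outside the study area
        # Clamp so that points on the upper boundary fall into the last cell.
        row = min(int((lat - lat_min) / (lat_max - lat_min) * n_rows), n_rows - 1)
        col = min(int((lon - lon_min) / (lon_max - lon_min) * n_cols), n_cols - 1)
        grid[row][col] += 1
    return grid


def build_flow_tensor(records, lat_range, lon_range, n_rows, n_cols, n_intervals):
    """Stack one grid map per time interval into a (T, H, W) nested list."""
    per_interval = [[] for _ in range(n_intervals)]
    for t, lat, lon in records:
        if 0 <= t < n_intervals:
            per_interval[t].append((lat, lon))
    return [
        build_grid_map(points, lat_range, lon_range, n_rows, n_cols)
        for points in per_interval
    ]


# Toy records: (interval index, latitude, longitude) in a unit square.
records = [(0, 0.1, 0.1), (0, 0.9, 0.9), (1, 0.1, 0.2), (1, 0.15, 0.1), (1, 1.0, 1.0)]
tensor = build_flow_tensor(records, (0.0, 1.0), (0.0, 1.0), 2, 2, 2)
```

The resulting nested list has the image-like layout (time, rows, columns) that a CNN expects, which is why grid-based methods can reuse standard convolutional architectures.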
These grid-based methods are suitable for predicting region-level traffic data but cannot model non-linear graph structures. Therefore, several multi-graph fusion methods have been proposed to learn both physical and semantic spatial information. For example, PVCGN [35] effectively captures complex ridership correlations from tailored traffic graphs. Specifically, a physical graph was constructed directly from the real-world station topology. In addition, a similarity graph and a correlation graph were designed as complementary graphs to reveal the similarity and correlation of inter-station passenger flow. PVCGN incorporated the static physical graph and the two virtual graphs into a graph convolution gated recurrent unit (GC-GRU) to learn spatio-temporal relations and applied a fully connected gated recurrent unit (FC-GRU) to model global evolution information. Finally, a Seq2Seq model with GC-GRU and FC-GRU was employed for metro ridership forecasting.
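To make the distinction between a physical graph and a virtual graph concrete, the sketch below builds one of each for three stations. It is an illustration only: the Pearson coefficient and the 0.8 threshold are assumptions chosen for simplicity, not the measures used in PVCGN, and the station links and flow series are made up.

```python
from math import sqrt


def physical_graph(n_stations, links):
    """Symmetric 0/1 adjacency built from real-world station connections."""
    adj = [[0.0] * n_stations for _ in range(n_stations)]
    for i, j in links:
        adj[i][j] = adj[j][i] = 1.0
    return adj


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def correlation_graph(flows, threshold=0.8):
    """Virtual graph: connect stations whose flow series correlate strongly.

    The threshold is a hypothetical hyperparameter.
    """
    n = len(flows)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = pearson(flows[i], flows[j])
            if r >= threshold:
                adj[i][j] = adj[j][i] = r
    return adj


# Stations 0-1 and 1-2 are physically linked; 0 and 2 are not.
links = [(0, 1), (1, 2)]
# Toy historical passenger flow per station.
flows = [[10, 20, 30, 40], [40, 30, 20, 10], [5, 9, 16, 20]]

phys = physical_graph(3, links)
corr = correlation_graph(flows)
```

Here stations 0 and 2 share no physical link, yet the virtual graph connects them because their flow patterns move together, which is the kind of complementary information the multi-graph methods aim to exploit.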
This multi-graph approach relies on predefined spatial dependencies that are based on prior knowledge. However, the spatial relationships in traffic data evolve constantly across time steps. Therefore, Xie et al. [55] designed a dynamic graph convolutional network to learn spatial features, leveraged a transformer to obtain long-range temporal information, and employed gated fusion to combine spatial features and temporal dependencies for urban subway station passenger flow forecasting.
The typical methods for temporal representation of traffic flow forecasting are RNNs and their variants for short-term traffic flow prediction [35, 85, 86] and self-attention-based methods for long-term traffic flow prediction [55]. Li et al. [86] utilized a GNN and a residual LSTM for traffic flow prediction. First, they calculated the correlation coefficient with min-max normalization to remove spatial heterogeneity. Then, they employed a z-score transformation to eliminate daily periodicity and obtain stronger temporal auto-correlation.
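The two preprocessing steps mentioned above can be sketched as follows. This is one plausible reading rather than the cited authors' procedure: it assumes min-max scaling is applied per station series and that the z-score is computed per time-of-day slot across days. The helper names and the toy series are hypothetical.

```python
from statistics import mean, pstdev


def min_max_normalize(series):
    """Rescale a series to [0, 1] so stations with different volumes are comparable."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(v - lo) / (hi - lo) for v in series]


def remove_daily_periodicity(series, slots_per_day):
    """Z-score each value against the same time-of-day slot on all days.

    Trailing values that do not fill a complete day are dropped.
    """
    n_days = len(series) // slots_per_day
    slot_stats = []
    for s in range(slots_per_day):
        values = [series[d * slots_per_day + s] for d in range(n_days)]
        slot_stats.append((mean(values), pstdev(values)))
    out = []
    for idx in range(n_days * slots_per_day):
        mu, sigma = slot_stats[idx % slots_per_day]
        out.append((series[idx] - mu) / sigma if sigma > 0 else 0.0)
    return out


# Toy series: 3 days, 2 slots per day (off-peak, peak).
flow = [10, 100, 12, 104, 14, 108]
scaled = min_max_normalize(flow)
deseasonalized = remove_daily_periodicity(flow, slots_per_day=2)
```

After the slot-wise z-score, the off-peak and peak values of the same day map to the same standardized value in this toy series, so the daily swing no longer dominates and what remains is the day-to-day trend.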
In addition, some researchers regard traffic flow prediction as a service-level/line-level task [88, 89]. For example, Luo et al. [89] proposed a spatio-temporal hashing multi-graph convolution network, in which two types of sub-graphs were constructed from the perspectives of physical adjacency and semantic similarity, respectively. This model explicitly captures spatio-temporal dependencies among bus stations/lines. Luo et al. [88] designed the MDL-SPFP model to jointly predict the arriving bus service flow, line-level on-board passenger flow, and line-level boarding/alighting passenger flow. The model combines three modules (an attention mechanism, a residual block, and multi-scale convolution) to capture complex non-linear spatio-temporal dependencies.
Below, we discuss the differences among passenger flow forecasting tasks at the station, line, and bus service levels. Station-level forecasting predicts the passenger flow of each station without distinguishing among lines. Line-level forecasting distinguishes the passenger flow of different stations on different lines, which is more fine-grained. Bus service-level forecasting further distinguishes the passenger flow of different stations, lines, and vehicles. Therefore, ridership prediction at the line/bus service level incorporates additional prior knowledge, such as specific line and vehicle information. Moreover, in spatio-temporal modeling, intermediate hubs and their influence are also considered.
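The three granularities can be illustrated by aggregating the same records under different keys. The record layout (station, line, vehicle, passengers) and all identifiers below are hypothetical and serve only to show how the prediction target changes with the level.

```python
from collections import defaultdict

# Hypothetical boarding records: (station, line, vehicle, passengers).
records = [
    ("S1", "L1", "bus_a", 5),
    ("S1", "L1", "bus_b", 3),
    ("S1", "L2", "bus_c", 4),
    ("S2", "L1", "bus_a", 2),
]


def aggregate(records, key_fn):
    """Sum passenger counts under the key produced by key_fn."""
    totals = defaultdict(int)
    for station, line, vehicle, count in records:
        totals[key_fn(station, line, vehicle)] += count
    return dict(totals)


# Station level: one target per station, lines are not distinguished.
station_level = aggregate(records, lambda s, l, v: s)
# Line level: one target per (station, line) pair.
line_level = aggregate(records, lambda s, l, v: (s, l))
# Bus service level: one target per (station, line, vehicle) triple.
service_level = aggregate(records, lambda s, l, v: (s, l, v))
```

Moving from station level to service level increases the number of targets while each target is backed by fewer observations, which is one reason the finer levels benefit from additional prior knowledge about lines and vehicles.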
5.2 Traffic Speed Forecasting
Traffic speed is generally defined as the average speed of vehicles passing through a certain location, such as a sensor point, in the observed time interval. Because this task concerns vehicles, researchers develop spatio-temporal modeling methods to predict vehicle speed, which is beneficial for travel planning.
For the spatial representation of traffic speed forecasting, sensors installed on highways are irregularly distributed, and the sensors are connected through a graph structure. Thus, scholars generally use graph neural networks to capture the spatial correlation for traffic speed prediction. For the temporal representation of traffic speed forecasting, TCN-based [44] and causal TCN-based methods [45, 46] have been proposed to capture temporal features with different receptive fields. For instance, Liu et al. [43] proposed a bidirectional diffusion convolution framework to model spatial dependency and employed a sequence-to-sequence architecture with GRU to extract temporal dependency for traffic speed forecasting. This is a pioneering work on traffic speed prediction based on graph neural networks. Afterward, researchers established different models based on the distinct characteristics of traffic speed data. To learn spatio-temporal relations synchronously, STSGCN [47] constructed a spatio-temporal synchronous modeling mechanism to learn localized spatio-temporal correlations and designed multiple modules for different time periods to model spatio-temporal heterogeneity for traffic speed forecasting.
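The bidirectional diffusion convolution mentioned above is commonly written as a sum of K random-walk steps along the forward and reverse edge directions. The sketch below follows that common formulation for a single scalar signal per sensor; the weights `theta_fwd` and `theta_bwd`, which would be learned in practice, are fixed by hand here, and the three-sensor road graph is a toy assumption.

```python
def row_normalize(w):
    """Divide each row by its sum to get a random-walk transition matrix."""
    out = []
    for row in w:
        s = sum(row)
        out.append([v / s if s > 0 else 0.0 for v in row])
    return out


def transpose(w):
    return [list(col) for col in zip(*w)]


def matvec(m, x):
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def diffusion_conv(w, x, theta_fwd, theta_bwd):
    """Compute sum_k theta_fwd[k] * P_f^k x + theta_bwd[k] * P_b^k x.

    P_f is the row-normalized adjacency (downstream diffusion) and
    P_b is the row-normalized transposed adjacency (upstream diffusion).
    """
    p_fwd = row_normalize(w)
    p_bwd = row_normalize(transpose(w))
    out = [0.0] * len(x)
    x_f, x_b = list(x), list(x)
    for t_f, t_b in zip(theta_fwd, theta_bwd):
        out = [o + t_f * a + t_b * b for o, a, b in zip(out, x_f, x_b)]
        x_f = matvec(p_fwd, x_f)  # one more forward diffusion step
        x_b = matvec(p_bwd, x_b)  # one more backward diffusion step
    return out


# Directed road graph 0 -> 1 -> 2 with one speed reading per sensor.
w = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
x = [1.0, 2.0, 3.0]
y = diffusion_conv(w, x, theta_fwd=[0.5, 1.0], theta_bwd=[0.5, 1.0])
```

Using both directions lets each sensor aggregate information from upstream and downstream neighbors, which matters on directed road networks where the adjacency matrix is not symmetric.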
Because shallow GNNs are incapable of capturing long-range spatial correlations, Fang et al. [23] developed a tensor-based ordinary differential equation (ODE) network to model spatio-temporal dependencies for traffic speed forecasting. This work applied deeper networks to learn spatio-temporal features synchronously, constructed a semantic adjacency matrix to obtain spatial features, and designed a temporal dilated convolution structure to extract long-term temporal dependencies. To obtain dynamic spatio-temporal dependencies, Lu et al. [90] combined the graph sequence neural network with a horizontal attention mechanism and a vertical attention mechanism to process graph sequences for traffic speed prediction.
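To show why dilated convolutions help with long-term temporal dependencies, the sketch below stacks three generic dilated layers and checks the resulting receptive field. It is not the structure used by Fang et al.; the causal, left-zero-padded variant, the kernel of ones, and the dilation schedule (1, 2, 4) are assumptions made for illustration.

```python
def dilated_causal_conv(x, kernel, dilation):
    """y[t] = sum_i kernel[i] * x[t - i * dilation], zero-padded on the left."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for i, weight in enumerate(kernel):
            idx = t - i * dilation
            if idx >= 0:
                acc += weight * x[idx]
        y.append(acc)
    return y


def receptive_field(kernel_size, dilations):
    """Number of input steps that can influence one output of the stack."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Toy speed series with 8 time steps.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
dilations = (1, 2, 4)

h = x
for d in dilations:
    h = dilated_causal_conv(h, [1.0, 1.0], d)

field = receptive_field(2, dilations)
```

With a kernel of size 2, three layers already cover 8 input steps, whereas three undilated layers would cover only 4. With the all-ones kernel, the last output equals the sum of all 8 inputs, which confirms that every step in the window contributes.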
5.3 Traffic Demand Forecasting
Traffic demand is the number of passengers with pick-up or drop-off demands for services such as ride-sharing, taxi, or bike-sharing in a particular region during the observed time interval. Accurate traffic demand prediction can help guide the efficient allocation of supply.
For the spatial representation of traffic demand forecasting, scholars usually employ grid-based [85], multi-graph fusion [36], and dynamic graph methods [58]. For the temporal representation of traffic demand forecasting, the general methods are RNNs and their variants [36, 91]. Early on, Du et al. [91] designed a dynamic transition CNN to obtain spatial distributions and to learn dynamic demand evolution for traffic demand forecasting. This grid-based traffic demand prediction method utilizes deep learning to model spatial and temporal correlations. Then, Geng et al. [36] constructed a neighborhood graph, a functional similarity graph, and a transportation connectivity graph to learn non-Euclidean spatial dependencies. This work is built on a grid-based spatial structure. The neighborhood graph was designed based on spatial proximity. The functional similarity graph was built from point-of-interest similarity vectors. The transportation connectivity graph was constructed based on connections through motorways, highways, or public transportation. The model utilized a GNN to learn features from the three graphs, fused the outputs, and applied a contextual gated RNN to model temporal features for ride-hailing demand forecasting. This study employs various predefined geographic adjacency and functional graphs to represent complex spatial semantic information, but it ignores dynamic spatial and temporal modeling. Therefore, Huang et al. [58] developed a dynamic spatial-temporal GNN model for the traffic demand prediction task.
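Two of the predefined region graphs discussed above, the neighborhood graph and the functional similarity graph, can be sketched for a small grid as follows. The 8-neighbor rule, the use of cosine similarity, and the toy point-of-interest (POI) count vectors are assumptions for illustration and may differ from the construction in the cited work.

```python
from math import sqrt


def neighborhood_graph(n_rows, n_cols):
    """Connect each grid cell to its (up to 8) surrounding cells."""
    n = n_rows * n_cols
    adj = [[0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr or dc) and 0 <= rr < n_rows and 0 <= cc < n_cols:
                        adj[r * n_cols + c][rr * n_cols + cc] = 1
    return adj


def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def functional_similarity_graph(poi_vectors):
    """Weight each region pair by the similarity of their POI profiles."""
    n = len(poi_vectors)
    return [
        [cosine(poi_vectors[i], poi_vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]


# A 1 x 3 strip of regions; POI vectors count (offices, restaurants).
neighbors = neighborhood_graph(1, 3)
poi = [[10, 0], [0, 5], [8, 1]]
similarity = functional_similarity_graph(poi)
```

Regions 0 and 2 are not spatial neighbors, but their POI profiles are nearly identical, so only the functional similarity graph links them. Both graphs are fixed once constructed, which is the limitation that motivates the dynamic graph methods.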