NEURAL NETWORK MODEL FOR PREDICTING PASSENGER CONGESTION TO OPTIMIZE TRAFFIC MANAGEMENT FOR URBAN PUBLIC TRANSPORT

The development of public transport in cities is an effective way to reduce congestion in the road network and, as a result, to increase the speed of passenger transportation. Improving the quality of urban bus services helps attract more passengers. Bus intervals are calculated once for each route line individually, based on the average congestion of passengers at the stops. A sudden accumulation of a large number of passengers at bus stops, in turn, means that not all passengers can depart in a timely manner, which causes them concern. This is one of the factors that reduces the quality of passenger transport services. The aim of the study is to develop a model for predicting the congestion of passengers at bus stops to optimize traffic management of urban public transport. Materials and methods. This article presents a neural network model for predicting passenger congestion at bus stops. It takes into account the spatio-temporal characteristics of bus traffic. Results. The developed model for predicting passenger congestion at bus stops was tested on real data from bus route 3 (Dushanbe, Tajikistan). The model predicted passenger traffic (the number of passengers at bus stops) with an accuracy of 72% to 74.5% of the actual number of passengers at bus stops. Conclusion. The proposed method, in contrast to other methods, makes it possible to automatically adapt the forecasting model to the changing conditions of the route line. The method is universal and can be used for other route lines (bus stops). It does not require much time to reconfigure.

pre-holidays. Unevenness across the hours of the day is characterized by a sharp increase in the number of passengers during the peak hours preceding the start and end of work, as well as at the opening and closing of entertainment venues [1].
To satisfy passenger demand for bus travel as fully as possible, dynamic route schedules for bus movement must be developed. This requires a mathematical model of passenger traffic on the route line, from which alternative bus schedules can be built using forecast data.
Chinese scientists (Rui Xue, Daniel Sun and Shukai Chen) [2] developed a model for the short-term prediction of passenger traffic based on time series of historical data. The authors propose an interactive multiple model (IMM) filtering algorithm that combines the forecasts of time series models into a hybrid method for short-term forecasting of passenger traffic (weekly, daily, and 15-minute time series). The IMM algorithm is suitable for predicting traffic conditions in real time. However, in the course of that study it was not possible to build accurate models that account for seasonal fluctuations on the basis of this algorithm, which led to a serious lag in the forecasts, and the IMM-based approach was not extended to much shorter-term forecasts (for example, 3 minutes and 5 minutes) [3][4][5][6].
For short-term forecasting of passenger traffic, a model based on convolutional neural networks (CNN) has also been developed [7][8][9]. This model uses video recordings from the surveillance camera in the bus interior as input. However, using recordings from an in-bus surveillance camera is not always possible and requires coordination with the city services responsible for public safety.
Within the framework of the Smart Nation project, Singaporean scientists are working on a model for predicting passenger traffic (crowds at stopping stations) based on GPS data from citizens' mobile phones [10]. This approach is unacceptable in most countries of the world, however, since privacy law does not allow the use of citizens' confidential data (movement coordinates).
Classic models for forecasting passenger traffic are outdated and cannot account for the rapidly changing rate of passenger traffic at different stopping stations and times. Modern intelligent models, while still evolving, perform much better than classic models and look promising. The purpose of this work is to develop a neural network model for predicting the accumulation of passengers at stopping points and to describe the results of its testing, with the goal of optimizing urban public transport traffic control. Implementing the developed model on the example of public transport in Dushanbe makes it possible to optimize the dynamic schedule of buses on a given route.

Materials and method
In the city of Dushanbe, a non-cash payment system has been in use in public transport since 2017. The system works like other well-known cashless payment systems, except that passengers board the bus only through the front door and exit through the middle and rear doors. A validator is installed at the entrance, so passengers can conveniently pay their fare as they board. The validator reads both a special travel card and a QR code displayed on the phone screen by an installed application (Fig. 1).
This work uses the example of public transport (buses) on route line No. 3A (running along the line Vokzal - Marom and Marom - Vokzal). The names of the stopping stations and their conditional numbering are given in Table 1. The conditional numbering of the stopping stations is introduced for simplicity.
Table 1. The names of the stopping stations with indication of the conditional numbering
Buses on route line No. 3A run intensively from 05:00 to 23:06 from the stopping station "Railway Station" to "Terminal - Marom" and back. The traffic map of route line No. 3A is shown in Fig. 2. The length of this route is 11.1 km. The number of vehicles on this route, in accordance with demand, is 24 units. Table 2 shows a heat map of the number of passengers depending on the time of day for route line No. 3A (according to the reports of the validators [24] installed in the bus interior).
In this work, daily reports from the validators are used as the data set. The reports show the time of each transaction (payment) and the unique identification number of the passenger. Based on these data, we prepared a sample consisting of the following features: X1 is the time of arrival (boarding) of the bus; X2 is the number of passengers who boarded at each stop; X3 is the departure time of the previous bus; X4 is the date (as the day number within the year); and X5 to X11 are the days of the week (in binarized form [11]).
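The feature list above can be sketched as a small function that assembles the 11-element input vector for one bus arrival. This is only an illustration under stated assumptions: the function name `build_features`, the choice of minutes since midnight as the time unit, the Monday-first ordering of the binarized weekday, and the sample values are all hypothetical and are not taken from the validator reports.

```python
from datetime import datetime


def minutes_since_midnight(moment):
    """Convert a datetime to minutes elapsed since 00:00 of the same day."""
    return moment.hour * 60 + moment.minute


def build_features(arrival, boarded, prev_departure):
    """Assemble the feature vector X1..X11 for one bus arrival at a stop."""
    x1 = minutes_since_midnight(arrival)          # X1: arrival time (assumed unit: minutes)
    x2 = boarded                                  # X2: passengers who boarded at the stop
    x3 = minutes_since_midnight(prev_departure)   # X3: departure time of the previous bus
    x4 = arrival.timetuple().tm_yday              # X4: day number within the year
    weekday = [0] * 7                             # X5..X11: binarized day of the week
    weekday[arrival.weekday()] = 1                # assumed ordering: Monday first
    return [x1, x2, x3, x4] + weekday


# Hypothetical record: a bus arriving at 07:45 on Monday, 2 March 2020.
features = build_features(
    arrival=datetime(2020, 3, 2, 7, 45),
    boarded=12,
    prev_departure=datetime(2020, 3, 2, 7, 33),
)
print(features)
```

One such vector would be produced per bus arrival and per stopping station, giving one sample table per station.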
Since the route line under consideration (No. 3A) has 20 stopping stations, we have a separate sample for each stopping station. Table 3 shows a fragment of the data sample.
Scaling the values of the elements of the dataset. Neural networks work poorly with data items whose values have different ranges. For example, in our case the bus time can vary from 300 minutes to 1440 minutes. Such heterogeneity of the data can complicate the learning process. It is customary to apply normalization to such heterogeneous data [11]. Normalization is a broad category of methods that seek to make the similarity of different samples more visible to machine learning models, which helps the model identify patterns and generalize to new data. The most common form of normalization is suitable for this task: for each feature in the input data (a column in the input data matrix), the mean $\bar{X}_i$ of that feature is subtracted from each value $X$ of the data element, and the difference is divided by the standard deviation $\sigma$ of that feature. As a result, the feature is centered at zero and has a standard deviation of one:
$$X' = \frac{X - \bar{X}_i}{\sigma}$$
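The scaling step described above can be sketched as follows. The helper name `standardize_columns` and the sample values are hypothetical; the guard for constant columns is an added assumption, since a binarized feature that never changes in a small sample would otherwise cause a division by zero.

```python
from statistics import mean, pstdev


def standardize_columns(rows):
    """Center every feature column at zero and scale it to unit standard deviation."""
    columns = list(zip(*rows))
    means = [mean(column) for column in columns]
    # Assumption: a constant column keeps a divisor of 1.0 instead of 0.
    stds = [pstdev(column) or 1.0 for column in columns]
    scaled = [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]
    return scaled, means, stds


# Hypothetical sample: [bus time in minutes, passengers boarded]
sample = [[300, 4], [720, 10], [1440, 1]]
scaled, means, stds = standardize_columns(sample)
print(scaled)
```

In practice the means and standard deviations would be computed on the training data only and then reused unchanged for the test data, so that no information from the test set leaks into training.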

Artificial neural networks
Modeling with neural networks is a machine learning method that, compared to other algorithms, has a large number of adjustable parameters, which makes it possible to approximate nonlinear functions very accurately. The neuron functioning diagram is shown in Fig. 3.

Fig. 3. Scheme of neuron functioning
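In mathematical representation, the functioning of neuron k (see Fig. 3) can be described by the following pair of equations, given here in their standard form:
$$U_k = \sum_{j=1}^{m} w_{kj} X_j, \qquad Y_k = \varphi(U_k + b_k),$$
where $X_j$ are the input signals; $w_{kj}$ are the synaptic weights of neuron k; $b_k$ is the bias; $\varphi$ is the activation function; $Y_k$ is the neuron output signal. A neural network can approximate almost any function through the optimal selection of synaptic weights [14].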
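A single neuron of the kind shown in Fig. 3 can be sketched as a weighted sum of the inputs followed by an activation function. This sketch assumes a bias term and a ReLU activation; neither choice is specified in the text, and the input values are hypothetical.

```python
def relu(u):
    """Rectified linear unit: passes positive values, clips negatives to zero."""
    return max(0.0, u)


def neuron_output(inputs, weights, bias, activation=relu):
    """Weighted sum of the inputs plus the bias, passed through the activation."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum + bias)


# Hypothetical inputs and synaptic weights for one neuron.
y = neuron_output(inputs=[1.0, 2.0, 3.0], weights=[0.5, -1.0, 0.25], bias=1.0)
print(y)  # 0.25
```

Training the network amounts to adjusting `weights` and `bias` for every neuron so that the outputs match the desired values as closely as possible.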

Selecting and justifying the network architecture, as well as configuring network hyperparameters
Choosing and justifying the network architecture is the first and an important stage in the development of a neural network model. Depending on the type of information being processed (the input data), several types of neural networks are distinguished (fully connected or multilayer networks, recurrent networks, convolutional networks, generative adversarial networks, etc.). Since the problem being solved uses tabular data, we chose fully connected (multilayer) neural networks. Such networks are trained by supervised learning with error backpropagation [15][16][17][18][19][20][21][22][23].

Model 1
By the conditions of the problem, there are 20 stops on the route "Vokzal - Marom" and 20 stops on the route "Marom - Vokzal". Since the final stops are of no interest (there is no need to predict passenger traffic there), 38 stopping points remain (40 stops minus the 2 final ones). Because the dataset has the same features (elements) for all stopping points, it is enough to build (justify and configure) one model and duplicate it for the other stopping points, with additional training on the dataset of each stopping point.
For the problem being solved, the network input receives 11 signals: the features (attributes) X1, X2, X3, …, X11 (see Table 3). Therefore, the initial network was constructed with two layers: a transformation layer of 11 neurons and an output layer with one neuron. Since the output signal is the predicted number of passengers at bus stops (that is, we are solving a regression problem), we chose the mean squared error (MSE) as the loss function, given by the expression:
$$\mathrm{MSE} = \frac{1}{N} \sum_{k=1}^{N} \left( Y_k - \hat{Y}_k \right)^2,$$
where $\hat{Y}_k$ is the value of the target variable predicted by the model; $Y_k$ is the desired (reference) value of the target variable; $N$ is the number of samples [18][19][20][21][22][23]. The loss function is the objective function that is minimized during training, so it serves as a measure of success for the problem being solved.
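The initial two-layer network and its loss function can be sketched without any framework as follows. This is a minimal illustration, not the authors' implementation: the ReLU activation in the first layer, the linear output, the weight initialization range, and all function names are assumptions.

```python
import random


def relu(u):
    return max(0.0, u)


def init_layer(n_inputs, n_neurons, rng):
    """Small random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
               for _ in range(n_neurons)]
    biases = [0.0] * n_neurons
    return weights, biases


def dense(inputs, weights, biases, activation=None):
    """Forward pass of one fully connected layer."""
    outputs = []
    for weight_row, bias in zip(weights, biases):
        u = sum(w * x for w, x in zip(weight_row, inputs)) + bias
        outputs.append(activation(u) if activation else u)
    return outputs


def predict(x, hidden, output):
    """11 inputs -> 11 hidden neurons -> 1 linear output (regression)."""
    h = dense(x, *hidden, activation=relu)
    return dense(h, *output)[0]


def mse(y_true, y_pred):
    """Mean squared error between desired and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


rng = random.Random(0)
hidden = init_layer(11, 11, rng)   # transformation layer: 11 neurons
output = init_layer(11, 1, rng)    # output layer: 1 neuron

print(predict([0.0] * 11, hidden, output))
print(mse([10, 12, 8], [9, 12, 11]))
```

The output neuron is left linear because the target is a passenger count rather than a class label.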
Several optimizers were tried for the network (Adagrad, Adadelta, Adam, Adamax, Adaline, SGD, Nadam, RMSprop). Experimental calculations showed that the best network performance is achieved with the Adam optimizer, which implements gradient descent with momentum. An optimizer is the mechanism by which the network updates the weights of its neural connections based on the observed data and the loss function (Fig. 4) [12,13].
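To illustrate what the optimizer does at each step, the Adam update rule can be sketched as below. The hyperparameter defaults are the commonly used ones, not values reported in this work, and the one-dimensional quadratic function is a hypothetical stand-in for the network loss.

```python
import math


def adam_step(params, grads, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: momentum on the gradient plus per-parameter step scaling."""
    state["t"] += 1
    t = state["t"]
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        # Exponential moving averages of the gradient and of its square.
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        # Bias correction compensates for the zero initialization of m and v.
        m_hat = state["m"][i] / (1 - beta1 ** t)
        v_hat = state["v"][i] / (1 - beta2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
    return new_params


# Toy problem: minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w = [0.0]
state = {"t": 0, "m": [0.0], "v": [0.0]}
for _ in range(2000):
    gradient = [2 * (w[0] - 3.0)]
    w = adam_step(w, gradient, state, lr=0.05)
print(w[0])
```

The moving average `m` plays the role of momentum, while `v` rescales the step for each weight individually.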

Fig. 4. Block-diagram of the network learning process
To monitor the quality of the developed network during training and testing, we chose the mean absolute error (MAE), one of the convenient quality metrics for models in regression problems, given by the expression:
$$\mathrm{MAE} = \frac{1}{N} \sum_{k=1}^{N} \left| Y_k - \hat{Y}_k \right|,$$
where $\hat{Y}_k$ is the value of the target variable predicted by the model; $Y_k$ is the desired value of the target variable; $N$ is the number of samples [21]. It is often used in regression problems and shows the magnitude of the deviation of the predicted value from the true value.
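As a small worked example of this metric, the sketch below computes the MAE for a handful of hypothetical passenger counts (the numbers are illustrative and are not taken from the validator data).

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute deviation of the predictions from the desired values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


actual = [10, 12, 8, 15]      # hypothetical passenger counts at a stop
predicted = [9, 14, 8, 12]    # hypothetical model outputs

print(mean_absolute_error(actual, predicted))  # 1.5
```

Unlike the squared error, the MAE is expressed in the same units as the target, here passengers per bus arrival, which makes it easy to interpret during training.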
Next, a series of experimental calculations was carried out [11], from which the best hyperparameters of the network of our model were determined (the number of neurons in the first and second layers, the mini-batch size, and the number of training epochs). The optimal parameters and settings of the network are shown in Table 4, and the finished network architecture is shown in Fig. 5.
During the experimental computations, several regularization methods were used in the development of model 1 to avoid overfitting the network: early stopping of training, selection of training mini-batches (batch regularization), and dropout regularization [12].
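Of the regularization methods listed above, early stopping is the simplest to sketch. The version below assumes a patience-based rule (stop when the validation loss has not improved for a given number of epochs); the `patience` value, the function names, and the simulated loss curve are hypothetical.

```python
def train_with_early_stopping(train_epoch, validate, max_epochs=100, patience=5):
    """Run training epochs until the validation loss stops improving."""
    best_loss = float("inf")
    best_epoch = 0
    for epoch in range(1, max_epochs + 1):
        train_epoch()                 # one pass over the training mini-batches
        val_loss = validate()         # loss on held-out data
        if val_loss < best_loss:
            best_loss, best_epoch = val_loss, epoch
        elif epoch - best_epoch >= patience:
            break                     # no improvement for `patience` epochs
    return best_epoch, best_loss


# Simulated validation losses: improvement stops after the third epoch.
simulated_losses = iter([5.0, 4.0, 3.5, 3.6, 3.7, 3.8, 3.9, 4.0])
best_epoch, best_loss = train_with_early_stopping(
    train_epoch=lambda: None,
    validate=lambda: next(simulated_losses),
    patience=3,
)
print(best_epoch, best_loss)  # 3 3.5
```

A full implementation would also save the weights at the best epoch and restore them after stopping.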

Transfer learning model 1
As noted above, we copy the prepared model 1 for each stopping point and then carry out additional training (fine-tuning) of each copy on the data set of its stopping point.
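The copy-then-train procedure can be sketched as follows. To keep the example short, a linear model stands in for the trained model 1, and the stop identifiers, data, learning rate, and number of epochs are all hypothetical.

```python
import copy


def predict_linear(model, x):
    """Prediction of the stand-in linear model."""
    return sum(w * xi for w, xi in zip(model["w"], x)) + model["b"]


def fine_tune(model, samples, lr=0.01, epochs=50):
    """Continue training a copied model on one stop's data (SGD on squared error)."""
    for _ in range(epochs):
        for x, y in samples:
            error = predict_linear(model, x) - y
            model["w"] = [w - lr * 2 * error * xi for w, xi in zip(model["w"], x)]
            model["b"] -= lr * 2 * error
    return model


# Stand-in for the trained base model (model 1).
base_model = {"w": [0.5, -0.2], "b": 1.0}

# Hypothetical per-stop data sets: (features, passengers) pairs.
stop_data = {
    "stop_02": [([1.0, 0.0], 3.0), ([0.0, 1.0], 1.0)],
    "stop_03": [([1.0, 0.0], 6.0), ([0.0, 1.0], 2.0)],
}

# Each stop gets its own deep copy, so the base model stays untouched.
stop_models = {
    stop: fine_tune(copy.deepcopy(base_model), samples)
    for stop, samples in stop_data.items()
}
print(stop_models)
```

Starting every copy from the same trained weights means each stop only has to learn how it differs from the base model, which is useful when the per-stop data sets are small.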

Estimating the accuracy of the models
After training our models (38 in total), we tested each of them on test data from its stopping point. Although our data sets were not large, the models predicted passenger traffic (the number of passengers at stopping points) with an accuracy of 72% to 74.5% of the actual number of passengers at stopping points. This suggests that a larger amount of data is needed (covering at least several years) [19][20][21][22].
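The formula behind the accuracy percentage is not given in the text. One plausible reading of "accuracy as a percentage of the actual number of passengers" is one minus the total absolute error divided by the total actual count. The sketch below implements that reading only; the definition, the function name, and the sample numbers are assumptions and may differ from the calculation actually used.

```python
def percentage_accuracy(actual, predicted):
    """Assumed definition: 100 * (1 - sum|actual - predicted| / sum(actual))."""
    total_actual = sum(actual)
    if total_actual == 0:
        raise ValueError("accuracy is undefined when no passengers were recorded")
    total_abs_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    return 100.0 * (1.0 - total_abs_error / total_actual)


# Hypothetical test data for one stopping point.
actual = [10, 20, 30, 40]
predicted = [8, 25, 27, 50]

accuracy = percentage_accuracy(actual, predicted)
print(round(accuracy, 1))
```

Under this definition, a value of 72% would mean that the absolute prediction errors add up to 28% of the passengers actually counted.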

Recommendations
To improve the accuracy of the models for predicting passenger traffic at stopping points, it is necessary to:
- retrain the models on larger data sets;
- use the method of sequential iteration between alternating adjacent models in order to extract hidden interrelated features between stopping points;
- use recurrent layers in the models (RNN, LSTM).

Conclusions
In this paper, a neural network model for predicting passenger congestion at bus stops has been developed and studied. Its distinctive feature is that it takes into account the spatial and temporal characteristics of passenger transport. It was tested on real data from bus route 3 (Dushanbe, Tajikistan). For the first time, time-stamped validator reports (transactions of travel bank cards) were used as input data, which excludes unauthorized receipt of any information about passengers. In testing, the developed neural network model predicted passenger traffic with an accuracy of up to 74.5% of the actual number of passengers at bus stops.
The approach to optimizing the traffic management of urban public transport considered in this article makes it possible to automatically adapt the developed forecasting model to the changing conditions of the route line (for example, the launch of duplicate route lines, changes in the city's operating hours, etc.). This is its originality. The developed neural network model is universal and can be used for other route lines (bus stops). Reconfiguring and training the developed neural network does not require much time.