DEVELOPMENT OF ALGORITHMS FOR CHOOSING THE BEST TIME SERIES MODELS AND NEURAL NETWORKS TO PREDICT COVID-19 CASES

Time series analysis became one of the most actively investigated fields of knowledge during the spread of COVID-19 around the world. The problem of modeling and forecasting COVID-19 infection cases, deaths, recoveries and other parameters remains urgent. Purpose of the study. Our article is devoted to the investigation of classical statistical and neural network models that can be used for forecasting COVID-19 cases. Materials and methods. We discuss the neural network model NNAR and compare it with linear and nonlinear models (BATS, TBATS, Holt's linear trend, ARIMA, and the classical epidemiological SIR model). We discuss the Epidemic.Network algorithm, implemented in the R programming language. This algorithm takes a time series as input and chooses the best model from among the SIR model, the statistical models and the neural network model. The model selection criterion is the MAPE error. We consider the application of our algorithm to the analysis of time series of the COVID-19 spread in the Chelyabinsk region and to predicting the possible peak of the third wave under three possible scenarios. We note that the considered algorithm can work for any time series, not only epidemiological ones. Results. The developed algorithm helped to identify the pattern of COVID-19 infection in the Chelyabinsk region using the models implemented as parts of the algorithm. The considered models make it possible to form short-term forecasts with sufficient accuracy. We show that in some cases increasing the number of neurons increases accuracy, while in other cases the error is reduced by reducing the number of neurons; this depends on the pattern of the COVID-19 spread. Conclusion. To obtain a very accurate forecast, we recommend re-running the algorithm weekly. For medium-range forecasting, only the NNAR model can be used from among those considered, but it also gives good forecasts only with a horizon of 1–2 weeks.


Introduction
COVID-19 is one of the most serious problems facing the entire world today. In this article, we consider methods for predicting the spread of COVID-19 (cases of infection, death, and recovery) in the Chelyabinsk region, using time series analysis models and NNAR neural networks. On March 21, [...] countries considered. There are also studies that show that the ARIMA model and cubic smoothing spline models had lower forecast errors and narrower forecast intervals compared to Holt and TBATS models. The results obtained cannot be generalized to all countries affected by the COVID-19 pandemic due to different patterns of the spread of the virus.
As for the SIR model, even at the beginning of the pandemic, it was shown to be ineffective in predicting cases of coronavirus infection. For example, using this model, it was found that the peak of the second wave of infection cases in Pakistan should have occurred on August 25, 2020; in fact, the peak of infection in this country occurred in December 2020 [5]. The "covid19.analytics" package, developed in the R language, has the same drawbacks. This is evidenced by the results of the SIR model and the prediction of the time of occurrence of the second (and subsequent) wave cycles.
In Fig. 1 we show the scheme of the developed software module, which allows you to choose the best model with the available initial data. For the experiment, the Yandex dataset [6] on infections, deaths, and hospital discharges from March 12, 2020, to April 09, 2021, was used.

Fig. 1. Scheme of the Epidemic.Network algorithm for selecting the best model for predicting COVID-19

Let us consider the models used in this algorithm. All models are subdivided into three categories: (1) time series analysis models; (2) neural network models; (3) epidemiological models. One of the first papers [7] was devoted to the simulation of COVID-19 in the Isfahan province of Iran for the period from Feb 14th to April 11th, 2020. The authors of this paper forecasted the remaining infectious cases with three scenarios that differed in terms of the stringency level of social distancing. Despite the prediction of infectious cases in short-term intervals, the constructed SIR model was unable to forecast the actual spread and pattern of the epidemic in the long term. Remarkably, most of the published SIR models developed to predict COVID-19 for other communities suffered from the same shortcoming. The SIR models are based on assumptions that seem not to be true in the case of the COVID-19 epidemic. Hence, more sophisticated modeling strategies and detailed knowledge of the biomedical and epidemiological aspects of the disease are needed to forecast the pandemic.

Time Series Analysis Models

BATS and TBATS Models
The TBATS model is a state-space trigonometric exponential smoothing model with Box-Cox transform, ARMA errors, trend, and seasonal components. It is used to analyze univariate time series and was developed by De Livera et al. [8]. A scheme of the functioning of these models is shown in Fig. 2. The main difference between the TBATS model and the BATS model is the ability to forecast with variable seasonality.

Fig. 2. Scheme of the BATS and TBATS models
The main advantage of these models is the ability to use multiple seasonality. Nevertheless, in some cases, the use of these models is not advisable, since results of the same order of accuracy can be obtained by other methods that are less demanding on computational resources.
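One ingredient of the BATS and TBATS models named above, the Box-Cox transform, can be illustrated with a short Python sketch. This is not the authors' R implementation and it does not implement TBATS itself; the function names, the value of the transform parameter `lam` and the sample data are assumptions chosen only for illustration.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # The limiting case lambda = 0 is the natural logarithm.
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def inverse_box_cox(values, lam):
    """Map transformed values back to the original scale."""
    if lam == 0:
        return [math.exp(v) for v in values]
    return [(lam * v + 1.0) ** (1.0 / lam) for v in values]


# Made-up daily counts, used only to exercise the functions.
daily_cases = [120.0, 135.0, 160.0, 210.0, 260.0]
transformed = box_cox(daily_cases, lam=0.5)
restored = inverse_box_cox(transformed, lam=0.5)
```

In a full model the transform is applied before the smoothing recursions and inverted after forecasting, so that forecasts are reported on the original scale.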

Linear Holt model
Adaptive exponential smoothing models are a fairly popular tool for predicting the spread of coronavirus infection. These models also served as a general tool for making time-series projections corresponding to the development of the epidemic in different countries [2,9,10]. However, the main drawback of most of the studies presented is the lack of an explanation for the choice of the corresponding model specification, as well as the lack of an "explanation" for the choice of model hyperparameters [9]. We also note the article [2], which shows that the exponential smoothing model for the time series under consideration gives more accurate results than the ARIMA model. The Holt-Winters model does not really explain the nature of the epidemic in any way and focuses exclusively on the data itself. Thus, in this model, we can note the phenomenon of insignificant seven-day cyclicity, associated primarily not with the true development of the infectious process, but with the work schedule of individual medical services (testing laboratories, as well as administrative services) [9].
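The text does not give the smoothing equations, so the following Python sketch uses the standard recurrences of Holt's linear trend method: the level is updated as l_t = alpha * y_t + (1 - alpha) * (l_{t-1} + b_{t-1}), the trend as b_t = beta * (l_t - l_{t-1}) + (1 - beta) * b_{t-1}, and the h-step forecast is l_t + h * b_t. The initialisation, the values of `alpha` and `beta`, and the sample series are assumptions for illustration, not the settings used in the study.

```python
def holt_linear_forecast(series, alpha, beta, horizon):
    """Forecast `horizon` steps ahead with Holt's linear trend method."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    # Simple initialisation (an assumption): first value and first difference.
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        previous_level = level
        level = alpha * y + (1.0 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1.0 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]


# Made-up cumulative counts that grow by 10 per day.
cumulative_cases = [100.0, 110.0, 120.0, 130.0, 140.0]
forecast = holt_linear_forecast(cumulative_cases, alpha=0.8, beta=0.2, horizon=3)
```

On this exactly linear series the method simply continues the line, which is a convenient sanity check; on real data the smoothing parameters would normally be estimated by minimising the forecast error.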

ARIMA model
The ARIMA model consists of three components [11]: (1) AR (autoregressive term) refers to the past values used to predict the next value; it is determined by the parameter p of the autoregressive model; (2) MA (moving average term) determines the number of past forecast errors used to predict future values; it is determined by the parameter q, obtained from the ACF (autocorrelation function); (3) I (integrating term): if the series is not stationary, then its difference of order d is found, which is a stationary series. To check the stationarity of the series, the augmented Dickey-Fuller test and the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test are used. The same tests allow determining the parameter d of the model.
In [11], it is shown that the parameters of the model predicting the spread of COVID-19 are different for different regions of the Russian Federation (and states); in addition, the parameters of the model change over time. The paper considers the possibility of automatic selection of parameters of the ARIMA model for time series corresponding to the same process occurring in different conditions.
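The integrating step of ARIMA, taking the difference of order d, can be shown with a small Python sketch. It covers only differencing and its inversion for d = 1; the stationarity tests and the estimation of p and q are not shown. The function names and the sample series are assumptions for illustration.

```python
def difference(series, d=1):
    """Apply first differencing d times (the integrating step of ARIMA)."""
    result = list(series)
    for _ in range(d):
        result = [b - a for a, b in zip(result, result[1:])]
    return result


def undifference(last_value, differenced_forecast):
    """Rebuild levels from a forecast of first differences (d = 1)."""
    levels = []
    current = last_value
    for step in differenced_forecast:
        current += step
        levels.append(current)
    return levels


# A made-up series with a quadratic trend: it needs d = 2 to become constant.
series = [1, 4, 9, 16, 25]
first_difference = difference(series, d=1)
second_difference = difference(series, d=2)
next_levels = undifference(series[-1], [11, 13])
```

Each round of differencing shortens the series by one observation, which is why d is kept as small as the stationarity tests allow.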

Neural network autoregressive (NNAR) model
One of the prediction methods is an artificial neural network, which is based on simple mathematical models of the brain and makes it possible to establish complex nonlinear relationships between the response variable and its predictors.
In this work, we used a neural network autoregressive model with lagged inputs, which we will call the NNAR model. To predict cases of COVID-19 in the Chelyabinsk region, the NNAR(6,5) model was used, which is a neural network with the last six observations $(y_{t-1}, \ldots, y_{t-6})$ used as input to predict the output $y_t$ and with five neurons in the hidden layer (Fig. 3).

Fig. 3. NNAR model for predicting COVID-19 cases in Chelyabinsk
We distinguish two types of neural networks: simple networks and multilayer feed-forward networks. A simple neural network has no hidden layer and is equivalent to linear regression. The coefficients attached to the predictors are called weights, and the prediction is obtained as a linear combination of the inputs. The weights are chosen using a training algorithm that minimizes a "cost function" such as the MSE; for this type of neural network, linear regression is an efficient method for training the model.
The second type is the multilayer feed-forward network. In this type, each layer of nodes receives inputs from the previous layers: the outputs of the nodes in one layer are the inputs to the next layer. The inputs to each node are combined using a weighted linear combination, and the result is then modified by a nonlinear function before being output. For example, the inputs into hidden neuron j are combined linearly to give $z_j = b_j + \sum_{i} w_{i,j} x_i$. In the hidden layer, this is then modified using a nonlinear function such as the sigmoid, $s(z) = \frac{1}{1 + e^{-z}}$. The parameters $b_j$ and $w_{i,j}$ are "learned" from the data. The weights are often restricted to keep them from getting too large. The parameter that restricts the weights is known as the "decay parameter" and is often set to 0.1. First, the weights are initialized randomly and then updated using the observed data. Therefore, there is an element of randomness in the predictions made by the neural network, and the network is usually trained multiple times using different random starting points, with the results averaged.
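The computation performed by a network with one hidden layer, a weighted linear combination of the inputs followed by a sigmoid, can be sketched in Python as follows. This shows only the forward pass with given weights; training, weight decay and averaging over random starts are not shown. The function names, the layout of the weight lists and the linear output unit are assumptions for illustration.

```python
import math


def sigmoid(z):
    """Logistic sigmoid s(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass through a network with a single hidden layer.

    hidden_weights[j][i] is the weight from input i to hidden neuron j.
    The output unit is assumed to be linear.
    """
    hidden = []
    for weights_j, bias_j in zip(hidden_weights, hidden_biases):
        z_j = bias_j + sum(w * x for w, x in zip(weights_j, inputs))
        hidden.append(sigmoid(z_j))
    return output_bias + sum(w * h for w, h in zip(output_weights, hidden))


# With all hidden weights and biases at zero, every hidden neuron outputs 0.5.
prediction = forward(
    inputs=[1.0, 2.0],
    hidden_weights=[[0.0, 0.0], [0.0, 0.0]],
    hidden_biases=[0.0, 0.0],
    output_weights=[2.0, 4.0],
    output_bias=1.0,
)
```

For a model of the NNAR(6,5) kind, `inputs` would hold six lagged observations and the hidden layer would contain five neurons.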

With time series data, lagged values of the series are used as inputs to the neural network, in the same way that lagged values are used in a linear autoregressive model; we call this the neural network autoregressive (NNAR) model. In our implementation for predicting the third wave of COVID-19 in Chelyabinsk, we used the NNAR(6,5) model for the first scenario and the NNAR(6,10) model for the second scenario, that is, neural networks with the last six observations used as input to predict the output and with five and ten neurons in the hidden layer, respectively.
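How lagged values become network inputs can be sketched in Python. The first function builds the training pairs for a model with p lags; the second produces multi-step forecasts by feeding each prediction back in as a new lag. The iterative scheme is a common approach and is an assumption here, since the text does not state how forecasts beyond one step were produced; the extrapolating lambda is only a stand-in for a trained network.

```python
def make_lagged_pairs(series, p):
    """Return (inputs, target) pairs; inputs are (y[t-1], ..., y[t-p])."""
    pairs = []
    for t in range(p, len(series)):
        lags = [series[t - k] for k in range(1, p + 1)]
        pairs.append((lags, series[t]))
    return pairs


def recursive_forecast(series, p, horizon, predict_one_step):
    """Forecast `horizon` steps by feeding predictions back in as lags."""
    history = list(series)
    forecasts = []
    for _ in range(horizon):
        lags = [history[-k] for k in range(1, p + 1)]
        next_value = predict_one_step(lags)
        forecasts.append(next_value)
        history.append(next_value)
    return forecasts


# Made-up series 1, 2, ..., 10 and six lags, as in an NNAR(6,k) model.
series = [float(v) for v in range(1, 11)]
pairs = make_lagged_pairs(series, p=6)
# Stand-in predictor (hypothetical): linear extrapolation from the last two lags.
forecasts = recursive_forecast(series, 6, 3, lambda lags: 2 * lags[0] - lags[1])
```

Because each forecast depends on earlier forecasts, errors accumulate with the horizon, which is consistent with the recommendation to re-run the algorithm regularly.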

Epidemiological SIR Model
Epidemiological models such as SIR (susceptible, infected, recovered) and their many variants describe the density of infected people I using a typical equation [12]. At the beginning of the spread of infection, the number of infected and recovered people is much smaller than the number of susceptible ones, so we can approximate S by a constant. Using this approximation, we obtain a linear differential equation with constant coefficients, according to whose solution the number of infected persons grows exponentially at the beginning of the epidemic and then slows down as the number of people susceptible to infection decreases.
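The behaviour described above can be reproduced with a small Python simulation. The text does not write out the equation, so the sketch assumes the standard SIR system dS/dt = -beta * S * I / N, dI/dt = beta * S * I / N - gamma * I, dR/dt = gamma * I; when S is approximately N this reduces to dI/dt = (beta - gamma) * I, which gives exponential growth. The symbols `beta` and `gamma`, their values, the population size and the Euler step are all assumptions for illustration and are not fitted to any data.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, steps_per_day=10):
    """Integrate the SIR equations with the explicit Euler method."""
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    dt = 1.0 / steps_per_day
    trajectory = [(s, i, r)]
    for day in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        # Store one state per simulated day.
        trajectory.append((s, i, r))
    return trajectory


# Hypothetical parameters chosen only to produce a visible epidemic wave.
trajectory = simulate_sir(1_000_000, 10, beta=0.3, gamma=0.1, days=200)
infected = [state[1] for state in trajectory]
peak_day = infected.index(max(infected))
```

With constant `beta` and `gamma` the model produces a single wave, which illustrates why it struggles once containment measures make the coefficients vary over time.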
However, the classical SIR model does not provide high-quality forecasts [5,13,14] due to differences in the algorithms for choosing its parameters. In [5], work with an extension for the R language called covid19.analytics is described in detail. In [15], a model is used that provides a complete picture of the spread of COVID-19 anywhere in the world. The author of this package claims to do this by accessing and retrieving publicly available data published by two main sources. The package also provides basic analysis and visualization tools and functions for exploring these and other similarly structured datasets. The main disadvantage of this package at the moment is that it uses exclusively the classical SIR model for forecasting, which gives a very large error. In reality, effective (and not so effective) measures to contain the epidemic (quarantines, restriction of activities and movement, the use of masks, etc.) are developed and practiced everywhere, which changes the trajectory of the epidemic and, as a result, makes the coefficients of such a model variable. In the article [9], the authors re-estimate the coefficients of the model based on the newly obtained data, which is justified for obtaining short-term forecasts (up to 10 days) with high accuracy. The reason for the lack of accuracy of the model lies in the fact that one of its most important assumptions is the division of the population into three homogeneous groups; therefore, this model is not suitable for clearly heterogeneous societies. During the year of the pandemic, it became clear that these models give the best results for long-term forecasting (more than 7 days).

Software implementation of the considered algorithms
The considered models BATS, TBATS, the linear Holt model, ARIMA, SIR, and the neural network model NNAR were implemented in the R language. The results of computational experiments are given in [16], and the source code of the algorithm is in [17]. Here are some of the results obtained using the developed algorithm. [...] obtained using the considered models. Hence, it can be seen that the epidemiological and neural network models give an error that is 1–2 orders of magnitude higher than that of the time series analysis models.

The first scenario | NNAR (6,5) | June 21, 2021 | Unknown
The third scenario | NNAR (19,15) | July 18, 2021 | Unknown
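The selection step of the algorithm, choosing the model with the lowest MAPE, can be sketched in Python as follows. This is not the authors' R code: the hold-out split, the function names and the two toy forecasters (which stand in for BATS, TBATS, Holt, ARIMA, NNAR and SIR) are assumptions for illustration.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def select_best_model(train, test, candidates):
    """Return the name of the candidate with the lowest MAPE on `test`.

    `candidates` maps a model name to a function forecast(train, horizon).
    """
    errors = {
        name: mape(test, forecast(train, len(test)))
        for name, forecast in candidates.items()
    }
    best = min(errors, key=errors.get)
    return best, errors


def naive_forecast(train, horizon):
    """Toy model: repeat the last observed value."""
    return [train[-1]] * horizon


def drift_forecast(train, horizon):
    """Toy model: extend the average slope of the training data."""
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return [train[-1] + slope * h for h in range(1, horizon + 1)]


# Made-up series; the last two points are held out for evaluation.
series = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0, 160.0]
train, test = series[:5], series[5:]
best_name, errors = select_best_model(
    train, test, {"naive": naive_forecast, "drift": drift_forecast}
)
```

Because MAPE divides by the actual values, series that contain zeros (for example, days without reported cases) need a different error measure or prior aggregation.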

Fig. 4. Forecast of peaks of infection waves in the Chelyabinsk region
Despite its lower forecasting accuracy compared with the time series models, the NNAR model can be successfully used to construct not only short-term but also medium-term and long-term forecasts. Consider the results of using the NNAR model to predict infection peaks (Table 2, Fig. 4).