Experience of Multi-Layer Neural Network Application for Empirical Regularities Identification

2019, vol. 19, no. 2, pp. 160–165

Introduction

In 1962 I graduated from the Faculty of Mathematics and Mechanics and was fortunate to obtain a position at the Sverdlovsk Department of the Institute of Mathematics of the USSR Academy of Sciences. My teachers S.N. Chernikov and I.I. Eremin gave me the topic “Linear inequalities and linear programming”. At the same time Yu.P. Vasiliev, who was looking for his own direction in algebra and in translation problems, began working in the same department. He asked me to write an essay on translation theory for Zbl fuer Math & ihre Grenzgebiete, and the task was fulfilled. After that I did not touch mathematical linguistics until 2010, when I came across the articles of the prominent linguist I.A. Melchuk. Besides linear inequalities and linear programming, I became acquainted with factor analysis, and the idea arose to study the naming of factors. I worked on this topic with the linguist E.Yu. Polyakova, and we published several articles on it in English. They attracted the interest of O.V. Loginovsky, who published some of them in the journal “Bulletin of South Ural State University”, for which we are very grateful. This article covers, besides factor names, model identification.


By expanding the set of features that characterize the performance of enterprises, a more accurate prediction is obtained; the same effect is achieved by lengthening the interval over which the data on enterprises are recorded [5][6].
The problem of comparing enterprises can be understood either as ranking the enterprises or as choosing the most preferable object from a certain set [7]. Practice has shown that methods which rely on a priori factor weights and search for the object with the maximum weighted sum of factors lead to biased results. The difficulty is that the weights are themselves quantities that must be determined. Moreover, sets of weights are local: each of them is suitable only for a specific problem and a given group of enterprises.
Consider the problem of choosing the best company in detail. Suppose there is a certain set of objects $M$ whose activities are aimed at achieving a certain goal. The operation of each object is characterized by the values of $n$ features, that is, there exists a representation $M \to \mathbb{R}^n$. Our starting point is therefore the state vector of the economic object, $x = [x_1, \dots, x_n]$. The quality of operation of the economic object is described by the indicators $f_0(x), f_1(x), \dots, f_m(x)$. These indicators should stay within certain limits, and some of them we aspire to minimize or maximize. Such a general formulation can be contradictory, so an apparatus is needed for resolving the contradictions and bringing the formulation of the problem to a correct form consistent with its economic meaning. We order the objects by some criterion function, but the criterion is usually poorly defined, blurred, and possibly contradictory.
We consider the problem of modeling empirical regularities from a small number of experimental and observational data. The mathematical model can be a regression equation, a diagnostic rule, or a prediction rule. With a small sample, recognition methods are more effective. In this case the influence of factor control is taken into account by varying the values of the factors when they are substituted into the equation of the regularity or into the decision rule for diagnostics and prediction. In addition, we apply the selection of essential features and the generation of useful attributes (secondary parameters). This mathematical apparatus is necessary for predicting and diagnosing the states of economic objects.
In terms of the theory of committee constructions, we consider the neural network as a collective of neurons (individuals). The network is a mechanism for coordinating the work of the neurons in collective solutions, that is, a way of harmonizing individual opinions so that the collective opinion is the right reaction to the input, namely the desired empirical relation.
We therefore consider the application of committee constructions to problems of choice and diagnostics. The idea is to search for a collective of decision rules instead of a single decision rule; this collective produces a collective decision through a procedure that processes the individual decisions of its members. Models of choice and diagnostics, as a rule, lead to inconsistent systems of inequalities, for which generalizations of the concept of solution must be sought instead of solutions. A collective decision is such a generalization.
For example, a committee of a system of inequalities is a set of elements such that each inequality is satisfied by most of the elements of this set. Committee constructions are a certain class of generalizations of the concept of solution for problems that can be either consistent or inconsistent. They form a class of discrete approximations for contradictory problems and can also be correlated with fuzzy solutions. The method of committees now defines one of the directions in the analysis and solution of problems of efficient choice of variants, optimization, diagnostics and classification. As an example, we give the definition of one of the basic committee constructions: for $0 < p < 1$, a $p$-committee of a system of inclusions is a set of elements such that each inclusion is satisfied by more than the $p$-th part of this set.
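The definition above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the system consists of strict linear inequalities $a_j \cdot x > b_j$; the function name `is_p_committee` and the sample system are hypothetical and are not taken from the article.

```python
import numpy as np

def is_p_committee(members, A, b, p=0.5):
    """Check whether `members` is a p-committee of the system A x > b.

    Every inequality a_j . x > b_j must be satisfied by more than
    the fraction p of the members.
    """
    members = np.asarray(members, dtype=float)  # shape (q, n)
    A = np.asarray(A, dtype=float)              # shape (m, n)
    b = np.asarray(b, dtype=float)              # shape (m,)
    satisfied = members @ A.T > b               # (q, m) table of votes
    votes = satisfied.sum(axis=0)               # supporters of each inequality
    return bool(np.all(votes > p * len(members)))

# Illustrative inconsistent system in the plane:
#   x1 > 0,  x2 > 0,  -x1 - x2 > 0
# (adding the three inequalities gives 0 > 0, so no single solution exists).
A = [[1, 0], [0, 1], [-1, -1]]
b = [0, 0, 0]

# Each member satisfies two of the three inequalities,
# and each inequality is supported by two of the three members.
committee = [[1, 1], [2, -3], [-3, 2]]

print(is_p_committee(committee, A, b))          # majority committee (p = 1/2)
print(is_p_committee(committee[:1], A, b))      # a single member is not enough
```

The system has no solution, yet the three vectors together act as a generalized solution: every inequality is supported by two members out of three.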
Committee constructions can also be considered as a certain class of generalizations of the concept of solution for inconsistent systems of equations, inequalities and inclusions, and as a means of parallelization in solving problems of choice, diagnostics and prediction. As a generalization of the concept of solution, a committee construction is a set of elements that have some (but, as a rule, not all) properties of a solution; it is a kind of fuzzy solution.
Committee constructions act as a means of parallelization directly in multi-layer neural networks. Namely, we have shown that, to train a neural network to solve the classification problem exactly, one can apply the method of constructing a committee of a certain system of affine inequalities.
From what has been said, it can be concluded that the committee method is connected with an important direction of research and numerical solution, covering both problems of diagnostics and choice of variants and problems of tuning neural networks to give the required response to input information on a particular problem of the decision-maker.
The use of the method of committees has revealed features that are important for applied problems: heuristic character, interpretability, and flexibility, that is, the possibility of further training and reconfiguration and the possibility of using the most natural class of functions, the piecewise affine ones. When setting a problem of classification, diagnostics or prediction, only correctness is required, that is, the same object must not be assigned to different classes.
The other side of committee constructions is related to the concept of coalitions in the development of collective solutions. Here the situations differ sharply. In the case of collective preferences there are many "pitfalls", whereas in the case of collective classification rules the procedures can be rigorously justified and offer more opportunities. It is therefore important to be able to reduce decision-making and prediction problems to classification problems.
In the diagnostics and prediction of enterprises, factor weights given by experts are often used, and the experts' opinions are then put to a vote. Such procedures may be incorrect, however, and there exists an apparatus for constructing correct procedures. Let us explain this. Consider the problem of diagnosing objects by collectives of experts, using coalitions in collective preference problems. Let $X$ be the set of variants from which we need to choose, by some criteria, a certain variant $x$. Let a set $C$ of experts or decision-makers be engaged in this choice. When the choice is made on the basis of preferences, each member $f$ of the set $C$ is in fact a binary preference relation $r(f)$. This means that for some $x, y$ from $X$ the statement $x \, r(f) \, y$ can hold, that is, for $f$ the variant $x$ is preferable to $y$. The collective preference $r = r(C)$ can be considered as a certain function of the individual preferences: $r = \varphi(r(f) : f \in C)$. At first glance this assumption seems natural, but it is the source of further contradictions. It turns out that the collective preference cannot be given by a universal rule: it depends on the specific variants $x$, $y$ and on the preferences $r(f)$. In other words, the rule $\varphi$ cannot be universal; it must be local.
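A well-known instance of such a contradiction is the Condorcet paradox, in which pairwise majority voting over transitive individual rankings yields a cyclic collective preference. The following Python sketch reproduces it; the three rankings and the helper `majority_prefers` are illustrative assumptions, not data or notation from the article.

```python
def majority_prefers(rankings, x, y):
    """True if more than half of the experts rank variant x above variant y."""
    wins = sum(r.index(x) < r.index(y) for r in rankings)
    return wins > len(rankings) / 2

# Three hypothetical experts, each with a transitive ranking (best first).
rankings = [
    ["a", "b", "c"],
    ["b", "c", "a"],
    ["c", "a", "b"],
]

# The majority relation is cyclic: a over b, b over c, and c over a.
for x, y in [("a", "b"), ("b", "c"), ("c", "a")]:
    print(x, "preferred to", y, ":", majority_prefers(rankings, x, y))
# each of the three lines prints True
```

Every individual preference is consistent, yet the collective preference obtained by the majority rule has no best variant, which is why a single universal aggregation rule cannot be relied upon.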
We have shown that, when the problem is reduced to a classification problem, it is possible to form teams of experts (committees) that solve the diagnostics problem correctly. For the training of two-layer neural networks, the committee method has made it possible to obtain exact results and justified training procedures. These procedures solve a wide class of problems reducible to the separation of finite sets, the only requirement being that the sets do not intersect.
One important area is related to voting procedures for assessing the state of objects. In the sphere of voting the situation is extremely difficult, and paradoxes arise at every step. We have shown that contradictions can be avoided when the solution of the choice problem is reduced to a series of classification problems, in which case the committee method gives good results. The three-layer neural network corresponds to the method of committees, and from the theorems on the existence of committees it follows that such a network can be trained on precedents to solve any problem whose solution can be expressed by a word in some finite alphabet.
Here are the arguments in favor of reducing decision-making to a series of classification problems. The procedure of collective solutions, which is close to multi-criteria optimization, is the most important one in problems of choosing variants. The problem of making concerted decisions by a collective of people or a collective of decision rules arises constantly in prediction problems. However, it has turned out that the most effective voting procedure cannot be proposed a priori. It always depends on the specific situation and, with a competent approach, turns into a process of reconciling the interests of the parties, a process that requires great care in order not to fall into one of the many formal traps. This is important for diagnostics by collectives of experts. In fact, this is a game of several players in which the winner is the one who calculates well and exploits the slightest mistakes of the partners. The study of the problem of reconciling the individual opinions of experts and decision-makers has moved to a qualitatively new mathematical level.
The solution of almost any problem can be represented by the scheme: problem $Z$ → parametrizer $S$ → $x = [x_1, \dots, x_n]$ → solver → $\arg Z = f(x)$. The solver is a computer of one type or another. Instead of talking about an algorithm for solving a problem $Z$ from the class $\mathcal{Z}$, we will talk about an algorithm that, by means of a program, reconstructs from a sequence (code) $x$ from $X$ the sequence (code) $y = \arg Z$, with $y$ from $Y$.
Strictly speaking, this range of questions is connected with the idea of splitting a complex problem into a network of simple ones. This idea is implemented in different branches of mathematics under different names: the modular principle in software packages (N.N. Yanenko), the splitting principle in mathematical physics (G.I. Marchuk), the decomposition method in optimization, the finite element method in computational physics, etc. The question is whether the solution of a large complex problem can be synthesized from the solutions of its subproblems.
Let us therefore solve the following specific problem. From the observation data (an object/feature table), it is required to reveal regularities of the form $y = f(x)$, where $y$ is the target indicator and $x$ is a vector of input features (factors). Based on this information, the parameters of the enterprise's activity are to be predicted. The relation is to be obtained in neural network form.
The problem of feature control is connected with this approach. It is divided into several stages: feature selection, feature transformation (construction of a rectifying space), evaluation of individual features and their aggregates, and evaluation of the effect of feature variation on the result of classification.
Tuning the neural network to simulate the relation $y = f(x)$ is reduced to discriminant analysis. Thus, to simulate empirical regularities, we consider the problem of discriminant analysis, that is, the problem of constructing a function $f$ from a function class $F$ that separates the precedent sets $A$ and $B$. We denote this problem by $DA(A, B, F)$: find $f$ from $F$ such that $f(x) > 0$ for $x$ from $A$ and $f(x) < 0$ for $x$ from $B$.
The separating committee is a set $C = [f_1, \dots, f_q]$ such that each inequality in our problem is satisfied by more than half of the elements of $C$.
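A small Python sketch may help to see how a separating committee works when no single affine function separates the sets. It assumes the XOR configuration for the sets $A$ and $B$ and three hand-picked affine functions; the names `W`, `c` and `committee_decision` are illustrative and do not come from the article.

```python
import numpy as np

# Precedent sets that no single affine function can separate (XOR layout).
A_set = [(0, 0), (1, 1)]
B_set = [(0, 1), (1, 0)]

# Hand-picked committee of three affine functions f_i(x) = w_i . x + c_i.
W = np.array([[1.0, 1.0],
              [-1.0, -1.0],
              [1.0, -1.0]])
c = np.array([-1.5, 0.5, 0.5])

def committee_decision(x):
    """Return +1 (class A) or -1 (class B) by majority vote of the committee."""
    votes = np.sign(W @ np.asarray(x, dtype=float) + c)  # one vote per member
    return int(np.sign(votes.sum()))                     # majority of the votes

print([committee_decision(x) for x in A_set])  # [1, 1]
print([committee_decision(x) for x in B_set])  # [-1, -1]
```

Read as a network, the rows of `W` with the offsets `c` play the role of a layer of threshold neurons, and the final sign of the summed votes plays the role of the output neuron that aggregates them.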
These problems are solved on the basis of accumulated observations of the dynamics of the indicators. Image recognition and regression analysis are used to find the empirical relations between the indicators. On this basis, the assessment of features and their systems and the selection of useful features are considered. Namely, let $f = \arg DA(A, B, F)$, that is, $f$ is a separating function for the sets $A$ and $B$. If there is an object state vector $x = [x_1, \dots, x_n]$ that we want to transfer to the class $A$, then we solve the problem of feature control: find $y = [y_1, \dots, y_n]$ such that $f(x + y) > 0$. In a more general model, $u$ is a control operator acting on the state vector $x$ so that $x$ is transferred to the appropriate class. This is related to the evaluation of factors: the value of the factor (input indicator) $x_i$ is the elasticity of the criterion (objective) functions $f_1, \dots, f_p$ with respect to $x_i$, and $\mathrm{val}(x)$ is the matrix composed of the vectors $\operatorname{grad} f_j(x)$.
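For the simplest case the feature control problem can be sketched in a few lines of Python. The sketch assumes that the separating function is affine, $f(x) = w \cdot x + c$, and that the shift $y$ should have the smallest Euclidean norm; both assumptions, as well as the name `minimal_shift` and the `margin` parameter, are introduced here for illustration only.

```python
import numpy as np

def minimal_shift(w, c, x, margin=1e-3):
    """Smallest Euclidean shift y such that f(x + y) >= margin > 0,
    where f(x) = w . x + c is an affine separating function."""
    w = np.asarray(w, dtype=float)
    x = np.asarray(x, dtype=float)
    value = w @ x + c
    if value >= margin:
        return np.zeros_like(x)             # x already lies in class A
    # Move along grad f = w, the direction of fastest growth of f.
    return (margin - value) / (w @ w) * w

# Hypothetical separating function and object state.
w, c = np.array([1.0, 2.0]), -4.0
x = np.array([1.0, 1.0])                    # f(x) = -1, so x is on the B side

y = minimal_shift(w, c, x, margin=0.5)
print(y, w @ (x + y) + c)                   # shift and the new value of f
```

For an affine function the gradient is the constant vector `w`, so the components of `w` also indicate how strongly each factor influences the criterion, in the spirit of the matrix of gradients mentioned above.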
To detail the substantive scheme for simulating the work of an industrial enterprise, we must take into account that constructing an adequate mathematical model of economic indicators requires developing a substantive concept of the economic and production process at the industrial facility and formalizing it. The result is the selection of input and output indicators.

Conclusion
1. In many (but not all) situations, the neural network expert system has proven to be more efficient than the Ivakhnenko system [5], but the latter deserves attention as a supplement to the committee method.
2. In combination with regression analysis, the committee neural network shows greater reliability.