Cybernetics and Systems Analysis, Vol. 35, No. 5, 1999

APPLICATION OF GENETIC ALGORITHMS TO OPTIMIZATION OF NEURONET RECOGNITION DEVICES¹

T. N. Baidyk and É. M. Kussul                                                UDC 007.001.362

The possibilities of applying genetic algorithms to optimization of the structure of neural networks that solve problems of recognition of handwritten and printed symbols and words are considered. The results of an experimental study are given. The experiments performed demonstrate an increase in the efficiency of neural networks after optimization. Ways of improving the results obtained are discussed.
Keywords: neural networks, pattern-recognition algorithms, optimization of pattern features, recognition of handwritten words.
In solving problems of pattern recognition, researchers encounter the problem of selecting the features used by recognition algorithms. The number of features should not be large, since it affects the recognition speed. The selection of informative features is a multialternative problem and is usually attacked with exhaustive-search algorithms. In 1992, we began our experiments on optimization of a collection of features with the help of a genetic algorithm [1], which makes it possible to automate feature selection and to search for the global optimum in the course of optimization. Since the genetic algorithm is very time-consuming, a fast neurocomputer was used [2].

Let us consider the definition and solution of a special problem of recognition of handwritten words. We regard the problem of recognition of separate symbols and words as the first stage of recognition of handwritten texts. We assume that it is expedient to divide the process of solving the problem of recognition of handwritten texts into at least the following two basic stages: (1) recognition of an abridged dictionary of handwritten words; (2) recognition of arbitrary handwritten texts. Both problems can be solved for one or many writers. The first problem is much easier since, if a word is taken in its entirety, many more informative features can be selected in it than in the case where the individual letters that appear in the structure of the word are recognized. Therefore, from our viewpoint, better results can be obtained at the first stage, i.e., recognition of words from an abridged dictionary. This problem can be of interest in its own right in creating devices for reading questionnaires, addresses on envelopes, etc. In recognizing arbitrary words or word forms that have not been used for special training, more complicated algorithms supported by knowledge bases and hierarchical recognition structures are required to obtain high reliability of recognition.
Let us consider in more detail the development of a system of recognition of words from an abridged dictionary. The algorithms of recognition of handwritten symbols given below are oriented toward the use of the B-512 neurocomputer developed together with the Japanese WACOM Corporation. The B-512 neurocomputer makes it possible to execute digit-by-digit logical operations over words consisting of 512 binary digits, to shift 512-digit words, and also to execute a number of operations that decrease the execution time of models of neural networks. D. A. Rachkovskii developed the base software for the neurocomputer.

¹These results were partially obtained due to grant U4M000 of the Soros International Scientific Fund and also due to the "Neurocomputer" project (1992-1994) of the State Committee of NAS of Ukraine on Science and Engineering. International Research and Training Center of Information Technologies and Systems, National Academy of Sciences of Ukraine, Kiev, Ukraine. Translated from Kibernetika i Sistemnyi Analiz, No. 5, pp. 23-32, September-October, 1999. Original article submitted March 23, 1998.

© 1999 Kluwer Academic/Plenum Publishers
FEATURES USED FOR RECOGNITION OF HANDWRITTEN WORDS
By the features of an image we mean the distinctive elements of the image that form a part of a letter. The features can be segments of straight lines with various slope angles, arcs that have various radii and are rotated through various angles, intersections of segments, etc. The presence or absence of each feature is determined at a given point of the image. For the kth feature, the quantity y_k(i, j) = 1 is introduced if the feature is present at a certain point of the image and y_k(i, j) = 0 otherwise.

To recognize handwritten words, 10 features were used, each of which was a segment of a straight line of certain length and orientation. The angle contained by the segments that represent each feature was equal to 18 degrees. To select these features, the lines of handwritten symbols were thinned down to 1 pixel and then thickened up to 3 pixels. Thus, the lines of each symbol image had the specified standard width. The presence of a feature was determined by the presence of a segment of a given length and orientation in the line of the symbols of a word processed as described above. In addition to these features, arcs of various radii and orientations were also used. Each individually written word was read by a scanner with a resolution of 300-400 points per inch.

A brief description of the algorithm for recognition of handwritten words from an abridged dictionary is as follows. After scanning the words, the image obtained is processed as follows: (1) the image lines are thinned; (2) the image lines are thickened; (3) the informative features are selected; (4) the features selected are coded; (5) the codes obtained are entered into the neural network for recognition or training. The algorithms of cellular logic developed by the authors are used to realize these operations.

Let us consider a window of an image of size 3 × 3 pixels. The designations of the pixels for the position of the window at a point (i, j) are given below:

    x_{i-1,j-1}   x_{i-1,j}   x_{i-1,j+1}
    x_{i,j-1}     x_{i,j}     x_{i,j+1}
    x_{i+1,j-1}   x_{i+1,j}   x_{i+1,j+1}
Let us describe the algorithms of cellular logic.

1. Algorithm of Thinning Lines. The algorithm is realized according to the following formulas:

    x'_{ij} = (x_{ij} & (x_{i,j-1} ∪ ¬x_{i,j+1})) ∪ (x_{ij} & x_{i-1,j-1} & ¬x_{i-1,j}) ∪ (x_{ij} & x_{i+1,j-1} & ¬x_{i+1,j}),

    x'_{ij} = (x_{ij} & (x_{i,j+1} ∪ ¬x_{i,j-1})) ∪ (x_{ij} & x_{i-1,j+1} & ¬x_{i-1,j}) ∪ (x_{ij} & x_{i+1,j+1} & ¬x_{i+1,j}),

    x'_{ij} = (x_{ij} & (x_{i-1,j} ∪ ¬x_{i+1,j})) ∪ (x_{ij} & x_{i-1,j-1} & ¬x_{i,j-1}) ∪ (x_{ij} & x_{i-1,j+1} & ¬x_{i,j+1}),

    x'_{ij} = (x_{ij} & (x_{i+1,j} ∪ ¬x_{i-1,j})) ∪ (x_{ij} & x_{i+1,j-1} & ¬x_{i,j-1}) ∪ (x_{ij} & x_{i+1,j+1} & ¬x_{i,j+1}),

where x_{ij} = 1 if the pixel with coordinates (i, j) belongs to a line and x_{ij} = 0 if it does not.

The first formula from this group makes it possible to thin a line from the left, i.e., if the situation

    0 1 1
    0 1 1
    0 1 1

takes place, then the transformation of the formula will result in the situation

    0 1 1
    0 0 1
    0 1 1
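As a sketch only, the first (left-thinning) formula can be rendered with whole-array Boolean operations in NumPy, which imitate the word-parallel processing of the B-512. The `shift` helper is introduced here for illustration and is not part of the original software.

```python
import numpy as np

def shift(x, di, dj):
    """result[i, j] = x[i + di, j + dj], with zeros outside the image."""
    p = np.pad(x, 1)
    return p[1 + di : 1 + di + x.shape[0], 1 + dj : 1 + dj + x.shape[1]]

def thin_left(x):
    """One parallel application of the left-thinning formula:
    a pixel survives if it is not the leftmost pixel of a run wider than 1,
    or if removing it would break connectivity with the rows above or below."""
    return ((x & (shift(x, 0, -1) | ~shift(x, 0, 1)))
            | (x & shift(x, -1, -1) & ~shift(x, -1, 0))
            | (x & shift(x, 1, -1) & ~shift(x, 1, 0)))
```

Applied to a horizontal run of ones, one pass removes a single pixel from the left, while a run of width 1 is left intact.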
The second formula thins a line from the right, and the third and fourth formulas thin it from above and from below, respectively. All the operations in the algorithm described are executed in parallel over a line consisting of 512 symbols. The entire image is processed by a sequential search over such lines. The operations in the formulas are executed sequentially. The algorithm comes to an end if the number of ones does not decrease after execution of the operations over the entire image.

With a view toward accelerating the operation of the algorithm, a simplified variant of thinning lines was developed and tested. Only three pixels are analyzed in this simplified variant. Consider the algorithms of thinning from the left, from the right, from above, and from below; the complete thinning algorithm consists of all four of them. In the case of thinning from the left, where an image includes the pixel arrangement 0 1 1, the one that corresponds to the leftmost pixel of the line can be removed:

    x'_{ij} = x_{ij} & ¬(x_{i,j+1} & x_{ij} & ¬x_{i,j-1}).

In the situation where the pixels are vertically arranged as 0 over 1 over 1, lines can be thinned from above. To this end, the following formula is used:

    x'_{ij} = x_{ij} & ¬(¬x_{i-1,j} & x_{ij} & x_{i+1,j}).

Similarly, the formulas for thinning from the right (the situation 1 1 0) and from below (the vertical situation 1 over 1 over 0) are as follows:

    x'_{ij} = x_{ij} & ¬(x_{i,j-1} & x_{ij} & ¬x_{i,j+1}),

    x'_{ij} = x_{ij} & ¬(¬x_{i+1,j} & x_{ij} & x_{i-1,j}).
The advantage of the simplified algorithm lies in its high operating speed. A drawback is the fact that this algorithm does not guarantee the continuity of a line, i.e., discontinuities of a line can appear as a result of its operation. The efficiency of the procedures proposed increases with the length of the computer word whose digits are processed in parallel. The speed of execution of such an algorithm on a neurocomputer with 512-digit words is greater by severalfold than the speed of operation of sequential skeletonization algorithms.

2. Algorithm of Thickening Lines. Lines are thickened by the formula

    x'_{ij} = x_{ij} ∪ x_{i-1,j} ∪ x_{i+1,j} ∪ x_{i,j-1} ∪ x_{i,j+1}.
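A minimal sketch of the simplified left-thinning pass and of the thickening formula in the same array-parallel style; as before, `shift` is an illustrative helper, not part of the B-512 software.

```python
import numpy as np

def shift(x, di, dj):
    # result[i, j] = x[i + di, j + dj], with zeros outside the image
    p = np.pad(x, 1)
    return p[1 + di : 1 + di + x.shape[0], 1 + dj : 1 + dj + x.shape[1]]

def thin_left_simple(x):
    # x'_{ij} = x_{ij} & not(x_{i,j+1} & x_{ij} & not x_{i,j-1})
    return x & ~(shift(x, 0, 1) & x & ~shift(x, 0, -1))

def thicken(x):
    # x'_{ij} = x_{ij} | x_{i-1,j} | x_{i+1,j} | x_{i,j-1} | x_{i,j+1}
    return x | shift(x, -1, 0) | shift(x, 1, 0) | shift(x, 0, -1) | shift(x, 0, 1)
```

One thickening pass dilates every line pixel into a 4-neighborhood cross; as the text notes, the loop is applied one or two times.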
The loop of thickening is executed one or two times. The final choice of the number of complete loops is based on the experimental check of the entire recognition algorithm. The operations of thickening a line are executed simultaneously over a 512-digit line.

3. Selection of Informative Features. A feature is specified by a table of the coordinates of the points that appear in the feature; the origin of coordinates is assumed to be at the point (i, j) of the image relative to which the presence or absence of the feature is determined. It is convenient to use the parallel digit-by-digit operations of the B-512 neurocomputer to select the features of an image. These digit-by-digit computations are carried out by the formula

    y = x_{r-i_1, s-j_1} & x_{r-i_2, s-j_2} & ... & x_{r-i_m, s-j_m},
where i_p and j_q are the numbers taken from the table corresponding to the kth feature and x is the binary value of brightness. If y = 1, then the feature is present at the point (r, s); if y = 0, then it is absent. This formula reflects the computations that are carried out for one point. Bit-by-bit operations in the logical modules are simultaneously executed over an entire line consisting of 512 pixels.

The selection of informative features can be carried out with the help of various algorithms. The simplest algorithm is the selection of the features that occur most frequently in a training sample. Some more advanced algorithms compute the Shannon entropy of a feature. However, we believe that the most effective methods of selecting informative features are optimization procedures in which the space of parameters is assumed to be the binary space of presence or absence of a feature and the goal function includes the probability of correct recognition, the total number of features selected,
and the complexity of computing them. A very high operating speed of a recognition device is required to execute the optimization programs; therefore, it had been impossible to use such programs until recently. After the development of such neurocomputers as the B-512, we were able to perform our experiments, since the B-512 neurocomputer has sufficient efficiency to realize optimization algorithms. In what follows, we describe algorithms of evolutionary optimization and their realization with the help of the neurocomputer.
OPTIMIZATION OF A COLLECTION OF FEATURES

To conduct the optimization experiments, 10 handwritten English words, roman letters, and handwritten decimal digits were chosen. The obtained probability of recognition of the handwritten words was 99-100% (for a known handwriting). The probability of recognition of the roman letters and digits was somewhat smaller (98%) (the experiments were conducted by D. A. Rachkovskii and are published with his kind permission). For an unknown handwriting, the probability of recognition of handwritten words was approximately 80%; after optimization of the collection of features, it was about 84%.

Let us consider the results of optimization of the collection of features in more detail. The objective of the experimental investigations on the determination of an optimum collection of features is to provide a higher probability of correct recognition of handwritten symbols. The ten features used for recognition of handwritten symbols at the first stage of the investigations were chosen intuitively. To search for more effective collections, an extended set of features was formed intuitively and the following problem was stated: choose a subset from this set that optimizes some function of the quality of recognition. This problem was solved by simulation of the evolution of a biological species. The method of simulation of evolution, or the genetic algorithm, as it is now called, has been developing since the sixties and is now widely used by many researchers [3-7]. In the algorithm used in this article to simulate evolution, a specific variant of a system for recognizing handwritten symbols is an analog of a living organism ("individual"). In all the variants, the general structure of the neural network remained invariable, and only the combination of features at the network input was varied.
Thus, a subset of features was assigned to each "individual," the "individual" was trained to recognize on the basis of this subset, and then the function of the quality of recognition was computed. The process of generating the "descendants" of "individuals" was simulated taking account of "mutations," as was the process of "natural selection."

The problem was formulated as optimization of the collection of features for recognition of ten handwritten words written by different people. The initial set consisted of 41 features; the features were segments of straight lines and arcs of various radii and various orientations. Each of the features was present in the words recognized, but the combinations of features that would make it possible to obtain better recognition results were not known a priori. The task of the genetic algorithm that simulated biological evolution was to search for such combinations.

A 41-bit binary vector E = (e_1, e_2, ..., e_41), in which one feature corresponds to each bit, will be called an "individual." If e_i = 1, then the ith feature is included in the collection on which recognition is based; if e_i = 0, then the ith feature is not included in the collection. A mutation m_i is defined as a change in the value of the ith component of the vector E (substituting unity for zero or vice versa). We have used a "unisex" evolutionary algorithm, in which all the "descendants" were born from one "parent" rather than from a pair of them. Random mutations took place during simulation. The probability of a mutation of each feature p(m_i) was dependent only on the goal function of the parent, i.e.,

    p(m_i) = p(Q),

where the goal function Q was defined as the fourth power of the number of recognition errors, namely, Q = cN^4, where N is the number of errors and c is a constant. The number of errors was counted during recognition of different variants of ten handwritten words. To calculate the goal function for some "individual," the neural network was trained beforehand to recognize a word with the use of the set of features of this "individual."

The recognition of handwritten words written by different people was chosen as a test problem. To write the words (one, two, ..., ten), special forms were prepared. Each person produced several ways of writing a word. As a result, we had 12 ways of handwriting each of the ten words for each person. The following two samples were formed from this set: Q1 was used as a training sample (80 specimens) and Q2 was used as a test sample (40 specimens). The test sample was not used for training; it was used to recognize words and determine the percentage of correctly recognized words. The words were entered into the computer by a scanner with a resolution of 300-400 binary dots per inch.
To simulate natural selection, an algorithm was used in which the probability of generating a "descendant" was proportional to the quantity 1/Q, which permitted the "individual" with the minimum goal function to generate more "descendants" than an "individual" with a large goal function. The "individuals" of the initial generation were generated by a random-number generator. The associative-projective network chosen for testing the efficacy of the evolutionary algorithm consisted of 4096 neurons. To test the algorithm of evolutionary optimization, D. A. Rachkovskii provided a model of a neural network.

The selection of a subset of features S1, as was already noted above, corresponds to designing a device that recognizes a handwritten word using only the features that appear in the subset S1. In the software implementation of the evolutionary algorithm, the stage of constructing an initial collection of such recognition devices was realized with the help of a special procedure that used a random-number generator. Thus, a "genotype" of the zero generation of "individuals" was formed. The probability of setting each "genotype" digit was chosen equal to 1/2, i.e., about 20 features out of 41 were present on average in the genotype vector. Then, the algorithm was constructed as a model of the alternation of "generations" of such recognition devices. In our experiments, each generation consisted of 10 or 32 individuals.

At the first stage, the numbers of the features are determined by the random-number generator. Based on these feature numbers, we can form the input vectors for each handwritten word and use them to train the associative-projective neural network. After training the network to recognize all the words from the sample Q1, the network is examined with the help of the sample Q2, and the number of incorrectly recognized words is computed. Then, the values of the goal function Q are computed.
In the experiments, the goal function was of the form Q = (c_1·X + c_2·Z)^c, where c_1 and c_2 are some constants, X is the number of features, and Z is the number of errors in recognizing the test sample. In the series of experiments performed, c_1 was assumed to be equal to 0, so the goal function was a function of the number of errors only, i.e., the objective of evolutionary optimization was the selection of "individuals" for which the number of errors was decreased. The exponent in the goal function was chosen equal to c = 4.

After computing the goal functions Q for all the "individuals" of the zero "generation" G(0), a new "generation" G(1) is formed. To form a "generation," a stochastic procedure is used, with the help of which the "individuals" of the generation G(0) generate "descendants," i.e., the "individuals" of the "generation" G(1). The probability of generation of new "individuals" is ver1: the less the value of Q, the greater the probability that a given "individual" generates "descendants." The production of "individuals" of a new "generation" is realized by the formula ver1 = c_3 × (1/Q), where c_3 is some constant. The probability of mutations is vmut1: the greater the value of the goal function of an "individual," the greater the probability of its mutations. The new "generations" are also trained and examined with the help of the test sample. This cycle, including the production of new "generations," is repeated until a definite value of the goal function is reached or a predetermined number of generation cycles is carried out.

In the performance of the first experiment, 11 generations were produced. Each generation consisted of 32 "individuals." The average values of the errors and the average values of the goal functions were computed for each generation. In Fig. 1, curve 2 represents the dependence of the average values of the goal functions on the number of generations produced.
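The evolutionary scheme described above can be sketched as follows. Everything in this sketch is a toy model: `demo_errors` is a hypothetical stand-in for the actual train-and-test cycle on the neurocomputer, and the concrete scaling of the mutation probability is an assumption, since the article specifies only that p(m_i) = p(Q).

```python
import random

def goal(errors, c=1.0, power=4):
    """Goal function Q = c * N**4 used in the experiments."""
    return c * errors ** power

def evolve(count_errors, n_bits=41, pop_size=10, n_generations=11, seed=0):
    rng = random.Random(seed)
    # zero generation: each of the 41 features is included with probability 1/2
    pop = [[rng.random() < 0.5 for _ in range(n_bits)] for _ in range(pop_size)]
    best, best_q = None, float("inf")
    for _ in range(n_generations):
        q = [goal(count_errors(ind)) for ind in pop]
        for ind, qi in zip(pop, q):
            if qi < best_q:
                best, best_q = list(ind), qi
        # "natural selection": reproduction probability proportional to 1/Q
        weights = [1.0 / max(qi, 1e-9) for qi in q]
        children = []
        for _ in range(pop_size):
            i = rng.choices(range(pop_size), weights=weights)[0]
            # "unisex" reproduction: one parent, per-bit mutation probability
            # growing with the parent's Q (the scaling here is an assumption)
            p_mut = min(0.5, q[i] / (2.0 * n_bits ** 4))
            children.append([b ^ (rng.random() < p_mut) for b in pop[i]])
        pop = children
    return best

def demo_errors(individual):
    # hypothetical stand-in for training and testing the neural network
    target = [i % 3 != 0 for i in range(len(individual))]
    return sum(a != b for a, b in zip(individual, target))
```

A run such as `evolve(demo_errors)` returns the best feature mask seen over all generations.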
The collection of features that corresponded to the best "individuals" was used for the subsequent experimentation. In the performance of the second experiment, 11 "generations" were produced; each generation consisted of 10 "individuals." In Fig. 1, curve 1 represents the dependence of the average values of the goal functions on the number of "generations" produced.

The experiments were performed on the above-mentioned high-speed neurocomputer developed by Ukrainian and Japanese scientists and obtained from the WACOM Corporation. The complete cycle of training and recognition required several minutes. The results obtained show that the evolutionary model makes it possible to select an optimum collection of features. However, the efficiency of the neurocomputer was insufficient for a detailed investigation of the properties of the algorithm of evolutionary optimization. To this end, the efficiency of the neurocomputer must be increased so that the training and recognition cycle takes approximately 1 sec (or less). At present, there exist technical possibilities for the realization of such a neurocomputer [8]. In recent years, we have developed neural classifiers whose performance is dozens of times greater than that of our previous neural networks [9, 10]. This has allowed us to apply genetic algorithms not only to the optimization of input feature collections but also to the optimization of the internal structures of neural classifiers; at the same time, the genetic algorithm can also be programmed on an ordinary PC.

Let us consider the results of application of the genetic algorithm to the optimization of the structure of a neural
Fig. 1. Average values of the goal functions vs. the number of generations produced (curve 1: generations of 10 "individuals"; curve 2: generations of 32 "individuals").

Fig. 2. Example of the letter A.

Fig. 3. Example of connections between the first and second layers.

Fig. 4. Structure of the perceptron.
classifier with random thresholds [9, 10]. This neural classifier consists of the following two parts: (1) a neural structure that cannot be trained (we call it "inherent"); the neuron thresholds of the "inherent" structure are generated by a random-number generator and do not vary in the course of operation of the neural classifier; (2) an ordinary perceptron in which the connections between neurons are modified in the course of training of the classifier. The structure of the neural classifier with random thresholds is described in [9]. It is relevant to remark that the purpose of the "inherent" classifier structure is a transformation of the space of features that makes it possible to linearly separate different classes; this problem is successfully solved by a three-layered perceptron.

Let us now consider in more detail the application of the genetic algorithm to the optimization of the classifier structure. A three-layered perceptron developed for recognition of printed symbols was used to solve this problem. The neurons of the retina corresponded to the first layer, and the neurons designating the alphabet symbols [10] corresponded to the third one. The connections between the second and third perceptron layers were modified (the synaptic weights of these connections were varied in the course of training). The second perceptron layer consisted of 256 or 512 neurons. The connections between the first and second layers (the "inherent" structure) were not modified. The genetic algorithm was used to optimize this untrainable structure.

A distinctive feature of constructing the training and examination perceptron samples lay in the selection of symbols that are frequently confused with one another (for example, I and l, b and h). Both the training and test samples consisted of 120 symbols for some experiments and 315 symbols for others. The work was organized in such a manner that the check of the
Fig. 5. Average values of the goal function of the parents for various generations (curves 1-4 correspond to the four cases described in the text).
number of recognition errors was carried out after each training cycle by recognizing the test sample. The goal function was evaluated as the sum of the numbers of errors in recognizing the last five tests after 10 or 15 training cycles.

Each symbol was placed in a window of size 30 × 32 pixels. In Fig. 2, an example of the letter A is presented. Each window represented a part of the retina, on which one neuron having an excitatory connection with a neuron of the second layer and four neurons having inhibitory connections with a neuron of the second layer were randomly chosen. An example of such connections between the first and second layers is depicted in Fig. 3. A neuron of the second layer fires only in the case where a signal from the symbol image is present at its excitatory input and the signals at all of its inhibitory inputs are absent. Each neuron of the second layer has its own distribution of dots over the image from which it receives inhibitory and excitatory signals. The totality of such connections between the first and second layers determines the structure of the perceptron (Fig. 4).

The structure of the perceptron was optimized by the genetic algorithm, which operated as follows. Some perceptrons with various structures chosen at random were generated. We called these perceptrons the zero generation. Each of them was trained, and the quality of its operation was estimated on the basis of the tests used in the last five training cycles. Some of the best perceptrons, which are called parents in this article, were selected from the zero generation. Each parent generated some descendants. The procedure of generating descendants was as follows. A parent structure in which MN neurons of the second layer were replaced by new neurons that had other connections with the retina neurons was used as a basis; the neurons and connections were selected at random. Such a replacement was called a mutation.
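A sketch of the untrainable ("inherent") connection structure and of its mutation. The representation of a second-layer neuron as a pair (excitatory point, list of four inhibitory points) and all names below are illustrative assumptions, not the original software.

```python
import random

def neuron_fires(on_pixels, exc, inh):
    """A second-layer neuron fires iff its excitatory retina point is on
    and all of its inhibitory retina points are off."""
    return exc in on_pixels and not any(p in on_pixels for p in inh)

def random_connections(rng, h=30, w=32):
    # one excitatory and four inhibitory points chosen at random in the window
    pts = rng.sample([(i, j) for i in range(h) for j in range(w)], 5)
    return pts[0], pts[1:]

def make_descendant(parent, mn, rng):
    """Mutation: replace MN randomly chosen second-layer neurons by neurons
    with fresh random connections to the retina."""
    child = list(parent)
    for k in rng.sample(range(len(child)), mn):
        child[k] = random_connections(rng)
    return child
```

A structure with 256 second-layer neurons is then `[random_connections(rng) for _ in range(256)]`, and a descendant with MN = 25 mutations is `make_descendant(structure, 25, rng)`.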
In the first case, the number of parents was PN = 4 and the number of mutations was MN = 25. Each parent had five descendants (the total size of a generation was equal to GSZ = PN × 5 = 20), and the parents perished after reproduction. The average values of the goal function of the parents for various generations are presented in Fig. 5 (curve 1). In the second case, the number of parents was equal to six. The best parent had six descendants, the second parent had five descendants, etc.; the worst parent had one descendant. The parents survived after reproduction and could compete with their descendants in subsequent generations. In this case, the number of mutations was the same as in the first case (MN = 25). The average values of the goal function are depicted in Fig. 5 (curve 2). In the third case, the number of mutations varied: the best parent had seven mutations in generating each descendant, the second parent had 14 mutations, etc.; the worst parent had 42 mutations (Fig. 5, curve 3). In the fourth case, the number of parents was increased to nine, and the total number of descendants reached 45 (Fig. 5, curve 4).

In all cases, the progress and the decrease in the number of errors are obvious. As a result of optimization, the number of errors on the test sample was reduced by a factor of about 1.5. As additional experiments revealed, the same decrease in the number of recognition errors could be obtained by an approximately fourfold increase in the number of neurons in the intermediate layer. In that case, however, the recognition time and required memory were also increased fourfold. Hence, the genetic algorithm of optimization makes it possible to
substantially reduce the size of the neural network and the recognition time while maintaining the same quality of recognition.

The results of optimization of the neural classifier structure show that the quality of its operation can be improved; however, the classifier can also be used when a lack of time or other reasons does not allow one to optimize it. Its operation without optimization has been investigated for problems of optical recognition of printed symbols. The experiments were performed on two types of printed symbols [10]. The first type consisted of symbols printed on a matrix printer with medium printing quality, and the symbols of the second type were taken from high-quality printed magazines. The printer symbols were recognized without errors; the recognition speed was 40 symbols per second. In the second case, the quality of recognition reached 98-100%, and the recognition speed was 300 symbols per second. The recognition speed was estimated on an IBM PC 386 (40 MHz). The obtained operating speed of the classifier in recognizing printed symbols exceeds the operating speed of well-known optical character recognition (OCR) systems, for example, that of the AUTOR system [11], in which the processing of one text page (about 1500 symbols) requires 2-2.5 min. In well-known foreign OCR systems, the percentage of correctly recognized symbols is equal to 99.5%, and the speed of data input is from 0.5 min to 2 min per page. For systems with additional hardware support, this time can be reduced to 10 sec per page [12].

We have described two examples of the application of genetic algorithms to the optimization of neuronet recognition devices. In the first example, the collection of features used for recognition of handwritten symbols was optimized; in the second, the untrainable part of the neural network structure was optimized. In both examples, a significant increase in the percentage of correctly recognized symbols was noted. The effect was more obvious in the second example.
The number of recognition errors was reduced by a factor of 1.5. In the future, the evaluation of the efficiency of genetic algorithms in optimizing neural interpolators that are widely used to solve prediction problems will be of interest.
REFERENCES
1. T. N. Baidyk, "Evolutionary optimization of a collection of features," in: Neural Networks and Neurocomputers, V. M. Glushkov Cybernetics Inst. of NAS of Ukraine, Kiev (1993), pp. 26-33.
2. N. M. Amosov, T. N. Baidyk, A. D. Gol'tsev, et al., in: N. M. Amosov (ed.), Neurocomputers and Intellectual Robots [in Russian], Naukova Dumka, Kiev (1991).
3. A. H. Clopf and E. E. Gose, "An evolutionary pattern recognition network," IEEE Trans. Syst. Sci. Cybern., 5, No. 3, 247-250 (1969).
4. A. N. Mucciardy and E. E. Gose, "Evolutionary pattern recognition in incomplete nonlinear multithreshold networks," IEEE Trans. Electron. Comput., EC-15, No. 2, 257-261 (1966).
5. E. M. Kussul and A. N. Luk, "Evolution as a process of search for an optimum," Sov. Sci. Rev., Sci. Developments in the USSR, 3, No. 3, 168-172 (1972).
6. Keon-Myung Lee, Kyung-Me Lee, and Takeshi Yamakava, "Genetic algorithm for the traveling salesman problem with precedence constraints," Proc. 5th Europ. Congr. on Intel. Techn. and Soft Comput., 1, 809-813 (1997).
7. E. Kochergov, N. Strechen, and T. Kalganova, "Using a genetic algorithm for optimizing the functional decomposition of multiple-valued functions," Proc. 5th Europ. Congr. on Intel. Techn. and Soft Comput., 1, 826-830 (1997).
8. É. M. Kussul, D. A. Rachkovskij, and T. N. Baidyk, "Associative-projective neural networks: architecture, implementation, applications," Proc. Fourth Intern. Conf. on Neural Networks and Their Appl., Nimes, France, Nov. 4-8, EC2 Publ. (1991), pp. 463-474.
9. E. M. Kussul, T. N. Baidyk, V. V. Lukovitch, and D. A. Rachkovskij, "Adaptive high performance classifier based on random threshold neurons," in: R. Trappl (ed.), Cybernetics and Systems '94, World Scientific, Singapore (1994), pp. 1687-1695.
10. E. M. Kussul and T. N. Baidyk, "Neural random threshold classifier in OCR applications," in: Proc. of the Second All-Ukrainian Int. Conf. UkrOBRAZ'94, V. M. Glushkov Cybernetics Inst. of NAS of Ukraine, Kiev, Dec. 20-24 (1994), pp. 154-157.
11. "Text input from a sheet of paper without a keyboard," Komp'. Press, No. 2, 38 (1993).
12. I. Zenkin, A. Kucherov, B. Mazo, and A. Petrov, "What is a system of automatic text reading? Or once again about OCR," Komp'. Press, No. 3, 23-28 (1993).