Wuhan Universtty Journal o f Natural Sciences
Vol. 6 No. 3 2001,659~664
Article ID. 1007-1202(2001)03-0659-06
Parallel Evolutionary Modeling for Nonlinear Ordinary Differential Equations Kang Zhuo, Liu Pu, Kang Li-shan Computation Center, State Key Lab. of Software Engineering, Wuhan University,Wuhan 430072, China
Abstract:We introduce a new parallel evolutionary algorithm in modeling dynamic systems by nonlinear higher-order ordinary differential equations (NHODEs). The NHODEs models are much more universal than the traditional linear models. In order to accelerate the modeling process, we propose and realize a parallel evolutionary algorithm using distributed CORBA object on the heterogeneous networking. Some numerical experiments show that the new algorithm is feasible and efficient.
Key words 9parallel evolutionary algorithm ; higher-order ordinary differential equation; CORBA C L C n u m b e r - TP 301.6
0
Introduction
Mathematical model, consisting of a group of equations which open up the factors and their relationship of a given object or system, moreover, depict the law which makes a dynamic system progresses, is the scientific abstraction or reflection of the essence of the real world. There are a lot of complex dynamic phenomenon such as price fluctuations, weather changes, population increases existing in engineering projects, management, natural science and social science. So, to a dynamic system, how to erect a proper mathematical model by which we can analysis and predict its development is an important research realm. In order to model the dynamic systems, many people have already introduced some models such as AR (n), MA (rn), ARMA (n, m), ARMA (n, m ) c13. But all of them are linear difference equation models that have limits to applications. On the other hand, the formal approach of modeling is a stepwise refinement and time-consuming progress. People determine the model structure at first by some experiences, then estimate the parameters of the model, at last evaluate the model by experiments, and if the results are not satis-
fled, they must try the steps above again and again. Computer technique, certainly, can be used in this approach. And this approach is much like an evolutionary process. So why not use Evolutionary Algorithm to set up a NHODE (Nonlinear High Order Differential Equations) model, a more universal model, to a dynamic system? Then how to accelerate the modeling process? All these are the main points of this paper. In section 1, an evolutionary algorithm in modeling NHODE models is introduced. A parallel evolutionary algorithm based on CORBA to accelerate the modeling process is proposed in section 2. Some numerical experiments have been done in section 3. Finally in section 4 there are some conclusions presented.
1 Evolutionary Algorithm on NHODEs 1.1
Problem Depiction
Assuming that, to a one-dimension dynamic system x ( t ) , there is a series of observed data X during the past m + 1 time click t, = to + i ~ At, i = 0,1,2,'",m, X = (X(to),X(tl),X(tz),'",X(tm)) T (1)
Received date: 2000-10-15 Foundation item: Supported by the National Natural Science Foundation of China(No. 70071042 and No. 60073043) Biography: Kang Zhuo(1970-), male, Lecturer,research interest: network computing and evolutionary computation.
Wuhan University Journal of Natural Sciences
660
Where to is the starting time of sampling and At is the interval of sampling. The problem of modeling NHODEs for the dynamic system is to find a high order differential equation model as follows x (") = f ( t , x , x '
Vol. 6 X 14)
ff
, x " , x ("-1))
X(to) = Xo
x' (to) = Xo
(2)
Fig. 1
l
x ('-1)(to) = Xo ('-1)
Which satisfies II x - x - II -- min, V f E F (3) In (3), X" is the value vector x" (t) in the sampiing points, while x" (t) is the solution to the problem (2), x" = (x" (to), x" (tt), x" (tz),'"
, x" (t.,)) T
(4) tl x - x" II --
(x(t,)
-
-
x" (ti)) z
(5)
t~ O
F, called the model space of a dynamic system, is the set of compound functions combined by elementary functions with function complexity no less than D , which defined as the tree depth when a compound function is expressed by a binary tree. 1.2 Evolutionary Algorithm on NHODE EA (Evolutionary Algorithm) is a kind of intelligent computation model, characterized by its imitating the evolution process of nature, especially biology, to resolve the complex problems via computers. By using chromosome to present the problem then executing genetic operations on the chromosomes, EA finally gets the results after evolution of a set of chromosomes according to the Darwin's Natural Selection Rule. GP (Genetic Programming) is a kind of EA based on tree structure [z]. In GP, the solution space of a problem is presented by tree in which leaves are the variables in the problem and non-leaf nodes are the operation of variables. The set of variables is called terminal set and the set of operation is called function set. Any differential equation can be presented as a parsing binary tree, for instance, a fourth order equation, x (4) = x (3) q- x" - 8 - x' + t - e~ , can be described as a binary tree in Fig. 1. In Fig. 1, Yl = x, y2 ~-- x ' , Y3 :--- X n , " ' , Y . = /(---1)
9
Binary tree expression of an equation
When given a tree expression and the relative initial values of the equation, we can calculate X" according to x" (t) , the solution to problem (2), and II x - x " II So when using II x - x " II as the fitness value of the tree model, we get the evolutionary algorithm in modeling NHODEs as follows: PROCEDURE : Initialize population P(0) ,including n trees generated randomly P(0) = {I1(0), I z ( O ) , I 3 ( 0 ) , "", In(0)}; t~0 ; Calculate the fitness of each individual in P (t) ; While (not terminated) do Begin Pc(t) = crossover{P(t) } ; Pro(t) = mutation{Pc(t) } ; Calculate the fitness of each individual in Pm(t) ; P ( t -t- 1) = select{P,.(t)} ; t~-t "Jr"1
end
The best individual can be used as the model of this problem. The method above is introduced in detail in Ref. [3-]. But in practical uses, there are some shortcomings in this sequential algorithm. Firstly, calculating the fitness is time-consuming, for there is an integral computation in this part of calculation. And that the time complexity of integral computation is in direct proportion to the quantity of observed data confines the scale of the practical problems to which this method is applied. Secondly, in Ref. [3-] a Genetic Algorithm is combined in this method to optimize the parameters of the model, which is promising to accelerate the convergent speed, but on the other hand, unfortunately, is time-consuming too. What's more, when a dynam-
No. 3
Kang Zhuo et al: Parallel Evolutionary Modeling for".
ic system needs to be modeled dynamically in realtime, this method is much far from reaching the goal. So parallelization is necessary to speed up the modeling process. In section 2, we propose a parallel algorithm based on CORBA.
2
A New Parallel Algorithm
With the development of OOP (Object Oriented Programming), CORBA begins to play an important role in parallel and distributed computation. CORBA defines a standard method to build software components based on distributed objects, so different applications on the network, especially heterogeneous network such as Internet, can share the objects produced by CORBA. Evolutionary algorithm, consisting of a large number of individuals that can be treated as distributed objects, is reasonably parallelized in Internet or Intranet characterized by cooperation among hosts in a large scale on the network. The proposed parallel algorithm is a kind of master-slave algorithm. In order to set up the communication of master and slaves, a CORBA object EC serves as the distributed object shared by master and slaves. Via interface function provided by CORBA, master and slave can access object EC as if it were in local. The master-slave structure is displayed in Fig. 2. In order to explain the parallel algorithm clearly, there are three parts introduced one by one as follows.
Master Object EC
Interface Function
Slave t Z
Slave N
Fig. 2
Structure of algorithm
661
2. 1 Master and Slave Procedure Firstly, we define a Tree, consisting all the nodes, is organized by two parts, Structure and Parameter. (~ Tree = Structure + Parameter Where Parameter is the set of leaf nodes, each of them is assigned a constant value,Structure is the set of rest nodes. Parameter = { constant-valued leaf node} Structure = Tree {constant-valued leaf node } According to the division of Tree, the building process of Tree consists of two steps that are Structure-making and Parameter-optimizing. For Parameter optimizing is a time-consuming process, each slave does this process for a given Tree with a determined Structure, while master is in charge of initialization and displaying the collected results. MASTER PROCEDURE 9 Initialize EC by creating population P(0) and Pm(O) , each including n Tree;, EC. P ( 0 ) = {Tree1(0), Treez(0), Trees(0), "", Treen (0) } ; EC. Pro(0) =EC. P(0) ; EC. TreeStatus[-i]=prepared, l<~i<~n; t~---0 while(not terminated) do begin Fori" = l t o n d o if EC. TreeStatus[i]= done then begin EC. Pm. Tree, =Mutate(EC. P(t). Tree,); {Change the Tree structure} EC. TreeStatus[i] " =prepared; end t~--0 end SLAVE PROCEDURE while(not terminated) do begin Fori' = ltondo begin if EC. TreeStatus I-i] = Prepared then begin optimizeParams (EC. P,, (t). Tree,); {OptimizeParams: calculate the best Tree parameter}
662
Wuhan University Journal o f Natural Sciences
if EC. Pro(t). Tree,. Fitness>EC. P ( t ) . Tree/. fitness, then EC. P ( t ) . Tree,=EC. Pro(t). Tree/; EC. TreeStatus [-i] : = done; end; end end Remarks : 9 Shared distributed object EC consists of two tree populations P and Pro. The Mutated individuals in P are stored in Pro. The array Tree Status indicates the calculation status of each individual in P . There are two status : Prepare, Done. The relationship between Master and Slave is the bi-directional consumer-producer, that is, Slave uses the mutated trees produced by Master while Master utilizes the trees optimized by Slave. Array TreeStatus is used to control the synchronization of Master and Slave. 9 With the introduction of Parameter-optimizing, untimely convergence can be avoided in the process of evolution because some trees with better structures, if not with optimized parameters, could be kicked out earlier from the population than other trees with worse structure but with better parameters. 9 Slave does two time-consuming processes, parameter optimizing and tree fitness evaluating, in parallel. So high speedup is promising. 9 This algorithm has good scalability for every Slave can take part in or quit the modeling process at any time without changing the results. 2.2 Non-linear Parameter Optimization The efficiency of algorithm OptimizeParams is essential to the whole parallel algorithm. But for a non-linear model, its parameter optimization is a hard work too. Some traditional methods such as local search, Gauss-Newton, Hooke-Jeeves, and Nelder-Mead usually need strict constrain to models such as models must be continuous, can be differentiated or are unimodal Ref. [4-]. Because many non-linear models generated in this parallel modeling algorithm that need to he optimized are often complex, with different number of parameter, the traditional methods are not feasible. Hereby, we use another evolutionary algorithm again to solve this hard problem. Recently a
Vol. 6
powerful algorithm is proposed in Ref. [4] for solving parameter optimization problems. We adopt this algorithm to optimize parameter and achieved satisfied results. 2.3 Synchronization of Shared Object CORBA adopts object-oriented multi-layer client/server model. Client access the CORBA object through generating a service request to ORB (Object Request Broker) while Server is ready to response to the request from ORB. When a request is arrived, BOA (Basic Object Adapter) in the server side manages to call the interface functions of CORBA object's instances and then return the results. In order to response to server requests from different clients on the network without delay, BOA creates an independent thread to access the shared object for each request. When multiple concurrent threads access a shared object, it brings the synchronization problem. For example, When one thread is writing data to the shared object, another thread that is reading the data in the same time may get the wrong data. To solve the synchronization problem is to promise that the shared data that are being modified would not be read until the modification is finished. According to the discussion in Ref. [-51, we add codes in the interface function of CORBA object to resolve the synchronization problem successfully. 3
Numerical Experiment
In order to verify the efficiency of the new parallel algorithm, we have done some numerical experiments. The experiment environment settings are as follows : S Intranet includes six PII-266 PCs running Windows 98 with 10M Ethernet networking. 9 CORBA system is the OSAgent provided by Inprise Corp. 9 Modeling settings are as follows: Function Set = { + , - , * , / , ^ , s i n , c o s , exp,ln} Terminal set = {yl,..., Y , , t , c } , where n is the order of the differential equation. In Master procedure, the population size is 50.
No. 3
Kang Zhuo et al: Parallel Evolutionary Modeling f o r ' "
In Slave procedure, the population size is 20, and the dimension of subspace 7.
Experiment 1 For predicting the results of a multi-dimensional function optimization problem called Bump Problem as its dimension getting larger and larger, we have solved the Bump problem series: n
n
iE os, maxf.(x) =
2Yicos
,=1
df -d~-
t
1' 984 757 - 8' 494 748 lnlfl f f(2) = O. 364 98
tff
-~t + 9. 029 666 f - - (7.222 158 + t) = 0
f(2) = O. 364 98 l d-~ f = ~.-,25.890 778 -- 21.818 601)in]f] f i f ( 2 ) = O. 364 98
z.s,,
l-ix,
0. 7s
t=l
n = 2,3,4,'",50 and get the best solutions listing as time-series fz, f 3 , f 4 , " " ,fso shown in Fig. 3,
E ,-i
t,
Fig. 3
= I
10
==========================
(9)
The last one (9) is the best we found. It not only fits the data very well, but also predict very good behaviour of maxf,(x) as the dimension n --CxD
That is maxf,(x) = 0. 849 730 n~oo
Experiment 2
I
I
I
I
18 26 34 42 Dimension of BUMP problem
I
50
Best results of maxfn(x) for n = 2 , 3 , ' " ,50
Taking the data as the observed data for model building and the following nonlinear second order ordinary differential equation is discovered by computers automatically .. df -
(8)
tl
i=1
0.90 0.85 0.80 0.75 0.70 0.65 0.60 0.55 0.50 0.45 0.40 0.35 0.30
(7)
r3.f
0~
n
(6)
[~- 15.645756df(t df)-~ q---~
,=1
~f(2) = O. 364 98 subjectto
663
f2 - f 15. 675 7 36(~-~)
i f ( 2 ) = 0. 364 98 /dfl 0. 515 785 5 -- 0. 364 98 [ ~ ,,=z = ~-~ = 15.080 55 Where At = 0.01 is the step-size in the numerical integral method which corresponds to the increment of dimension n. Other models found by computers as follows are nonlinear first or second order ODEs:
The data is introduced in Ref. [17, which is the record of temperature change in a chemical experiment. In Ref. [ 3 ] , the sequential algorithm needs about 4000 seconds to get a satisfied model. We use the first 220 observed data to set up a HODE model with which to predict the next 6 data and to verify the correctness of the model by comparing the observed data with the predicted data. By the algorithm, a third-order equation model in Table 1 with the minimum fitting and predicting standard deviation is selected through ten times experiments. And the speedup is also calculated. In Fig. 4, the abscissa axis indicates the time t, the ordinate axis indicates the function z , the line indicates the observed data and the series of points indicate the data produced by model. The following Fig. 5 is the speedup of parallel algorithm using different number of hosts. The abscissa axis indicates the host number and the ordinate axis indicates the speedup. The experiments above show that a nearly linear speedup ratio can be achieved by using the new parallel algorithm. And a satisfied model can be gotten in desirable time. The scalability of the parallel algorithm also provides much convenience.
Wuhan University Journal of Natural Sciences
664
Table 1
Vol. 6
Result Model x ~' = 0. 111 061 * ( ( x " * x ' ) -- x")
Model Fitting standard deviation
0.1195
Predict standard deviation
0.6083
Observed next 6 data
20.2
19.7
19.3
19.1
19.0
18.8
Predicted next 6 data
20.3266
19.9351
19.6181
19.3728
19.1973
19.0901
29
4
,~ 27
25 ,.o
o 23
.~ 19
15
[
I
I
I
I
50
100
150
200
226
Conclusion
In this paper, a new parallel algorithm we propose to model NHODEs shows a great promising ability to speed up the modeling process. Because of its high performance, in the next step, we plan to apply this algorithm to the KDD realm, that is, discovering knowledge, which is represented by HODE, from large database.
t/s Modeling data and observed data in experiment 2
Fig. 4
5.4 .2 ~
4.5
4
e~
3.2 ~
2
I
I
2
3
Fig. 5
I
4 numberof hosts
I
I
5
6
Speedup ratio of experiment 2
References : Et] Gan Ren-chu. The Statistical Analysis of Dynamic Data. Beijing:USTB Press, 1991(Ch). [2] Koza J R. Genetic Programming: On the Programming of Computers by Means of Natural Selection. Cambridge, MA: MIT Press, 1992. [-3] Cao Hong-qing. Evolutionary Modeling of Complex Systems Based on Genetic Programming. [-Ph.D. Thesis-] Wuhan: Wuhan University, 1999(Ch). [-4-] Guo Tao. Evolutionary Computation and Optimization. EPh. D. Thesis], Wuhan.. Wuhan University, 1999(Ch). ES] Liu Pu. PJVM: Obieet Oriented Parallel and Distributed System Based on Java. Computer Development and Research, 1998, 35(6) :491-495.