Wuhan University Journal of Natural Sciences
Vol. 5 No. 4 2000, 406-412
Article ID: 1007-1202(2000)04-0406-07
Asynchronous Parallel Evolutionary Algorithms for Constrained Optimizations

Kang Li-shan, Liu Pu, Kang Zhuo, Li Yan, Chen Yu-ping
State Key Laboratory of Software Engineering, Wuhan University, Wuhan 430072, China; National Laboratory for Parallel and Distributed Processing
Abstract: Recently Guo Tao proposed a stochastic search algorithm in his PhD thesis for solving function optimization problems. He combined the subspace search method (a general multi-parent recombination strategy) with the population hill-climbing method. The former maintains a global search over the whole space, while the latter ensures the convergence of the algorithm. Guo's algorithm has many advantages, such as the simplicity of its structure, the high accuracy of its results, the wide range of its applications, and the robustness of its use. In this paper a preliminary theoretical analysis of the algorithm is given, and some numerical experiments have been done with Guo's algorithm to demonstrate the theoretical results. Three asynchronous parallel evolutionary algorithms with different granularities for MIMD machines are designed by parallelizing Guo's algorithm.
Key words: asynchronous parallel evolutionary algorithm; function optimization
CLC number: TP 301.6

0 Introduction
There are efficient and effective algorithms for solving function optimization problems, such as genetic algorithms and evolution strategies [1,2]. Recently Guo Tao [3] proposed a new population random search algorithm, based on subspace search (a general multi-parent recombination search strategy) combined with population hill-climbing, for solving function optimization problems with inequality constraints. Numerical experiments show that Guo's algorithm is very efficient and effective, but there has been little theoretical analysis of it, which is unfavourable for its application. In this paper the authors give a preliminary theoretical analysis of the algorithm, based on which three asynchronous parallel algorithms with different granularities are designed for MIMD machines, such as shared-memory multiprocessors and message-passing multicomputers.
1 Guo's Algorithm and Its Theoretical Analysis

For the sake of simplicity, we consider the following function optimization problem:

    min f(X),  X ∈ D                                                    (1)

where f(X) is the objective function, X = (x1, x2, ..., xn)^T ∈ R^n, and D = {X | l_i ≤ x_i ≤ u_i, i = 1, 2, ..., n}.

Select m points X'_i, i = 1, 2, ..., m, from P to form a subspace:

    V = {X ∈ D | X = Σ_{i=1}^{m} a_i X'_i}

where Σ_{i=1}^{m} a_i = 1 and −0.5 ≤ a_i ≤ 1.5.
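As a small illustration (an editor's sketch, not the authors' code), a random point X' ∈ V can be drawn by rejection sampling: draw the first m − 1 coefficients uniformly from [−0.5, 1.5], fix the last one by the sum constraint, and resample until both the last coefficient and the resulting point lie in their admissible sets. The function name `sample_in_subspace` and the box representation of D are assumptions for this sketch.

```python
import random

def sample_in_subspace(parents, lower, upper, max_tries=1000):
    """Draw one point X' from the subspace V spanned by m parent points:
    X' = sum_i a_i * X'_i with sum_i a_i = 1 and -0.5 <= a_i <= 1.5,
    resampling until X' also lies in the box D = [lower, upper]^n."""
    m = len(parents)
    n = len(parents[0])
    for _ in range(max_tries):
        # Rejection-sample coefficients: fix the last one by the sum constraint.
        a = [random.uniform(-0.5, 1.5) for _ in range(m - 1)]
        a_last = 1.0 - sum(a)
        if not -0.5 <= a_last <= 1.5:
            continue
        a.append(a_last)
        x = [sum(a[i] * parents[i][j] for i in range(m)) for j in range(n)]
        # Keep only points inside D, as required by the definition of V.
        if all(lower[j] <= x[j] <= upper[j] for j in range(n)):
            return x
    raise RuntimeError("could not sample a feasible point in V")
```

Because the coefficients sum to 1, identical parents always reproduce themselves, which is a handy sanity check on any implementation.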
Received date: 2000-04-29
Foundation item: Supported by the National Natural Science Foundation of China (No. 70071042; 60073043), the National 863 Hi-Tech Project of China (No. 863-306-ZT06-06-3) and the National Laboratory for Parallel and Distributed Processing.
Biography: Kang Li-shan (1934-), male, Professor, research interests: parallel computing and evolutionary computation. E-mail: kang@whu.edu.cn

Guo's algorithm is described as follows.

ALGORITHM GUO-1:
Begin
    initialize P = {X1, X2, ..., XN}; Xi ∈ D
    t := 0;
    Xbest := arg min f(Xi), 1 ≤ i ≤ N;
    Xworst := arg max f(Xi), 1 ≤ i ≤ N;
    while f(Xbest) ≠ f(Xworst) do
        select m points X'1, X'2, ..., X'm from P randomly;
        select one point X' from V randomly;
        if f(X') < f(Xworst) then Xworst := X';
        t := t + 1;
        Xbest := arg min f(Xi), 1 ≤ i ≤ N;
        Xworst := arg max f(Xi), 1 ≤ i ≤ N;
    end do
    output t, P;
end

where N is the population size, m − 1 is the dimension of the subspace V if the vectors X'_i, i = 1, 2, ..., m, are linearly independent, and t is the number of iterations. Xbest = arg min f(Xi), 1 ≤ i ≤ N, denotes one of the arguments (individuals) which satisfy f(Xbest) = min f(Xi), 1 ≤ i ≤ N.

Remark 1  INITIALIZE P means selecting N points (individuals) from D randomly to form an initial population.
Remark 2  N can be selected according to the dimension n of the problem and the complexity of its landscape. If n is large and f(X) is complex, then N is large too. In our experiments N is chosen as 20 ≤ N ≤ 150.
Remark 3  In our experiments, m is chosen as 7, 8, 9 or 10.
Remark 4  Consider function optimization problems with inequality constraints:

    min f(X),  X ∈ D'                                                   (2)

where D' = {X ∈ D | g_i(X) ≤ 0, i = 1, 2, ..., q}. Denote

    h_i(X) = 0        if g_i(X) ≤ 0,
    h_i(X) = g_i(X)   otherwise,

and

    H(X) = Σ_{i=1}^{q} h_i(X).

Define a logic function

    better(X1, X2) = true    if H(X1) < H(X2),
                     false   if H(X1) > H(X2),
                     true    if (H(X1) = H(X2)) ∧ (f(X1) ≤ f(X2)),
                     false   if (H(X1) = H(X2)) ∧ (f(X1) > f(X2)),

to express that X1 is better than X2. Then Guo's algorithm for solving problem (2) can be described as follows.

ALGORITHM GUO-2:
Begin
    initialize P = {X1, X2, ..., XN}; Xi ∈ D
    t := 0;
    Xbest := arg better(Xbest, X), ∀X ∈ P;
    Xworst := arg better(X, Xworst), ∀X ∈ P;
    while not better(Xworst, Xbest) do
        select m points X'1, X'2, ..., X'm from P randomly;
        select one point X' from V randomly;
        if better(X', Xworst) then Xworst := X';
        t := t + 1;
        Xbest := arg better(Xbest, X), ∀X ∈ P;
        Xworst := arg better(X, Xworst), ∀X ∈ P;
    end do
    output t, P;
end

where Xbest = arg better(Xbest, X), ∀X ∈ P, denotes the individual Xbest ∈ P which satisfies better(Xbest, X) for all X ∈ P.

A preliminary theoretical analysis of Guo's algorithm is as follows:
1) The algorithm adopts a population search strategy, as evolutionary algorithms do, which is useful for finding the global optimum.
2) The algorithm adopts a stochastic search strategy in random subspaces; in particular, the searches in the subspace are not convex recombinations:

    X = Σ_{i=1}^{m} a_i X'_i,  Σ_{i=1}^{m} a_i = 1,  −0.5 ≤ a_i ≤ 1.5,

so that the search subspace covers the multi-parent recombination space and guarantees the ergodicity of the stochastic search. This means that in the solution space there is no region which cannot be searched by the algorithm.
3) The algorithm adopts the selection strategy of eliminating the worst. In every iteration, only the worst individual in the population is eliminated, so the selection pressure is minimal. This ensures the diversity of the population and the long life of the fine individuals in the population. It is a population hill-climbing strategy which drives all the individuals in the population toward the optimum simultaneously. If there exist many solutions, the algorithm may find several of them in one run. In Ref. [4], an extinction-based evolutionary algorithm is proposed which can also be regarded as a population hill-climbing strategy, but that algorithm is totally different.

According to the above three characteristics of the algorithm, if the iteration number t is regarded as the generation of the population, then the sequence {P(t)} forms a Markov chain. Using the analysis method for the convergence of evolutionary algorithms [5], one can obtain a convergence theorem for ALGORITHM GUO, which will be discussed in another paper.
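The pseudocode above can be condensed into a short Python sketch (an editor's reading of ALGORITHM GUO-2, not the authors' C implementation). The constraint machinery H(X) and better() follows Remark 4, the subspace sampling follows the definition of V, and the names `guo2`, `sample_V` and the tuple-based comparison are assumptions of this sketch.

```python
import random

def guo2(f, constraints, lower, upper, N=30, m=7, t_max=100_000):
    """Sketch of ALGORITHM GUO-2: multi-parent subspace recombination
    plus population hill-climbing with elimination of the worst."""
    n = len(lower)

    def H(x):  # total constraint violation, as in Remark 4
        return sum(max(0.0, g(x)) for g in constraints)

    def key(x):  # better(x1, x2) <=> key(x1) <= key(x2): feasibility first
        return (H(x), f(x))

    def sample_V(parents):  # random point of V, resampled until it lies in D
        while True:
            a = [random.uniform(-0.5, 1.5) for _ in range(m - 1)]
            a.append(1.0 - sum(a))
            if not -0.5 <= a[-1] <= 1.5:
                continue
            x = [sum(a[i] * parents[i][j] for i in range(m)) for j in range(n)]
            if all(lower[j] <= x[j] <= upper[j] for j in range(n)):
                return x

    P = [[random.uniform(lower[j], upper[j]) for j in range(n)]
         for _ in range(N)]
    t = 0
    while t < t_max:
        worst = max(range(N), key=lambda i: key(P[i]))
        best = min(range(N), key=lambda i: key(P[i]))
        if key(P[best]) == key(P[worst]):
            break  # terminal criterion: the whole population has converged
        t += 1
        x_new = sample_V(random.sample(P, m))
        if key(x_new) <= key(P[worst]):  # better(X', Xworst)
            P[worst] = x_new
    return t, P
```

With an empty constraint list the same code behaves as ALGORITHM GUO-1, since H(X) is then identically zero and better() reduces to comparing f.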
2 Numerical Experiments
In order to examine the efficiency and effectiveness of the algorithms, two kinds of function optimization problems (problems (1) and (2)) are solved.

Example 1  Minimize the Bump function

    f1(X) = −| ( Σ_{i=1}^{50} cos^4(x_i) − 2 Π_{i=1}^{50} cos^2(x_i) ) / sqrt( Σ_{i=1}^{50} i x_i^2 ) |

for 0 < x_i ≤ 10, i = 1, 2, ..., 50, subject to

    Σ_{i=1}^{50} x_i ≤ 375,   Π_{i=1}^{50} x_i ≥ 0.75.

The landscape of the function is very complicated, and the solution X* lies almost on the 49-D manifold Π_{i=1}^{50} x_i = 0.75.
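A direct transcription of f1 and its two constraints reads as follows (an editor's sketch; the function names `bump` and `feasible` are our own):

```python
import math

def bump(x):
    """The Bump function of Example 1, negated for minimization."""
    s4 = sum(math.cos(xi) ** 4 for xi in x)
    p2 = 2.0                      # accumulates 2 * prod_i cos^2(x_i)
    for xi in x:
        p2 *= math.cos(xi) ** 2
    denom = math.sqrt(sum((i + 1) * xi ** 2 for i, xi in enumerate(x)))
    return -abs((s4 - p2) / denom)

def feasible(x):
    """Box, sum and product constraints of Example 1."""
    prod = 1.0
    for xi in x:
        prod *= xi
    return (sum(x) <= 375.0 and prod >= 0.75
            and all(0.0 < xi <= 10.0 for xi in x))
```

For instance, the point (1, 1, ..., 1) is feasible and has a strictly negative objective value, while (0.1, 0.1, ..., 0.1) violates the product constraint.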
This problem is hard to minimize for any algorithm, including GAs and ESs. We minimize it by ALGORITHM GUO-2 with parameters N = 20 and m = 7, and get the following result:

    min f1(X) = f1(X*) = −0.83526201238794

where X* = (6.28324314967593, 3.16959135278874, 3.15587932289134, 3.14221834166447, 3.12875141132280, 3.11533720178037, 3.10135016319809, 3.08867068084463, 3.07467526362337, 3.06187302198439, 3.04886636373126, 3.03519777233540, 3.02235650101037, 3.00791087929595, 2.99431484723465, 2.98078207510642, 2.96617535809182, 2.95213454834831, 2.93826263614623, 2.92347051874431, 0.48770508444941, 0.48608797212231, 0.48388281752034, 0.48168154669018, 0.47901882085524, 0.47712972009898, 0.47501098097910, 0.47259800785213, 0.47128211104035, 0.46871206605219, 0.46726830592743, 0.46606878481357, 0.46376534354877, 0.46204633230610, 0.45965521445116, 0.45884822857585, 0.45717243025277, 0.45481532957423, 0.45342381481910, 0.45166163748202, 0.45021480667760, 0.44845064437796, 0.44673980402160, 0.44526519733653, 0.44338085480658, 0.44213030342910, 0.44043946842174, 0.43904935652294, 0.43874135178293, 0.43634175968446).

This result is better than all the results reported up to now.

Example 2 [1]  Minimize
    f2(x, y) = −[21.5 + x sin(4πx) + y sin(20πy)],
    D = {(x, y) | −3 ≤ x ≤ 12.1, 4.1 ≤ y ≤ 5.8}.

We minimize it by ALGORITHM GUO-1 with parameters N = 20 and m = 7, and get the following result:

    min f2(x, y) = f2(11.62554470417473, 5.72504424402209) = −38.8502944794474,  (x, y) ∈ D.

This result is better than that in Ref. [1].

Example 3  Minimize
    f3(x, y) = −(|x| + |y| + |xy|),
    D = {(x, y) | −10 ≤ x, y ≤ 10}.

We minimize it by ALGORITHM GUO-1 with parameters N = 20 and m = 7, and find four solutions in one run:

    min f3(x, y) = f3(10, 10) = f3(−10, 10) = f3(10, −10) = f3(−10, −10) = −120.00000000000000,  (x, y) ∈ D.

For solving function optimization problems with inequality constraints, starting from the problem G12(X) in Ref. [6], we design a problem as follows.

Example 4  Minimize

    f4(x1, x2, x3) = [(x1 − 5)² + (x2 − 5)² + (x3 − 5)² − 100]/100,
    D = {(x1, x2, x3) | 0 ≤ x_i ≤ 10, i = 1, 2, 3},

subject to the following constraints:

    Σ_{i=1}^{3} (x_i − j)² ≥ (0.25)²,  j = 1, 2, ..., 9.

The feasible solutions are located in a multiply connected region. It is obvious that there exist infinitely many solutions, located on the spherical surface

    Σ_{i=1}^{3} (x_i − 5)² = (0.25)².
We minimize it by ALGORITHM GUO-2 with different parameters and get different solutions.

Taking N = 30 and m = 3, we get min f4(x1, x2, x3) = −0.99937450000000 with 14-decimal precision, and the 30 individuals in the population are all the same. The average number of iterations is 1473, so the algorithm converges quickly. Taking N = 30 and m = 4, we have the same conclusion, but the average number of iterations is 3255.

Taking N = 30 but 6 ≤ m ≤ 10, we may get fairly good approximate solutions. In these cases the algorithm is always stopped by t = Tmax, where Tmax = 100000, and the 30 individuals in the population are different. This means that we may get 30 approximate solutions in one run.

When N = 30 and m = 5, the algorithm is in a critical state. In most cases the algorithm is stopped by t = Tmax and the 30 individuals in the population are different; in this case the solutions are relatively poor. Sometimes t < Tmax and the 30 individuals in the population are nearly the same (they approach the same point); in this case the solution is fairly good.

Taking N = m = 10, after 10000 iterations we always get 10 identical individuals in the population, but the result is still very poor.

Example 5  Minimize
    f5(X) = 5.3578547 x3² + 0.8356891 x1 x5 + 37.293239 x1 − 40792.141,
    D = {X | 78 ≤ x1 ≤ 102, 33 ≤ x2 ≤ 45, 27 ≤ x_i ≤ 45, i = 3, 4, 5},

subject to three double inequalities:

    0 ≤ 85.334407 + 0.0056858 x2 x5 + 0.0006262 x1 x4 − 0.0022053 x3 x5 ≤ 92,
    90 ≤ 80.51249 + 0.0071317 x2 x5 + 0.0029955 x1 x2 + 0.0021813 x3² ≤ 110,
    20 ≤ 9.300961 + 0.0047026 x3 x5 + 0.0012547 x1 x3 + 0.0019085 x3 x4 ≤ 25.

The known optimum solution [6] is min f5(X) = f5(78.0, 33.0, 29.995, 45.0, 36.776) = −30665.5. Koziel and Michalewicz [6] recently obtained an excellent value of −30664.5 by a new evolutionary algorithm. We solve the problem by ALGORITHM GUO-2 with parameters N = 30 and m = 10, and get the best result:

    min f5(X) = f5(X*) = −30665.53867178332800

where X* = (78.00000000000000, 33.00000000000000, 29.99525602568154, 44.99999999999987, 36.77581290578836).

Example 6 [4]  Minimize
    f6(X) = x1² + x2² − 0.3 cos(3πx1) − 0.4 cos(4πx2) + 0.7,
    D = {X | −1 ≤ x_i ≤ 1, i = 1, 2}.

The global minimum value is f6(0, 0) = 0. We solve it by ALGORITHM GUO-1 with parameters N = 20 and m = 7, and get the solution in t = 1266 iterations:

    f6(0.0000000004, 0.0000000009) = 0.00000000000000.
It is better than the results in Ref. [4].

Numerical experiments show that:
• ALGORITHM GUO can be used for solving very complicated function optimization problems such as f1(X) and f2(x, y).
• ALGORITHM GUO can be used for solving function optimization problems with many solutions, such as f3(x, y) and f4(x1, x2, x3), and can obtain many solutions in one run.
• ALGORITHM GUO can be used for finding high-accuracy solutions. In all examples we get the solutions with at least 7 significant digits.
• ALGORITHM GUO is very efficient compared with other algorithms we used before, such as GAs.
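The reported optima can be spot-checked directly (an editor's sketch; the function names are ours and the tolerances are deliberately loose):

```python
import math

def f2(x, y):  # Example 2
    return -(21.5 + x * math.sin(4 * math.pi * x)
             + y * math.sin(20 * math.pi * y))

def f3(x, y):  # Example 3
    return -(abs(x) + abs(y) + abs(x * y))

def f5(x1, x2, x3, x4, x5):  # Example 5 objective
    return (5.3578547 * x3 ** 2 + 0.8356891 * x1 * x5
            + 37.293239 * x1 - 40792.141)

def f6(x1, x2):  # Example 6
    return (x1 ** 2 + x2 ** 2 - 0.3 * math.cos(3 * math.pi * x1)
            - 0.4 * math.cos(4 * math.pi * x2) + 0.7)
```

Evaluating these at the reported minimizers reproduces the tabulated values to within rounding.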
3 Asynchronous Parallel Algorithms for Problem (1)
Guo's algorithm is a sequential algorithm, but it is very easy to parallelize it into asynchronous parallel algorithms. We define a parallel algorithm for multiprocessors (or multicomputers) as a collection of concurrent processes that may operate simultaneously for solving a given problem [7]. An asynchronous parallel algorithm is a parallel algorithm with the following properties [8]:
1) There is a set of global variables accessible to all processes.
2) When a stage of a process is complete, the process first reads some global variables. Then, based on the values of these variables together with the results just obtained from the last stage, the process modifies some global variables, and then activates the next stage or terminates itself.

In many cases, to ensure logical correctness, the operations on global variables are programmed as critical sections. The global variables (shared data) are stored in the shared memory of a multiprocessor system or in some local memory of a multicomputer system (or computer cluster).

Parallelization Scheme I  We assume that the shared-memory multiprocessor system [9] consists of p + 1 processors p_i (i = 0, 1, ..., p). The global variables P[1:N], X'[1:p] and t are stored in the shared memory. The elements of P and X' are individuals. P[1:N] is used to store the population P = {X1, X2, ..., XN}, and X'[1:p] is used to store the global variables (individuals) X' = {X'1, X'2, ..., X'p}, where X'_i is the variable shared by p0 and p_i. t is the iteration number.
Processor p0 manages the global variables by running the following process G0:

PROCESS G0:
Begin
    t := 0;
    initialize P = {X1, X2, ..., XN}; Xi ∈ D
    initialize X' = {X'1, X'2, ..., X'p}; X'i ∈ D
    evaluate f(Xi), i = 1, 2, ..., N;
    sort P by exchanging;
    while f(P[1]) ≠ f(P[N]) do
        t := t + 1;
        sort P by insertion with X';
    end do
    output t, P;
end

where
1) sorting P by exchanging means sorting the elements of P according to the values of their fitness function;
2) sorting P by insertion with X' means inserting the elements (individuals) of X' into the population P according to the values of their fitness function. In every cycle, p elements are eliminated from the population through this competition. In all cases P[1] = Xbest and P[N] = Xworst.

Processor p_j (j = 1, 2, ..., p) updates the variable X'[j] of the global variables X' in the shared memory by running the following process G_j:

PROCESS G_j:
Begin
    while (t ≥ 0) ∧ (f(P[1]) ≠ f(P[N])) do
        select m_j points X'1, ..., X'_{m_j} from P randomly;
        select X' from V_j randomly
            (that is, X' = Σ_{i=1}^{m_j} a_i X'_i, where Σ_{i=1}^{m_j} a_i = 1 and −0.5 ≤ a_i ≤ 1.5);
        if f(X') < f(X'[j]) then X'[j] := X';
    end do
end

The asynchronous parallel algorithm consists of p + 1 processes G_i, i = 0, 1, ..., p. The distinguishing features of the algorithm are as follows:
1) The algorithm is asynchronous and parallel: the p + 1 processes never wait for inputs at any time but continue or terminate according to whatever information is currently contained in the global variables, namely t, P[1] (Xbest) and P[N] (Xworst).
2) Each process G_j (j = 1, 2, ..., p) performs a hill-climbing method and updates the global variable X'[j], which guarantees the convergence of the algorithm.
3) m1, m2, ..., mp can be equal and can be changed dynamically with t, but they are relatively smaller than N.
4) Processor p0 starts its process G0 first, to initialize the global variables, and finishes its process last, to print the results.
5) The processors p1, p2, ..., pp halt their processes simultaneously.
6) The terminal criterion f(Xbest) = f(Xworst) guarantees that all the individuals in the population are solutions. If problem (1) has many solutions, the algorithm may get several solutions in one run.

Parallelization Scheme II  We assume that the distributed-memory multicomputer system (message-passing multicomputer) consists of p + 1 computers (or nodes) p_i, i = 0, 1, 2, ..., p. Each computer p_i has local memory. The global variables (shared data) P[1:N], X'[1:p] and t are stored, for example, in the local memory of computer p0, instead of the shared memory of Scheme I. In this case, computer p0 plays the part of processor p0 in Parallelization Scheme I, managing the global variables by running process G0. The computers p_j, j = 1, 2, ..., p, update the global variables X' through message passing by running the following processes G'_j, j = 1, 2, ..., p:

PROCESS G'_j:
Begin
    while (t ≥ 0) ∧ (f(P[1]) ≠ f(P[N])) do
        P_j := P;    (get P[1:N] from p0)
        for k = 1 to L
            select m_j points X'1, ..., X'_{m_j} from P_j randomly;
            select X' from V_j randomly
                (that is, X' = Σ_{i=1}^{m_j} a_i X'_i, where Σ_{i=1}^{m_j} a_i = 1 and −0.5 ≤ a_i ≤ 1.5);
            if f(X') < f(P_j[N]) then P_j[N] := X';
            sort P_j by exchanging;
        end for
        X'[j] := P_j[1];    (send X'[j] to p0)
    end do
end

The asynchronous parallel algorithm consists of p + 1 processes: G0, G'1, ..., G'p. The distinguishing features of this algorithm are as follows:
1) The algorithm is asynchronous and parallel: the p + 1 processes never wait for each other at any time.
2) The computational granularity of the algorithm is determined by L; that is, the cost of inter-process communication can be controlled by L. We call L the control parameter.
3) The data exchanged between p0 and p_j by message passing in every iterative cycle are "get P[1:N] from p0" and "send X'[j] to p0".
4) In every iterative cycle, the process G'_j (j = 1, 2, ..., p) always sends its best result to p0 to update X'[j].

Parallelization Scheme III  We assume that all conditions of the computing system are as in Parallelization Scheme II, but the processes G'_j are replaced by the processes G''_j, j = 1, 2, ..., p, as follows:

PROCESS G''_j:
Begin
    while (t > 0) ∧ (f(P[1]) ≠ f(P[N])) do
        select m_j points X'1, X'2, ..., X'_{m_j} from P randomly;
        for k = 1 to L
            select X' = Σ_{i=1}^{m_j} a_i X'_i from V_j randomly;
            Xworst := arg max f(X'_i), 1 ≤ i ≤ m_j;
            if f(X') < f(Xworst) then Xworst := X';
        end for
        X'[j] := arg min f(X'_i), 1 ≤ i ≤ m_j;
    end do
end
The asynchronous parallel algorithm consists of p + 1 processes: G0, G''1, ..., G''p. The distinguishing features of this algorithm are as follows:
1) The algorithm is asynchronous and parallel: the p + 1 processes never wait for each other at any time.
2) The computational granularity of the algorithm is determined by L; that is, the cost of inter-process communication can be controlled by L. We call L the control parameter. In our experience, if m_j = 10 then L ≈ 10000.
3) The data exchanged between p0 and p_j by message passing in every iterative cycle are "get P[1], P[N], t and m_j individuals from p0" and "send X'[j] to p0", so the amount of exchanged data is smaller than that in Scheme II.
4) In every iterative cycle, the process G''_j (j = 1, 2, ..., p) always sends its best result to p0 to update X'[j].
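Scheme I can be sketched with Python threads standing in for processors (a toy illustration of the shared-variable protocol, not the authors' implementation; a real MIMD port would use processes or message passing, and the lock plays the role of the critical sections mentioned above; all names here are our own):

```python
import random
import threading

def scheme_one(f, lower, upper, N=20, p=4, m=7, t_max=500):
    """Toy version of Parallelization Scheme I: worker threads G_1..G_p
    hill-climb into shared slots X'[1:p]; the main thread plays G_0,
    repeatedly sorting P by insertion with the slots."""
    n = len(lower)

    def rand_point():
        return [random.uniform(lower[j], upper[j]) for j in range(n)]

    P = sorted((rand_point() for _ in range(N)), key=f)  # shared population
    X = [rand_point() for _ in range(p)]                 # shared slots X'[1:p]
    lock = threading.Lock()       # critical section for the shared variables
    stop = threading.Event()

    def sample_V(parents):        # random point of the subspace V
        while True:
            a = [random.uniform(-0.5, 1.5) for _ in range(m - 1)]
            a.append(1.0 - sum(a))
            if -0.5 <= a[-1] <= 1.5:
                x = [sum(a[i] * parents[i][j] for i in range(m))
                     for j in range(n)]
                if all(lower[j] <= x[j] <= upper[j] for j in range(n)):
                    return x

    def G(j):                     # worker process G_j: improve X'[j]
        while not stop.is_set():
            with lock:
                parents = random.sample(P, m)
            x = sample_V(parents)
            with lock:
                if f(x) < f(X[j]):
                    X[j] = x

    workers = [threading.Thread(target=G, args=(j,)) for j in range(p)]
    for w in workers:
        w.start()
    t = 0
    while t < t_max:              # manager G_0
        t += 1
        with lock:                # insert X'[1:p], drop the p worst
            P = sorted(P + list(X), key=f)[:N]
            if f(P[0]) == f(P[-1]):
                break
    stop.set()
    for w in workers:
        w.join()
    return t, P
```

Because the workers and the manager touch P and X only inside the lock, any subset of them can run at any speed without corrupting the shared state, which is exactly the asynchronous property described above.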
4 Conclusions

Guo's algorithm is the best algorithm we know of which combines all of the following advantages:
1) Simplicity of structure: less than a hundred lines of C code are needed.
2) Wide range of applications: it can be used for solving very complicated function optimization problems.
3) Robustness of use: one can use it to solve different kinds of function optimization problems without modifying its parameters.
4) High accuracy of results: one can use it to find global approximate optima with more than 7 significant digits of accuracy.
5) High efficiency and effectiveness of its performance.
6) Ease of parallelization for different kinds of parallel computer systems.

The three asynchronous parallel algorithms proposed in this paper have different granularities, but they are adaptively load-balanced; that is, the processes can run in parallel for solving a given problem and need not wait for each other. Scheme I is designed for shared-memory multiprocessor systems, while Scheme II and Scheme III are designed for message-passing multicomputer systems. Theoretically, these algorithms can attain the highest speed-up rate. We plan to carry out experiments to examine their performance.

References:
[1] Michalewicz Z. Genetic Algorithms + Data Structures = Evolution Programs. 3rd ed. New York: Springer-Verlag, 1996.
[2] Schwefel H P. Introduction to Evolution Strategies. 1999 Genetic and Evolutionary Computation Conference Tutorial Program, Orlando, Florida, July 14, 1999. 321-346.
[3] Guo Tao. Evolutionary Computation and Optimization: [PhD Thesis]. Wuhan: Wuhan University, 1999.
[4] Greenwood G W, Fogel G B, Ciobanu M. Emphasizing Extinction in Evolutionary Programming. Proceedings of the 1999 Congress on Evolutionary Computation, 1999. 666-671.
[5] He Jun, Kang Li-shan. The Convergence Rate of Evolutionary Algorithms. Theoretical Computer Science, 1999, 229(1): 23-39.
[6] Koziel S, Michalewicz Z. Evolutionary Algorithms, Homomorphous Mappings, and Constrained Parameter Optimization. Evolutionary Computation, 1999, 7(1): 19-44.
[7] Kung H T. Synchronized and Asynchronous Parallel Algorithms for Multiprocessors. In: Traub J F, ed. Algorithms and Complexity. New York: Academic Press, 1976. 153-200.
[8] Dijkstra E W. Cooperating Sequential Processes. In: Genuys F, ed. Programming Languages. New York: Academic Press, 1968. 43-112.
[9] Hwang Kai. Advanced Computer Architecture. New York: McGraw-Hill, 1993.