Soft Comput DOI 10.1007/s00500-016-2208-9
METHODOLOGIES AND APPLICATION
A hybrid social spider optimization and genetic algorithm for minimizing molecular potential energy function
Mohamed A. Tawhid (1,2) · Ahmed F. Ali (1,3)
© Springer-Verlag Berlin Heidelberg 2016
Abstract  The minimization of the molecular potential energy function is one of the most important real-life problems; it can help to predict the 3D structure of a protein by finding the steady (ground) state of the protein's molecules. In this paper, we propose a new hybrid algorithm between the social spider algorithm and the genetic algorithm in order to minimize a simplified model of the energy function of the molecule. We call the proposed algorithm the hybrid social spider optimization and genetic algorithm (HSSOGA). HSSOGA comprises three main steps. In the first step, we apply the social spider optimization algorithm to balance the exploration and exploitation processes in the proposed algorithm. In the second step, we use dimensionality reduction and population partitioning by dividing the population into subpopulations and applying the arithmetical crossover operator to each subpopulation in order to increase the diversity of the search. In the last step, we apply the genetic mutation operator to the whole population to avoid premature convergence and trapping in local minima. The combination of these three steps helps the proposed algorithm to minimize the molecular potential energy function for different molecule sizes with powerful performance, especially when the problem dimension D > 100. We test it on 13 large-scale unconstrained global optimization problems to investigate its performance on these functions. In order to investigate the efficiency of the proposed HSSOGA, we compare HSSOGA against eight benchmark algorithms when minimizing the potential energy function problem. The numerical results show that the proposed algorithm is promising and efficient and can obtain the global minimum or a near-global minimum of large-scale optimization problems with up to 1000 dimensions, and of the simplified molecular potential energy function with up to 200 degrees of freedom, faster than the other comparative algorithms.

Keywords  Social spider optimization · Genetic algorithm · Molecular energy function · Global optimization

Communicated by V. Loia.

Corresponding author: Mohamed A. Tawhid, [email protected]

1 Department of Mathematics and Statistics, Faculty of Science, Thompson Rivers University, Kamloops, BC V2C 0C8, Canada
2 Department of Mathematics and Computer Science, Faculty of Science, Alexandria University, Moharram Bey, Alexandria 21511, Egypt
3 Department of Computer Science, Faculty of Computers and Informatics, Suez Canal University, Ismailia, Egypt
1 Introduction

The minimization of the potential energy function can be formulated as a global optimization problem. Finding the steady (ground) state of the molecules in a protein can help to predict the 3D structure of the protein, which in turn helps to determine the function of the protein. The difficulty of the problem arises from the number of local minimizers of the energy function, which grows exponentially with the size of the molecule (Wales and Scheraga 1999). Many researchers have applied their methods to this problem, such as Barbosa et al. (2005), Dražić et al. (2008), Floudas et al. (1999), Hedar et al. (2011), Kovačević-Vujčić et al. (2004), Pardalos et al. (1994) and Wales and Scheraga (1999). The social spider optimization (SSO) algorithm is a population-based algorithm proposed by Cuevas et al. (2013).
In this paper, we propose a new hybrid social spider optimization and genetic algorithm (HSSOGA) in order to solve the molecular potential energy function minimization problem. We refer to the proposed algorithm as HSSOGA with population partitioning. The proposed HSSOGA algorithm comprises three main steps. In the first step, we apply the social spider algorithm, with its powerful balance of the exploration and exploitation processes. In the second step, we divide the population into subpopulations and apply the arithmetical crossover operator to each subpopulation. The partitioning idea can improve the diversity of the search in the proposed algorithm. In the last step, we apply the genetic algorithm mutation operator to the whole population in order to avoid premature convergence. The combination of these three steps accelerates the search and helps the algorithm to reach the optimal or a near-optimal solution in reasonable time. In order to investigate the general performance of the proposed algorithm, we test it on a scalable simplified molecular potential energy function with well-known properties established in Lavor and Maculan (2004). The rest of the paper is organized as follows. We review the related work in Sect. 2. In Sect. 3, we present the definitions of the molecular energy function and the large-scale unconstrained optimization problem, and we describe the genetic algorithm. In Sect. 4, we overview the standard SSO algorithm and describe the proposed algorithm. In Sect. 5, we present the numerical experimental results. We end by giving the conclusions and future work in Sect. 6.
2 Related work

One of the most popular meta-heuristic algorithms for solving global optimization problems is the genetic algorithm (GA) (Holland 1975). GA has various efficient operators, such as crossover and mutation, and due to the efficiency of these operators many researchers have applied them in their work. GAs have been combined with other swarm intelligence (SI) algorithms in order to improve the performance of these algorithms. The particle swarm optimization (PSO) algorithm is the one most often combined with GA; for example, the authors in Robinson et al. (2002) proposed two hybrid PSO and GA algorithms, called GA–PSO and PSO–GA, respectively, in order to solve a particular electromagnetic application of a profiled corrugated horn antenna. In the GA–PSO algorithm, the GA generates the initial population for PSO, while in the PSO–GA algorithm, PSO is used to generate an initial population for GA. The final results showed that the PSO–GA hybrid algorithm outperforms the simple PSO and GA versions. Also, in Krink and Lvbjerg (2002), PSO was combined with GA and hill-climbing approaches and applied to solve unconstrained global optimization problems. In that work, the population is partitioned into
subpopulations and the hybrid algorithm is applied to different subpopulations of individuals, in which each individual is dynamically assigned according to some pre-designed rules. In Grimaldi et al. (2004), the authors proposed a hybrid algorithm combining the GA and PSO algorithms to solve combinatorial optimization problems. The proposed algorithm was called genetical swarm optimization (GSO). In GSO, at each iteration, the population is divided into two parts and evolved with the GA and PSO algorithms. GSO applies a new parameter HC (the hybridization constant) that expresses the percentage of the population that is evolved with GA at every iteration. In Grimaccia et al. (2007), GSO used several hybridization strategies (static, dynamic, alternate, self-adaptive, etc.) for solving some multi-modal benchmark problems. Also, PSO and GA were combined in the work of Juang (2004) to design neural/fuzzy networks, where the upper half of the best-performing individuals in a population is regarded as elite. In the proposed algorithm, the elite individuals are enhanced by means of PSO, instead of being reproduced directly to the next generation. In Settles and Soule (2005), a new hybrid of PSO and GA was proposed to solve four unconstrained optimization problems with various dimensions. The proposed algorithm was called breeding swarm (BS). The BS algorithm combines the standard velocity and position update rules of PSO with the GA operators. In Jian and Chen (2006), the authors introduced a hybrid of PSO with the GA crossover operator and dynamic linkage discovery, called PSO-RDL, for solving 25 unconstrained test problems with various degrees of complexity. In Esmin et al. (2006), the authors proposed a hybrid PSO and GA algorithm to solve unconstrained global optimization problems by combining the PSO algorithm with the GA mutation operator only.
To the best of our knowledge, most hybrids of GA with other SI algorithms have been applied to solve unconstrained optimization problems of low and medium dimensions (D < 100). In this work, we apply the proposed hybrid of the SSO algorithm and GA in order to solve large-scale unconstrained optimization problems and to minimize the molecular potential energy function as an application. The main difference between the SSO algorithm and other SI algorithms is that the individuals (solutions) in the SSO algorithm are processed separately according to their characteristics (gender). The female spiders achieve a wide exploration process, while the male spiders achieve an extensive exploitation process. So one part of the solutions is responsible for exploration and the other part is responsible for exploitation. The operators in SSO share the same communication mechanism, carrying the important information of the evolutionary process. This information and the global modification adjust the influence of each operator. These facts help the SSO algorithm to avoid the premature convergence and the incorrect exploration–exploitation balance that are present in other SI algorithms such as PSO and ABC. The SSO
algorithm has self-organization and labor-division techniques which are used to update its solutions at each iteration. In the PSO algorithm, the new position of each particle (solution) is updated in the next iteration by moving toward the position of the best particle found so far, while in the ABC algorithm a new solution is produced by modifying a randomly selected solution. These behaviors can cause the entire population to concentrate around the best particle or to diverge without control as the algorithm evolves (Banharnsakun et al. 2011; Wang et al. 2013). In SSO, however, the self-organization technique helps each solution to respond to local stimuli individually while working together to accomplish a global task, and the labor-division technique avoids centralized supervision.
3 Definition of the problems and genetic algorithm

In this section, we present the definitions of the molecular potential energy function and the large-scale unconstrained global optimization problem, and we describe the genetic algorithm.

3.1 Molecular potential energy function

The potential energy of a molecule is derived from molecular mechanics, which describes molecular interactions based on the principles of Newtonian physics. An empirically derived set of potential energy contributions is used for approximating these molecular interactions. The molecular model considered here consists of a chain of m atoms centered at a_1, ..., a_m in a three-dimensional space and is similar to the model described in Bansal et al. (2010), Barbosa et al. (2005), Dražić et al. (2008), Hedar et al. (2011), Deep et al. (2012) and the references therein. In this paper, we consider the following expression for the potential energy as a function of the torsion angles:

E = Σ_i [ 1 + cos(3ω_{i,i+3}) + (−1)^i / sqrt(10.60099896 − 4.141720682 cos(ω_{i,i+3})) ]   (1)

where i = 1, ..., m − 3, m is the number of atoms in the given system and ω_{i,i+3} is the angle, called the torsion angle, between the normals through the planes determined by the atoms a_i, a_{i+1}, a_{i+2} and a_{i+1}, a_{i+2}, a_{i+3}, as shown in Fig. 1c. Our goal is to find ω_{1,4}, ω_{2,5}, ..., ω_{m−3,m}, where ω_{i,j} ∈ [0, 5], i.e., the global minimum of the function E in (1). It is known that even for small molecules E is a nonconvex function involving numerous local minimizers. Despite these simplifications, the problem remains very difficult: a molecule with as few as 30 atoms has 2^27 = 134,217,728 local minimizers.

Finally, in order to be consistent with the notation for the large-scale unconstrained global optimization problem in this paper, we rewrite E in (1) as f(x), defined as

f(x) = Σ_{i=1}^{n} [ 1 + cos(3x_i) + (−1)^i / sqrt(10.60099896 − 4.141720682 cos(x_i)) ]   (2)

with 0 ≤ x_i ≤ 5, i = 1, ..., n.

3.2 Large-scale unconstrained global optimization problem

We consider the large-scale unconstrained global optimization problem

min f(x), x ∈ R^n   (3)

where f: R^n → R is an objective function and x represents a decision variable vector. Also, x ∈ S ⊂ R^n, where S denotes the search space, which is n-dimensional and bounded by the parametric constraints

l_i ≤ x_i ≤ u_i,  i = 1, 2, ..., n   (4)

where l_i and u_i are the lower and upper bounds of the decision variable x_i, respectively. Many researchers have solved the problem in (3) via several algorithms such as the steepest descent, Newton and conjugate gradient methods, see, e.g., Yang et al. (2005), where the differentiability of the objective function is required. Nonetheless, computing the derivative (gradient) of the objective function is difficult and expensive. Also, the objective function in most real-life applications might be nonsmooth. This has motivated many researchers to develop stochastic global optimization algorithms such as SI algorithms.

3.3 Genetic algorithm
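Before turning to the genetic algorithm components, note that the simplified potential f(x) in Eq. (2) is straightforward to evaluate numerically. A minimal Python sketch follows; the function name and the example molecule size are our own choices, not part of the paper:

```python
import math

def molecular_energy(x):
    """Simplified molecular potential energy f(x) of Eq. (2).

    x: list of torsion angles with 0 <= x[i] <= 5; x[0] plays the role
    of the i = 1 term in Eq. (2).
    """
    total = 0.0
    for i, xi in enumerate(x, start=1):  # i = 1, ..., n as in Eq. (2)
        total += 1.0 + math.cos(3.0 * xi)
        total += (-1.0) ** i / math.sqrt(10.60099896
                                         - 4.141720682 * math.cos(xi))
    return total

# Example: a molecule with m = 7 atoms has m - 3 = 4 torsion angles.
print(round(molecular_energy([0.0, 0.0, 0.0, 0.0]), 6))  # → 8.0
```

At x = 0 every cosine term contributes 1 + cos(0) = 2 and the alternating square-root terms cancel pairwise, so the value is exactly 8.0 for an even number of angles.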
Genetic algorithms are stochastic optimization methods based on concepts of natural selection and genetics (Holland 1975; De Jong 1985; Goldberg 1989). Holland (1975) developed GAs to understand the adaptive processes of natural systems. GAs have been applied to a wide range of problems in diverse fields such as engineering, mathematics and operations research; see De Jong (1985), Goldberg (1989), Michalewicz (1996). A GA starts its search with a random set of solutions coded as binary strings. Every solution is assigned a fitness which is directly related to the objective function of the search and
Fig. 1 a Euclidean distance. b Bond angle. c Torsion (dihedral) angle
optimization problem. Thereafter, the population of solutions is modified to a new population by applying three operators similar to natural genetic operators: selection (reproduction), crossover and mutation. A GA works iteratively by successively applying these three operators in each generation until a termination criterion is satisfied. The selection stage is important in GA in order to choose the individuals from the population for the crossover operator. There are many kinds of selection operators, such as roulette wheel selection, rank selection, tournament selection and elitist selection. The main role of the mutation in GA is to increase the exploration capability of the algorithm and to avoid premature convergence in the population; a random number from (0, 1) is associated with each gene in each individual in the population P, and the genes whose associated number is less than Pm are mutated. The crossover operator is based on n-point or uniform crossover, while the mutation is a bit flipping. The general structure of GA is shown in Algorithm 1.

Arithmetical crossover operator. An arithmetical crossover operator (Michalewicz 1996) is applied on each partition (subpopulation) in the proposed HSSOGA algorithm in order to explore the search space and increase the diversity of the search. Procedure 1 shows the main steps of the arithmetical crossover operator.
Algorithm 1 The structure of genetic algorithm
1: Set the generation counter t := 0.
2: Generate an initial population P^0 randomly.
3: Evaluate the fitness function of all individuals in P^0.
4: repeat
5:   Set t = t + 1.
6:   Select an intermediate population P^t from P^(t−1). {Selection operator}
7:   Associate a random number r from (0, 1) with each row in P^t.
8:   if r < pc then
9:     Apply crossover operator to all selected pairs of P^t. {Crossover operator}
10:    Update P^t
11:  end if
12:  Associate a random number r1 from (0, 1) with each gene in each individual in P^t.
13:  if r1 < pm then
14:    Mutate the gene by generating a new random value for the selected gene within its domain. {Mutation operator}
15:    Update P^t
16:  end if
17:  Evaluate the fitness function of all individuals in P^t.
18: until Termination criteria are satisfied.
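A rough, runnable sketch of Algorithm 1's generational loop, using the arithmetical crossover of Procedure 1 as the crossover operator on real-coded individuals (the energy function in this paper is real-valued). Tournament selection, the simple elitism and all names here are our own choices, not prescribed by the paper:

```python
import random

def ga_sketch(f, bounds, pop_size=20, pc=0.6, pm=0.01, max_iter=100):
    """Minimization GA sketch: tournament selection, arithmetical
    crossover (Procedure 1), uniform re-sampling mutation."""
    pop = [[random.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(pop_size)]

    def select():  # binary tournament: one of several operators the text lists
        a, b = random.sample(pop, 2)
        return a if f(a) < f(b) else b

    for _ in range(max_iter):
        elite = min(pop, key=f)[:]  # simple elitism (our addition)
        new_pop = []
        while len(new_pop) < pop_size:
            p1, p2 = select(), select()
            if random.random() < pc:  # Procedure 1: arithmetical crossover
                lam = random.random()
                c1 = [lam * x + (1 - lam) * y for x, y in zip(p1, p2)]
                c2 = [lam * y + (1 - lam) * x for x, y in zip(p1, p2)]
            else:
                c1, c2 = p1[:], p2[:]
            new_pop.extend([c1, c2])
        # Mutation: re-sample a gene within its domain with probability pm.
        for ind in new_pop:
            for j, (lo, hi) in enumerate(bounds):
                if random.random() < pm:
                    ind[j] = random.uniform(lo, hi)
        pop = new_pop[:pop_size]
        pop[0] = elite
    return min(pop, key=f)

best = ga_sketch(lambda x: sum(v * v for v in x), [(-5.0, 5.0)] * 4,
                 max_iter=200)
```

Because the arithmetical crossover produces convex combinations of the parents, offspring always stay inside the box defined by the parents, which is why no repair step is needed.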
Procedure 1 Crossover(p1, p2)
1. Randomly choose λ ∈ (0, 1).
2. Two offspring c1 = (c1_1, ..., c1_D) and c2 = (c2_1, ..., c2_D) are generated from the parents p1 = (p1_1, ..., p1_D) and p2 = (p2_1, ..., p2_D), where
   c1_i = λ p1_i + (1 − λ) p2_i,
   c2_i = λ p2_i + (1 − λ) p1_i,   i = 1, ..., D.
3. Return.

4 The standard social spider algorithm and proposed algorithm

We present the standard social spider algorithm and the proposed algorithm.

4.1 The standard social spider algorithm

We overview the main concepts and structure of the standard social spider optimization algorithm as follows.
Main concepts and inspiration

A social spider colony consists of two components: the social members and the communal web. The social members are divided into males and females. Each solution within the search space represents a spider position in the communal web. Every spider receives a weight according to the fitness
value of the solution that is symbolized by the social spider. The algorithm models two different search agents (spiders): males and females. Depending on gender, each individual is conducted by a set of different evolutionary operators which mimic the different cooperative behaviors that are commonly assumed within the colony.

An interesting characteristic of social spiders is their highly female-biased populations. In order to emulate this fact, the algorithm starts by defining the number of female and male spiders that will be characterized as individuals in the search space. The social spider algorithm assumes that the entire search space is a communal web, where all the social spiders interact with each other. Each member of the colony cooperates and has a very specific task to perform, such as building and maintaining the communal web, capturing prey and mating (Eric and Yip 2008). Female spiders, which show a major tendency to socialize, present an attraction or dislike toward others. The attraction or dislike of a particular spider toward other spiders develops according to their vibrations, which are based on the weight and distance of the members (Maxence 2010). Male spiders are divided into two classes, dominant and nondominant (Pasquet 1991). Dominant male spiders have better fitness characteristics than nondominant ones. The mating operation allows the information exchange among members, and it is performed by the dominant males and the females. A dominant male mates with one or all females within a specific range to produce offspring.

In the SSO algorithm, the communal web represents the search space, and each solution within the search space represents a spider position. The weight of each spider represents the fitness value of the solution. The algorithm starts by initializing the population S of N spider positions (solutions). The population consists of females f_i and males m_i.
The number of female spiders N_f is randomly selected within the range of 65–90 % of the entire population N and is calculated by the following equation:

N_f = floor[(0.9 − rand(0, 1) · 0.25) · N]   (5)

where rand(0, 1) is a random number between 0 and 1 and floor(·) maps a real number to an integer number. The number of male spiders N_m is calculated as follows:

N_m = N − N_f.   (6)

The female spider position f_i is randomly generated between the lower initial parameter bound p_j^low and the upper initial parameter bound p_j^high as follows:

f_{i,j}^0 = p_j^low + rand(0, 1) · (p_j^high − p_j^low),   i = 1, 2, ..., N_f;  j = 1, 2, ..., n,   (7)

while the male spider position m_k is randomly generated as follows:

m_{k,j}^0 = p_j^low + rand(0, 1) · (p_j^high − p_j^low),   k = 1, 2, ..., N_m;  j = 1, 2, ..., n   (8)

where zero signals the initial population, j, i and k are the parameter and individual indexes, respectively, and f_{i,j} is the jth parameter of the ith female spider position.

In the SSO algorithm, the weight of each spider represents the solution quality. The weight of each solution i is calculated as follows:

w_i = (J(s_i) − worst_s) / (best_s − worst_s)   (9)

where J(s_i) is the fitness value obtained by the evaluation of the spider position s_i with regard to the objective function J(·). The values worst_s and best_s are the maximum and the minimum fitness values in the population, respectively, defined by considering the following minimization problem:

best_s = min_{k ∈ {1,...,N}} J(s_k)  and  worst_s = max_{k ∈ {1,...,N}} J(s_k).   (10)

The information among the colony members is transmitted through the communal web. The information is encoded as small vibrations that are critical for the collective coordination of all individuals in the population. The vibrations depend on the weight and distance of the spider which has generated them. The information (vibrations) perceived by the individual i from member j is modeled as follows:

Vib_{i,j} = w_j · e^(−d_{i,j}^2)   (11)

where d_{i,j} is the Euclidean distance between the spiders i and j. There are three special relationships of the vibrations between any pair of individuals as follows.

– Vibrations Vibc_i. The transmitted information (vibrations) between the individual i and the member c (s_c), which is the nearest member to i with a higher weight, can be defined as

Vibc_i = w_c · e^(−d_{i,c}^2).   (12)

– Vibrations Vibb_i. The transmitted information (vibrations) between the individual i and the member b (s_b), which is the best member in the population S, can be defined as

Vibb_i = w_b · e^(−d_{i,b}^2).   (13)

– Vibrations Vibf_i. The transmitted information (vibrations) between the individual i and the nearest female individual f (s_f) can be defined as

Vibf_i = w_f · e^(−d_{i,f}^2).   (14)

We present the vibrations Vibc_i, Vibb_i and Vibf_i in Fig. 2a–c, respectively.

Fig. 2 Configuration of each special relation: a Vibc_i, b Vibb_i and c Vibf_i

The female spiders present an attraction or dislike toward others, irrespective of gender. The movement of attraction or repulsion depends on several random phenomena. A uniform random number r_m is generated within the range [0, 1]. If r_m is smaller than a threshold PF, an attraction movement is generated; otherwise, a repulsion movement is produced, as follows:

f_i^(t+1) = f_i^t + α · Vibc_i · (s_c − f_i^t) + β · Vibb_i · (s_b − f_i^t) + δ · (rand − 0.5)   with probability PF,
f_i^(t+1) = f_i^t − α · Vibc_i · (s_c − f_i^t) − β · Vibb_i · (s_b − f_i^t) + δ · (rand − 0.5)   with probability 1 − PF,

where α, β, δ and rand are random numbers in [0, 1] and t is the iteration number.

A dominant male spider (D) has a weight value above the median weight of the male population, and a nondominant male spider (ND) has a weight under the median; the index N_f + m locates the median weight w_{N_f+m}. The following equation gives the new position of the male spider:

m_i^(t+1) = m_i^t + α · Vibf_i · (s_f − m_i^t) + δ · (rand − 0.5),   if w_{N_f+i} > w_{N_f+m},
m_i^(t+1) = m_i^t + α · ( Σ_{h=1}^{N_m} m_h^t · w_{N_f+h} / Σ_{h=1}^{N_m} w_{N_f+h} − m_i^t ),   otherwise,

where the individual s_f represents the nearest female individual to the male member i.

The dominant males and the female members perform the mating in a social spider colony. A dominant male spider m_g locates a set E_g of female members within a specific range r (the range of mating), where r is defined by the following equation:

r = Σ_{j=1}^{n} (p_j^high − p_j^low) / (2n).   (15)

The spider holding a heavier weight is more likely to influence the new offspring. The roulette wheel method assigns the influence probability Ps_i of each member involved in the mating operation as follows:

Ps_i = w_i / Σ_{j ∈ T^t} w_j.   (16)

4.1.1 Social spider optimization algorithm

We present in detail the main steps of the SSO algorithm as shown in Algorithm 2.

– Step 1. The algorithm starts by setting the initial values of the number of solutions N in the population S, the threshold PF and the maximum number of iterations Maxitr. (Line 1)
– Step 2. The numbers of female and male spiders are set as in (5) and (6). (Line 2)
– Step 3. The initial iteration counter is initialized. (Line 3)
– Step 4. The initial population is randomly generated for the female and the male solutions. (Lines 4–13)
– Step 5. The following process is repeated until the termination criteria are satisfied.
  – Step 5.1. Each solution in the population is evaluated by calculating its weight (fitness function) as in (9). (Lines 15–17)
  – Step 5.2. The female spiders are moved according to the female cooperative operator after calculating the vibrations of the local and global best spiders as in (12) and (13). (Lines 18–25)
  – Step 5.3. The male spiders are moved according to the male cooperative operator after calculating the median male individual w_{N_f+m} from all male spiders. (Lines 27–34)
  – Step 5.4. The mating operation is performed after calculating the radius of mating as in (15). (Lines 35–46)
– Step 6. The number of iterations is increased. (Line 47)
– Step 7. The best obtained solution is produced. (Line 49)

4.2 The proposed HSSOGA algorithm

In the proposed HSSOGA algorithm, we use the SSO algorithm, with its strong capability of performing the exploration and exploitation processes, and the GA operators to avoid premature convergence and trapping in local minima. The proposed HSSOGA algorithm is based on three mechanisms to minimize the molecular potential energy function, which is an application of large-scale unconstrained global optimization. We can summarize these mechanisms as follows.

– Space searching. The standard social spider algorithm has two types of solutions, the female and the male spiders. The female spiders achieve a wide exploration process, while the male spiders achieve an extensive exploitation process.
– Dimension reduction. In this mechanism, the population is partitioned into subpopulations, and the arithmetical crossover is applied to a limited number of variables in each subpopulation. This mechanism allows the proposed algorithm to diversify the search process in each iteration. Moreover, searching a limited number of variables prevents the proposed algorithm from wandering in the search space, especially in large-dimensional spaces.
– Premature convergence avoidance. The proposed algorithm applies the GA mutation operator to the whole population at each iteration in order to avoid premature convergence.
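As a concrete illustration of the weight and vibration model of Eqs. (9)–(14) and the female cooperative operator, here is a minimal Python sketch. The function names, the flat-population guard, and the use of scalar random factors applied across all dimensions are our own choices:

```python
import math
import random

def weights(fitness):
    """Eqs. (9)-(10): map fitness values to weights (minimization).
    The best solution gets weight 1, the worst gets weight 0."""
    best, worst = min(fitness), max(fitness)
    span = (best - worst) or 1.0  # guard against a flat population (our choice)
    return [(J - worst) / span for J in fitness]

def vibration(w_j, pos_i, pos_j):
    """Eq. (11): Vib_{i,j} = w_j * exp(-d_{i,j}^2)."""
    d2 = sum((a - b) ** 2 for a, b in zip(pos_i, pos_j))
    return w_j * math.exp(-d2)

def move_female(f_i, s_c, w_c, s_b, w_b, PF=0.7):
    """Female cooperative operator: attraction toward the nearest better
    spider s_c and the global best s_b with probability PF, else repulsion."""
    vib_c = vibration(w_c, f_i, s_c)
    vib_b = vibration(w_b, f_i, s_b)
    alpha, beta, delta = (random.random() for _ in range(3))
    sign = 1.0 if random.random() < PF else -1.0
    return [x + sign * (alpha * vib_c * (c - x) + beta * vib_b * (b - x))
            + delta * (random.random() - 0.5)
            for x, c, b in zip(f_i, s_c, s_b)]
```

The exponential decay in Eq. (11) means a spider effectively hears only nearby, well-weighted members, which is what keeps the attraction local rather than always pulling toward one global point.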
Algorithm 2 Social spider optimization algorithm
1: Set the initial value of the total number of solutions N in the population S, the threshold PF and the maximum number of iterations Maxitr.
2: Set the number of female spiders N_f and the number of male spiders N_m as in (5) and (6).
3: Set t := 0. {Counter initialization}
4: for (i = 1; i < N_f + 1; i++) do
5:   for (j = 1; j < n + 1; j++) do
6:     f_{i,j}^t = p_j^low + rand(0, 1) · (p_j^high − p_j^low)
7:   end for
8: end for {Randomly initialize the female spiders}
9: for (k = 1; k < N_m + 1; k++) do
10:   for (j = 1; j < n + 1; j++) do
11:     m_{k,j}^t = p_j^low + rand(0, 1) · (p_j^high − p_j^low)
12:   end for
13: end for {Randomly initialize the male spiders}
14: repeat
15:   for (i = 1; i < N + 1; i++) do
16:     w_i = (J(s_i) − worst_s)/(best_s − worst_s)
17:   end for {Evaluate the weight (fitness function) of each spider}
18:   for (i = 1; i < N_f + 1; i++) do
19:     Calculate the vibrations of the best local and best global solutions Vibc_i and Vibb_i as in (12) and (13).
20:     if (r_m < PF) then
21:       f_i^(t+1) = f_i^t + α·Vibc_i·(s_c − f_i^t) + β·Vibb_i·(s_b − f_i^t) + δ·(rand − 0.5)
22:     else
23:       f_i^(t+1) = f_i^t − α·Vibc_i·(s_c − f_i^t) − β·Vibb_i·(s_b − f_i^t) + δ·(rand − 0.5)
24:     end if
25:   end for
26:   Find the median male individual (w_{N_f+m}) from M.
27:   for (i = 1; i < N_m + 1; i++) do
28:     Calculate Vibf_i as in (14)
29:     if (w_{N_f+i} > w_{N_f+m}) then
30:       m_i^(t+1) = m_i^t + α·Vibf_i·(s_f − m_i^t) + δ·(rand − 0.5)
31:     else
32:       m_i^(t+1) = m_i^t + α·( Σ_{h=1}^{N_m} m_h^t·w_{N_f+h} / Σ_{h=1}^{N_m} w_{N_f+h} − m_i^t )
33:     end if
34:   end for
35:   Calculate the radius of mating r, where r = Σ_{j=1}^{n} (p_j^high − p_j^low)/(2n). {Perform the mating operation}
36:   for (i = 1; i < N_m + 1; i++) do
37:     if (m_i ∈ D) then
38:       Find E_i
39:       if E_i is not empty then
40:         Form s_new using the roulette method
41:         if (w_new > w_wo) then
42:           Set s_wo = s_new.
43:         end if
44:       end if
45:     end if
46:   end for
47:   t = t + 1
48: until (t > Maxitr). {Termination criteria are satisfied}
49: Produce the best solution.
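Lines 35–46 of Algorithm 2 (the mating operation) can be sketched as follows. The gene-by-gene roulette inheritance is our reading of "the roulette method", and all function names are our own:

```python
import random

def mating_radius(bounds):
    """Eq. (15): r = sum_j (p_high_j - p_low_j) / (2 n)."""
    n = len(bounds)
    return sum(hi - lo for lo, hi in bounds) / (2.0 * n)

def roulette_offspring(members, member_weights):
    """Eq. (16): each member influences the offspring with probability
    Ps_i = w_i / sum_j w_j.

    members: the dominant male plus the females found within radius r;
    each gene of the offspring is inherited from a roulette-selected member.
    """
    total = sum(member_weights) or 1.0  # guard against all-zero weights
    probs = [w / total for w in member_weights]
    dim = len(members[0])
    child = []
    for j in range(dim):
        pick = random.random()
        acc = 0.0
        for member, p in zip(members, probs):
            acc += p
            if pick <= acc:
                child.append(member[j])
                break
        else:  # numerical fallback if rounding left pick > acc
            child.append(members[-1][j])
    return child
```

In Algorithm 2 the resulting offspring s_new replaces the worst spider s_wo only when its weight w_new is higher, so mating can never degrade the population's best members.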
We present the main structure of the proposed HSSOGA algorithm in Algorithm 3 and the main steps of the proposed algorithm as follows.
Algorithm 3 Hybrid social spider optimization and genetic algorithm
1: Set the initial values of the population size S with N solutions, the threshold probability PF, the crossover probability Pc, the mutation probability Pm, the partition number part_no, the number of variables in each partition ν, the number of solutions in each partition η and the maximum number of iterations Maxitr.
2: Set t := 0. {Counter initialization}
3: for (i = 1; i ≤ N) do
4:   Generate an initial population containing individuals s_i^(t) randomly.
5:   Evaluate the fitness function of each search agent (solution) f(s_i^(t)).
6: end for
7: repeat
8:   Apply the standard social spider algorithm as shown in Algorithm 2 on the whole population.
9:   Apply the selection operator of the GA on the whole population.
10:   Partition the population S^(t) into part_no sub-partitions, where each sub-partition size is ν × η.
11:   for (i = 1; i ≤ part_no) do
12:     Apply the arithmetical crossover as shown in Procedure 1 on each sub-partition.
13:   end for
14:   Apply the GA mutation operator on the whole population S^(t).
15:   Update the solutions in the population S^(t).
16:   Set t = t + 1.
17: until (t > Maxitr). {Termination criteria are satisfied}
18: Produce the best solution.
– Step 1. The proposed HSSOGA algorithm starts by setting its parameter values, such as the population size S with N solutions, the threshold probability PF, the crossover probability Pc, the mutation probability Pm, the partition number part_no, the number of variables in each partition ν, the number of solutions in each partition η and the maximum number of iterations Maxitr. (Line 1)
– Step 2. The iteration counter t is initialized, the initial population is generated randomly, and each solution in the population is evaluated. (Lines 2–6)
– Step 3. The following steps are repeated until the termination criteria are satisfied.
  – Step 3.1. The new solutions s_i^(t) are generated by applying the standard SSO algorithm on the whole population. (Line 8)
  – Step 3.2. An intermediate population is selected from the current one by applying the GA selection operator. (Line 9)
  – Step 3.3. In order to increase the diversity of the search and overcome the dimensionality problem, the current population is partitioned into part_no subpopulations, where each subpopulation size is ν × η. (Line 10)
  – Step 3.4. The arithmetical crossover operator is applied on each subpopulation. (Lines 11–13)
  – Step 3.5. The genetic mutation operator is applied on the whole population in order to avoid premature convergence. (Line 14)
– Step 4. The solutions in the population are evaluated by calculating their fitness functions, the iteration counter t is increased, and the overall process is repeated until the termination criteria are satisfied. (Lines 15–17)
– Step 5. Finally, the best found solution is produced. (Line 18)
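The partition-then-crossover step of Algorithm 3 (ν variables by η solutions per sub-partition) and the whole-population mutation can be sketched as below. The pairing of parents within a partition and all names are our own choices:

```python
import random

def partition_crossover(pop, nu=5, eta=5, pc=0.6):
    """Partition the population into blocks of eta solutions x nu variables
    and apply the arithmetical crossover (Procedure 1) inside each block,
    touching only that block's nu variables."""
    dim = len(pop[0])
    for row in range(0, len(pop), eta):       # blocks of eta solutions
        block = pop[row:row + eta]
        for col in range(0, dim, nu):         # blocks of nu variables
            if len(block) < 2 or random.random() >= pc:
                continue
            p1, p2 = random.sample(block, 2)  # our pairing choice
            lam = random.random()
            for j in range(col, min(col + nu, dim)):
                # tuple assignment: both RHS terms use the old values
                p1[j], p2[j] = (lam * p1[j] + (1 - lam) * p2[j],
                                lam * p2[j] + (1 - lam) * p1[j])

def mutate(pop, bounds, pm=0.01):
    """GA mutation on the whole population: re-sample a gene with prob. pm."""
    for ind in pop:
        for j, (lo, hi) in enumerate(bounds):
            if random.random() < pm:
                ind[j] = random.uniform(lo, hi)
```

Restricting each crossover to ν variables is what the text calls dimension reduction: every application of Procedure 1 perturbs only a small slice of a high-dimensional solution instead of the whole vector.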
5 Numerical experiments

We test the HSSOGA on a simplified model of the molecular potential energy function with different sizes and compare the results of the proposed algorithm against nine benchmark algorithms. We program HSSOGA in MATLAB and take the results of the comparative algorithms from their original papers. In the following subsections, we report the parameter settings of the proposed algorithm in detail. Also, we present the performance analysis of the proposed algorithm and the comparative results between it and the other algorithms.

5.1 Parameter setting

In Table 1, we summarize the parameters of the HSSOGA algorithm with their assigned values. These values are based on common settings in the literature or were determined through our preliminary numerical experiments.

– Population size S. The experimental tests show that the best population size is S = 25; increasing this number increases the number of function evaluations without any improvement in the obtained results.
– Threshold probability PF. We apply the standard SSO value of the threshold probability.
– Probability of crossover Pc. We apply the arithmetical crossover operator to each partition in the population, and we found that the best value of the crossover probability is 0.6.
– Probability of mutation Pm. In order to avoid premature convergence, we apply mutation on the whole population with value 0.01.
– Partitioning variables ν and η. It turns out that the best subpopulation size is ν × η, where ν and η both equal 5.

5.2 Unconstrained test problems
Table 1  Parameter setting

Parameters   Definitions                              Values
S            Population size                          20
PF           Threshold probability                    0.7
Pc           Crossover rate                           0.6
Pm           Mutation rate                            0.01
ν            Number of variables in each partition    5
η            Number of solutions in each partition    5
Before testing the general performance of the proposed algorithm with different molecule sizes, we test it on 13 benchmark functions and report the results in Tables 2 and 3: seven uni-modal functions in Table 2 and six multi-modal functions in Table 3.

5.3 The general performance of the proposed HSSOGA on large-scale unconstrained global optimization problems

In order to investigate the general performance of the proposed algorithm, we test it on these 13 large-scale unconstrained global optimization problems. In Tables 4 and 5, we report the average and the standard deviation of the function evaluations of these functions at dimensions 30, 100, 400 and 1000 over 30 runs. A run is successful if the algorithm reaches the global minimum within an error of 10−4 before 25,000, 50,000, 125,000 and 300,000 function evaluations for dimensions 30, 100, 400 and 1000, respectively. The results in parentheses are the means and standard deviations of the function values when the algorithm reaches the allowed number of function evaluations without obtaining the desired optimal solution. The results in Tables 4 and 5 show that the proposed algorithm can obtain the optimal or near-optimal solution for most of the functions, except functions f5 and f8. Also, we present the general performance of the proposed algorithm in Figs. 3 and 4 by plotting the function values versus the iteration number: f1 and f4 represent the uni-modal functions, and f9 and f10 represent the multi-modal functions, with dimensions 30, 100, 400 and 1000. Figures 3 and 4 show that the proposed algorithm is promising and can solve large-scale unconstrained optimization problems efficiently and in reasonable time.
5.4 The general performance of the proposed HSSOGA for minimizing the potential energy function

We test the general performance of the proposed algorithm on a simplified model of the molecule with dimensions from 20 to 200 by plotting the function values (mean error) versus the number of iterations (function evaluations), as shown in Fig. 5. The solid line represents the
Table 2  Uni-modal test functions

Test function                                                       S                  f_opt
f1(X) = Σ_{i=1}^{d} x_i^2                                           [−100, 100]^d      0
f2(X) = Σ_{i=1}^{d} |x_i| + Π_{i=1}^{d} |x_i|                       [−10, 10]^d        0
f3(X) = Σ_{i=1}^{d} (Σ_{j=1}^{i} x_j)^2                             [−100, 100]^d      0
f4(X) = max_i |x_i|, 1 ≤ i ≤ d                                      [−100, 100]^d      0
f5(X) = Σ_{i=1}^{d−1} [100(x_{i+1} − x_i^2)^2 + (x_i − 1)^2]        [−30, 30]^d        0
f6(X) = Σ_{i=1}^{d} ([x_i + 0.5])^2                                 [−100, 100]^d      0
f7(X) = Σ_{i=1}^{d} i·x_i^4 + random[0, 1)                          [−1.28, 1.28]^d    0
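A few of Table 2's functions translate directly to Python for reference (the dimension d is implicit in the length of x; the function names are ours):

```python
def f1_sphere(x):
    """f1: sum of squares."""
    return sum(v * v for v in x)

def f2_schwefel_222(x):
    """f2: sum of absolute values plus their product."""
    s, p = 0.0, 1.0
    for v in x:
        s += abs(v)
        p *= abs(v)
    return s + p

def f5_rosenbrock(x):
    """f5: Rosenbrock's valley, minimum 0 at x = (1, ..., 1)."""
    return sum(100 * (x[i + 1] - x[i] ** 2) ** 2 + (x[i] - 1) ** 2
               for i in range(len(x) - 1))
```

Each has the global optimum f_opt = 0 listed in the table: f1 and f2 at the origin, f5 at the all-ones point.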
Table 3  Multi-modal test functions

Test function                                                                     S                  f_opt
f8(X) = Σ_{i=1}^{d} −x_i sin(√|x_i|)                                              [−500, 500]^d      −418.9829 × d
f9(X) = Σ_{i=1}^{d} [x_i^2 − 10 cos(2πx_i) + 10]                                  [−5.12, 5.12]^d    0
f10(X) = −20 exp(−0.2 √((1/d) Σ_{i=1}^{d} x_i^2)) − exp((1/d) Σ_{i=1}^{d} cos(2πx_i)) + 20 + e     [−32, 32]^d    0
f11(X) = (1/4000) Σ_{i=1}^{d} x_i^2 − Π_{i=1}^{d} cos(x_i/√i) + 1                 [−600, 600]^d      0
f12(X) = (π/d){10 sin^2(πy_1) + Σ_{i=1}^{d−1} (y_i − 1)^2 [1 + 10 sin^2(πy_{i+1})] + (y_d − 1)^2} + Σ_{i=1}^{d} u(x_i, 10, 100, 4)     [−50, 50]^d    0
    where y_i = 1 + (x_i + 1)/4 and
    u(x_i, a, k, m) = k(x_i − a)^m    if x_i > a
                      0               if −a ≤ x_i ≤ a
                      k(−x_i − a)^m   if x_i < −a
f13(X) = 0.1{sin^2(3πx_1) + Σ_{i=1}^{d−1} (x_i − 1)^2 [1 + sin^2(3πx_{i+1})] + (x_d − 1)^2 [1 + sin^2(2πx_d)]} + Σ_{i=1}^{d} u(x_i, 5, 100, 4)     [−50, 50]^d    0
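Two of Table 3's multi-modal functions, Rastrigin (f9) and Ackley (f10), are likewise short in Python (names ours); both attain f_opt = 0 at the origin:

```python
import math

def f9_rastrigin(x):
    """f9: Rastrigin's function, highly multi-modal."""
    return sum(v * v - 10 * math.cos(2 * math.pi * v) + 10 for v in x)

def f10_ackley(x):
    """f10: Ackley's function; the two exponentials cancel at the origin,
    giving -20 - e + 20 + e = 0."""
    d = len(x)
    s1 = sum(v * v for v in x) / d
    s2 = sum(math.cos(2 * math.pi * v) for v in x) / d
    return -20 * math.exp(-0.2 * math.sqrt(s1)) - math.exp(s2) + 20 + math.e
```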
Table 4  General performance of the proposed HSSOGA on uni-modal functions

d           f1        f2        f3        f4        f5          f6        f7
30    Ave   1200.35   1515.14   1875.25   1917.27   (28.23)     650.23    4200.18
      SD    60.14     85.45     335.26    64.12     (0.65)      1.15      101.17
100   Ave   1525.23   1625.34   2450.36   2125.16   (112.45)    925.35    7300.39
      SD    45.21     71.15     385.16    59.17     (1.19)      2.14      95.17
400   Ave   1725.16   1919.23   3355.36   2775.16   (529.05)    1150.23   10,050
      SD    51.13     65.25     391.16    55.15     (27.16)     5.17      (112.37)
1000  Ave   2115.39   3716.18   4715.49   3925.48   (1411.79)   2516.41   19,825
      SD    61.25     75.13     344.59    51.34     (35.17)     6.15      (215.31)

Table 5  General performance of the proposed HSSOGA on multi-modal functions

d           f8             f9        f10       f11       f12       f13
30    Ave   (−11,372.9)    1275.15   1634.48   1227.34   1258.24   2149.65
      SD    (215.15)       251.17    139.15    36.15     35.15     41.31
100   Ave   (−10,817.14)   1656.23   1815.37   1475.23   3722.15   4151.16
      SD    (325.19)       35.15     35.75     45.16     291.17    445.17
400   Ave   (−29,571.15)   2215.47   1725.46   1836.48   4816.63   5513.17
      SD    (415.15)       31.18     37.26     41.59     412.07    571.45
1000  Ave   (−20,858.1)    3175.47   3715.61   3975.46   6914.29   8517.19
      SD    (545.26)       45.16     97.25     51.75     456.16    395.24
Fig. 3 General performance of HSSOGA on large-scale uni-modal global optimization problems
performance of the standard SSO algorithm, while the dotted line represents the proposed HSSOGA algorithm. The results in Fig. 5 show that the function values decrease rapidly as the number of iterations increases, and that combining the SSO and genetic algorithms with the partitioning idea can accelerate the search. We conclude from Fig. 5 that the proposed HSSOGA can obtain the optimal or near-optimal solutions within reasonable time.
5.5 HSSOGA and other algorithms

We compare the HSSOGA algorithm against two sets of benchmark methods. The first set consists of four real-coded genetic algorithms (RCGAs): WX-PM, WX-LLM, LX-LLM (Deep et al. 2012) and LX-PM (Deep and Thakur 2007). These four methods are based on two real-coded crossover operators, Weibull crossover WX and
Fig. 4 General performance of HSSOGA on large-scale multi-modal global optimization problems
LX (Deep and Thakur 2007), and two mutation operators, LLM and PM (Deep and Thakur 2007). The second set consists of five benchmark methods: the variable neighborhood search-based methods VNS-123 and VNS-3 (Dražić et al. 2008), GA, rHYB and qPSO. In Dražić et al. (2008), four variable neighborhood search methods, VNS-1, VNS-2, VNS-3 and VNS-123, were developed for the minimization of a continuous function subject to box constraints; they differ in the choice of the random distribution used in the shaking step. The description of these four methods is as follows:

– VNS-1. In the first method, a random direction is uniformly distributed on the unit ℓ∞ sphere. The random radius is chosen in such a way that the generated point is uniformly distributed in Nk, where the Nk are the neighborhood structures, k = 1, . . . , kmax.
Fig. 5 General performance of HSSOGA for minimizing the molecular potential energy function
– VNS-2. In the second method, a random direction is determined by random points uniformly distributed on the unit ℓ1 sphere.
– VNS-3. In the third method, a random direction x = (x1, x2, . . . , xn) is determined by a specially designed hypergeometric random point distribution on the unit ℓ1 sphere as follows:
  1. x1 is taken uniformly on [−1, 1]; xk is taken uniformly from [−Ak, Ak], where Ak = 1 − |x1| − · · · − |xk−1|, k = 2, . . . , n − 1; and the last coordinate xn takes the value An with random sign.
  2. The coordinates of x are randomly permuted.
– VNS-123. In the fourth method, the three previously described methods are combined in order to diversify the search.

The rHYB method (Barbosa et al. 2005) denotes a staged hybrid genetic algorithm (GA) with a reduced simplex and a fixed limit on simplex iterations, and the qPSO method (Bansal et al. 2010) is a hybrid particle swarm optimization (PSO) in which a quadratic approximation operator is hybridized with PSO.

The function E in (1) is minimized in the specified search space [0, 5]^d. The function E grows linearly with d as E*(d) = −0.0411183d (Lavor and Maculan 2004), as shown in Table 6.

5.6 Comparison results between WX-PM, LX-PM, WX-LLM, LX-LLM and HSSOGA

In this subsection, we present the comparison results between our HSSOGA algorithm and four variant genetic algorithms. We test the five comparative algorithms on different molecule sizes from dimension 20 to 200. We take the results of the other comparative algorithms from their original paper (Deep et al. 2012). In Table 7, we report the mean number of function evaluations over 30 runs and the best

Table 6  Global minimum value E* for chains of different sizes
d      E*
20     −0.822366
40     −1.644732
60     −2.467098
80     −3.289464
100    −4.111830
120    −4.934196
140    −5.756562
160    −6.578928
180    −7.401294
200    −8.223660
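The simplified potential energy model E and its linear growth can be checked numerically. The formula and constants below are our rendering of the Lavor and Maculan (2004) test function (the full expression appears earlier in the paper as Eq. (1)) and should be verified against that source; the function name is ours:

```python
import math

def molecular_energy(t):
    """Simplified molecular potential energy model (Lavor and Maculan 2004):
    torsion-angle term plus an alternating-sign Lennard-Jones-like term.
    t is the vector of torsion angles, searched over [0, 5]^d."""
    return sum(1 + math.cos(3 * ti)
               + (-1) ** i / math.sqrt(10.60099896 - 4.141720682 * math.cos(ti))
               for i, ti in enumerate(t, start=1))
```

At the all-zeros point the alternating term cancels pairwise for even d, so E = 2d; the tabulated minima match E*(d) = −0.0411183d, e.g. E*(20) = −0.822366.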
Table 7  Comparison results (mean number of function evaluations) between WX-PM, LX-PM, WX-LLM, LX-LLM and HSSOGA

D     WX-PM     LX-PM     WX-LLM    LX-LLM    HSSOGA
20    15,574    23,257    28,969    14,586    10,125.46
40    59,999    71,336    89,478    39,366    21,245.17
60    175,865   280,131   225,008   105,892   30,290.34
80    302,011   326,287   372,836   237,621   41,345.27
100   369,376   379,998   443,786   320,146   52,416.39
Table 8  Comparison results (mean number of function evaluations) between VNS-123, VNS-3, GA, rHYB and HSSOGA

D     VNS-123   VNS-3     GA        rHYB      HSSOGA
20    23,381    9887      36,626    35,836    10,125.46
40    57,681    25,723    133,581   129,611   21,245.17
60    142,882   39,315    263,266   249,963   30,290.34
80    180,999   74,328    413,948   387,787   41,345.27
100   254,899   79,263    588,827   554,026   52,416.39
120   375,970   99,778    –         –         60,416.39
140   460,519   117,391   –         –         71,691.29
160   652,916   167,972   –         –         80,826.11
180   663,722   173,513   –         –         91,921.14
200   792,537   213,718   –         –         102,129.32
results between the comparative algorithms in boldface text. The results in Table 7 show that the proposed HSSOGA algorithm succeeds in obtaining the desired objective value of each function faster than the other algorithms in all cases.

5.7 Comparison results between VNS-123, VNS-3, GA, rHYB and HSSOGA

We give another comparison between our HSSOGA algorithm and four other benchmark methods; the results are reported in Table 8. We take the results of the other comparative algorithms from their original papers (Barbosa et al. 2005; Dražić et al. 2008). We report the mean number of function evaluations over 30 runs in Table 8, with the best results among the comparative algorithms in boldface text. The results in Table 8 show that the proposed HSSOGA algorithm obtains the desired objective value for each molecule size faster than the other algorithms, except at the first dimension d = 20, where VNS-3 obtains the desired optimal value faster than the proposed algorithm.

5.8 Wilcoxon signed-rank test

Wilcoxon's test is a nonparametric procedure employed in hypothesis testing situations involving a design with two samples (Garcia et al. 2009; Sheskin 2003; Zar 1999).
Table 9  Wilcoxon test for comparison results in Table 7

Compared methods        Solution evaluations
Method 1   Method 2     R−    R+    ρ-Value     Best method
HSSOGA     WX-PM        15    0     0.043114    HSSOGA
HSSOGA     LX-PM        15    0     0.043114    HSSOGA
HSSOGA     WX-LLM       15    0     0.043114    HSSOGA
HSSOGA     LX-LLM       15    0     0.043114    HSSOGA
Table 10  Wilcoxon test for comparison results in Table 8

Compared methods        Solution evaluations
Method 1   Method 2     R−    R+    ρ-Value     Best method
HSSOGA     VNS-123      55    0     0.005062    HSSOGA
HSSOGA     VNS-3        54    1     0.006910    HSSOGA
HSSOGA     GA           15    0     0.043114    HSSOGA
HSSOGA     rHYB         15    0     0.043114    HSSOGA
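The ρ-values in Tables 9 and 10 can be reproduced with a normal-approximation Wilcoxon signed-rank test. The sketch below is a minimal implementation (no tie or zero-difference handling, which the full procedure averages out); fed the five Table 7 rows for HSSOGA versus WX-PM, it recovers R− = 15, R+ = 0 and ρ ≈ 0.043114:

```python
import math

def wilcoxon_signed_rank(a, b):
    """Two-sided Wilcoxon signed-rank test using the normal approximation.
    Simplification: assumes no zero differences and no tied |differences|
    (otherwise average ranks would be needed). Returns (R+, R-, p)."""
    diffs = [x - y for x, y in zip(a, b)]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    r_plus = r_minus = 0.0
    for rank, i in enumerate(order, start=1):
        if diffs[i] > 0:
            r_plus += rank
        else:
            r_minus += rank
    n = len(diffs)
    t = min(r_plus, r_minus)
    mu = n * (n + 1) / 4.0
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (t - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return r_plus, r_minus, p

# Table 7 data: HSSOGA vs WX-PM mean function evaluations, d = 20..100
hssoga = [10125.46, 21245.17, 30290.34, 41345.27, 52416.39]
wxpm = [15574, 59999, 175865, 302011, 369376]
```

With n = 5 and all differences of one sign, T = 0, μ = 7.5 and σ = √13.75, giving |z| ≈ 2.0226 and the ρ = 0.043114 reported in Table 9.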
It is a pairwise test that aims to detect significant differences between the behavior of two algorithms. ρ is the probability of the null hypothesis being true: ρ < 0.05 indicates a rejection of the null hypothesis, while ρ > 0.05 indicates a failure to reject it. R+ is the sum of positive ranks, while R− is the sum of negative ranks. In Tables 9 and 10, we present the results of the Wilcoxon signed-rank test for HSSOGA compared against WX-PM, LX-PM, WX-LLM, LX-LLM, VNS-123, VNS-3, rHYB and GA. We can conclude from Tables 9 and 10 that the proposed HSSOGA is significantly better than the other compared algorithms.

6 Conclusion and future work

In this article, we propose a new hybrid of the social spider algorithm and the genetic algorithm in order to minimize the energy function of a simplified model of the molecule. The problem of finding the global minimum of the molecular energy function is difficult to solve, since the number of local minima grows exponentially with the molecule size, which can trap any algorithm in local minima. The proposed algorithm is called HSSOGA. In the proposed algorithm, we applied three mechanisms. The first mechanism balances the exploration and the exploitation processes in the SSO algorithm. The second mechanism is based on the dimensionality reduction and population partitioning processes in order to increase the diversity of the search. The third mechanism escapes from local minima by applying the genetic mutation process on the whole population in order to avoid premature convergence of the solutions. The proposed algorithm has been tested on 13 large-scale unconstrained global optimization problems and compared against nine benchmark algorithms in order to verify its efficiency in minimizing the molecular potential energy function. The experimental results show that the proposed algorithm is promising and can obtain the optimal or near-optimal global minimum of the large-scale unconstrained optimization problems and the molecular energy function faster than the other comparative algorithms. In future work, we will improve the performance of the proposed algorithm so that it can solve the CEC13 and CEC14 test functions.

Acknowledgments We are grateful to the anonymous reviewers for constructive feedback and insightful suggestions which greatly improved this article. The research of the first author is supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC). The postdoctoral fellowship of the second author is supported by NSERC.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest. Ethical approval This article does not contain any studies with human participants or animals performed by any of the authors.
References

Avilés L (1997) Causes and consequences of cooperation and permanent sociality in spiders. In: Choe BC (ed) The evolution of social behavior in insects and arachnids. Cambridge University Press, Cambridge, pp 476–498
Banharnsakun A, Achalakul T, Sirinaovakul B (2011) The best-so-far selection in artificial bee colony algorithm. Appl Soft Comput 11:2888–2901
Bansal JC, Deep K, Shashi, Katiyar VK (2010) Minimization of molecular potential energy function using particle swarm optimization. Int J Appl Math Mech 6(9):1–9
Barbosa HJC, Lavor C, Raupp FM (2005) A GA-simplex hybrid algorithm for global minimization of molecular potential energy function. Ann Oper Res 138:189–202
Cuevas E, Cienfuegos M, Zaldívar D, Pérez-Cisneros M (2013) A swarm optimization algorithm inspired in the behavior of the social spider. Expert Syst Appl 40(16):6374–6384
De Jong KA (1985) Genetic algorithms: a 10 year perspective. In: International conference on genetic algorithms, pp 169–177
Deep K, Thakur M (2007a) A new mutation operator for real coded genetic algorithms. Appl Math Comput 193(1):211–230
Deep K, Thakur M (2007b) A new crossover operator for real coded genetic algorithms. Appl Math Comput 188(1):895–912
Deep K, Barak S, Katiyar VK, Nagar AK (2012) Minimization of molecular potential energy function using newly developed real coded genetic algorithms. Int J Optim Control Theor Appl (IJOCTA) 2(1):51–58
Dražić M, Lavor C, Maculan N, Mladenović N (2008) A continuous variable neighborhood search heuristic for finding the three-dimensional structure of a molecule. Eur J Oper Res 185:1265–1273
Yip EC, Powers KS, Avilés L (2008) Cooperative capture of large prey solves scaling challenge faced by spider societies. Proc Natl Acad Sci USA 105(33):11818–11822
Esmin AA, Lambert-Torres G, Alvarenga GB (2006) Hybrid evolutionary algorithm based on PSO and GA mutation. In: Proceedings of the 6th international conference on hybrid intelligent systems, pp 57–62
Floudas CA, Klepeis JL, Pardalos PM (1999) Global optimization approaches in protein folding and peptide docking. DIMACS series in discrete mathematics and theoretical computer science. American Mathematical Society, Providence
Garcia S, Fernandez A, Luengo J, Herrera F (2009) A study of statistical techniques and performance measures for genetics-based machine learning: accuracy and interpretability. Soft Comput 13:959–977
Goldberg DE (1989) Genetic algorithms in search, optimization, and machine learning. Addison-Wesley, Reading
Gandelli A, Grimaccia F, Mussetta M, Pirinoli P, Zich RE (2007) Development and validation of different hybridization strategies between GA and PSO. In: Proceedings of the IEEE congress on evolutionary computation, pp 2782–2787
Grimaldi EA, Grimaccia F, Mussetta M, Pirinoli P, Zich RE (2004) A new hybrid genetical swarm algorithm for electromagnetic optimization. In: Proceedings of the international conference on computational electromagnetics and its applications, Beijing, China, pp 157–160
Hedar A, Ali AF, Hassan T (2011) Genetic algorithm and tabu search based methods for molecular 3D-structure prediction. Numer Algebra Control Optim 1(1):191–209
Holland JH (1975) Adaptation in natural and artificial systems. University of Michigan Press, Ann Arbor
Jian M, Chen Y (2006) Introducing recombination with dynamic linkage discovery to particle swarm optimization. In: Proceedings of the genetic and evolutionary computation conference, pp 85–86
Juang CF (2004) A hybrid of genetic algorithm and particle swarm optimization for recurrent network design. IEEE Trans Syst Man Cybern Part B Cybern 34:997–1006
Kovačević-Vujčić V, Čangalović M, Dražić M, Mladenović N (2004) VNS-based heuristics for continuous global optimization. In: Hoai An LT, Tao PD (eds) Modelling, computation and optimization in information systems and management sciences. Hermes Science, Oxford, pp 215–222
Krink T, Løvbjerg M (2002) The lifecycle model: combining particle swarm optimization, genetic algorithms and hill climbers. In: Proceedings of parallel problem solving from nature, pp 621–630
Lavor C, Maculan N (2004) A function to test methods applied to global minimization of potential energy of molecules. Numer Algorithms 35:287–300
Maxence S (2010) Social organization of the colonial spider Leucauge sp. in the Neotropics: vertical stratification within colonies. J Arachnol 38:446–451
Michalewicz Z (1996) Genetic algorithms + data structures = evolution programs. Springer, Berlin
Pardalos PM, Shalloway D, Xue GL (1994) Optimization methods for computing global minima of nonconvex potential energy function. J Glob Optim 4:117–133
Pasquet A (1991) Cooperation and prey capture efficiency in a social spider, Anelosimus eximius (Araneae, Theridiidae). Ethology 90:121–133
Pogorelov A (1987) Geometry. Mir Publishers, Moscow
Robinson J, Sinton S, Rahmat-Samii Y (2002) Particle swarm, genetic algorithm, and their hybrids: optimization of a profiled corrugated horn antenna. In: Proceedings of the IEEE Antennas and Propagation Society international symposium, pp 314–317
Sheskin DJ (2003) Handbook of parametric and nonparametric statistical procedures. CRC Press, Boca Raton
Settles M, Soule T (2005) Breeding swarms: a GA/PSO hybrid. In: Proceedings of the genetic and evolutionary computation conference, pp 161–168
Wales DJ, Scheraga HA (1999) Global optimization of clusters, crystals and biomolecules. Science 285:1368–1372
Wang H, Sun H, Li C, Rahnamayan S, Jeng-shyang P (2013) Diversity enhanced particle swarm optimization with neighborhood search. Inf Sci 223:119–135
Yang WY, Cao W, Chung T-S, Morris J (2005) Applied numerical methods using MATLAB. Wiley, Hoboken
Zar JH (1999) Biostatistical analysis. Prentice Hall, Englewood Cliffs