J. Oper. Res. Soc. China DOI 10.1007/s40305-016-0149-8
A New Restarting Adaptive Trust-Region Method for Unconstrained Optimization Morteza Kimiaei1 · Susan Ghaderi2
Received: 3 August 2016 / Revised: 28 October 2016 / Accepted: 22 December 2016 © Operations Research Society of China, Periodicals Agency of Shanghai University, Science Press, and Springer-Verlag Berlin Heidelberg 2017
Abstract In this paper, we present a new adaptive trust-region method for solving nonlinear unconstrained optimization problems. More precisely, the trust-region radius is based on a nonmonotone technique that uses an adaptively chosen approximation of the Hessian. We produce a suitable trust-region radius, preserve global convergence to first-order critical points under classical assumptions, and improve the practical performance of the new algorithm compared with other existing variants. Moreover, a quadratic convergence rate is established under suitable conditions. Computational results on the CUTEst test collection of unconstrained problems are presented to show the effectiveness of the proposed algorithm compared with some existing methods.

Keywords Unconstrained optimization · Trust-region methods · Nonmonotone technique · Adaptive radius · Theoretical convergence

Mathematics Subject Classification 90-08 · 90C30 · 90C90
Corresponding author: Morteza Kimiaei, [email protected]
Susan Ghaderi, [email protected]

1 Department of Mathematics, Asadabad Branch, Islamic Azad University, Asadabad, Iran
2 Department of Mathematics, Faculty of Science, Razi University, Kermanshah, Iran
1 Introduction

This paper is concerned with the unconstrained optimization problem

$$\min\; f(x) \quad \text{s.t.}\; x \in \mathbb{R}^n, \qquad (1.1)$$
where $f : \mathbb{R}^n \to \mathbb{R}$ is a real-valued continuously differentiable function that is bounded below. We assume that $f$ is nonlinear and that its first-order information (function value and gradient) is available.

1.1 Motivation and History

Trust-region methods are among the first schemes introduced by Powell [1] in 1984 for solving nonlinear unconstrained optimization problems. It was soon realized that these methods are highly reliable, both in numerical practice and in their convergence theory, and many variants have since been suggested by many researchers, e.g., [2-11]. In the classical trust-region methods, a trial step $d_k$ is computed by minimizing a (quadratic) model of the objective function at the current point over a region around this point, i.e.,

$$\min\; m_k(d) := f_k + g_k^T d + \tfrac{1}{2} d^T B_k d \quad \text{s.t.}\; \|d\| \leqslant \Delta_k, \qquad (1.2)$$

where $\|\cdot\|$ denotes the Euclidean norm, $f_k := f(x_k)$, $g_k := \nabla f(x_k)$, $B_k$ is the exact Hessian $\nabla^2 f(x_k)$ or a symmetric approximation of it, and $\Delta_k$ stands for the trust-region radius. To decide whether to accept or reject the trial step $d_k$, the agreement between the model and the objective function must be measured by a criterion based on the actual and model reductions. In traditional monotone trust-region methods this is done by the ratio

$$r_k := \frac{f(x_k) - f(x_k + d_k)}{m_k(x_k) - m_k(x_k + d_k)}, \qquad (1.3)$$
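As an illustration of (1.2)-(1.3), the following self-contained sketch evaluates the quadratic model and the ratio $r_k$ on a 2-D example; the function, point, step, and threshold `mu_1` are hypothetical choices, not taken from the paper.

```python
# Hedged sketch of the acceptance test based on (1.2)-(1.3). The quadratic
# f, the point x_k, and the threshold mu_1 are illustrative choices only.

def model_value(f_k, g_k, B_k, d):
    """Quadratic model m_k(d) = f_k + g_k^T d + 0.5 d^T B_k d (here n = 2)."""
    g_dot_d = g_k[0] * d[0] + g_k[1] * d[1]
    Bd = [B_k[0][0] * d[0] + B_k[0][1] * d[1],
          B_k[1][0] * d[0] + B_k[1][1] * d[1]]
    return f_k + g_dot_d + 0.5 * (d[0] * Bd[0] + d[1] * Bd[1])

def ratio(f, x_k, d_k, g_k, B_k):
    """r_k = (f(x_k) - f(x_k + d_k)) / (m_k(0) - m_k(d_k)), cf. (1.3)."""
    f_k = f(x_k)
    ared = f_k - f([x_k[0] + d_k[0], x_k[1] + d_k[1]])  # actual reduction
    pred = f_k - model_value(f_k, g_k, B_k, d_k)        # predicted reduction
    return ared / pred

f = lambda x: x[0] ** 2 + 10 * x[1] ** 2
x_k = [1.0, 1.0]
g_k = [2.0, 20.0]                   # exact gradient at x_k
B_k = [[2.0, 0.0], [0.0, 20.0]]     # exact Hessian, so r_k = 1 here
d_k = [-0.1, -0.1]
r_k = ratio(f, x_k, d_k, g_k, B_k)
mu_1 = 1e-4                         # illustrative acceptance threshold
accept = r_k >= mu_1
```

With the exact Hessian the model reproduces $f$ exactly, so $r_k = 1$ and the step is accepted; with a crude $B_k$ the ratio degrades and the step may be rejected.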
where the numerator and the denominator are called the actual and the predicted reductions, respectively. The trust-region radius is expanded whenever there is good agreement between the objective function and the approximated model; otherwise it is diminished. If the sequence of objective values is monotonically decreasing, the method is called a monotone trust-region method; otherwise it is called a nonmonotone trust-region method. One of the crucial factors in trust-region methods is the radius updating, e.g., [3,5,10-14]. Classical trust-region methods cannot always produce a suitable radius, which increases the total number of iterations and function evaluations (e.g., [3,5,10-14]). Several strategies have been introduced in the literature for updating the radius and choosing the initial radius, e.g., [3,5,6,9-14]; some of them update the radius based only on current-point information (gradient and Hessian), and some are based on information from the last two iterates. The main advantage of these algorithms is their
ability to control the size of the trust-region radius, and also the fact that the iterates do not produce a very small radius near the optimizer or a very large radius far away from it, as often happens when traditional methods are applied to strongly nonlinear problems. The main disadvantage of these approaches is the possible dependence of their parameters on problem information. Many of the other mentioned methods are only suitable for small- and medium-scale unconstrained optimization, since they use the information of both the gradient and the Hessian. It is well known that far away from the optimizer a large trust-region radius should be used, while close to it a smaller one can be more efficient. On the basis of this idea and of the nonmonotone techniques introduced in [15,16], and motivated by [2], Esmaeili and Kimiaei [5] introduced a gradient-based formula for the trust-region radius $\Delta_k$ using 2-norms of gradient vectors as follows:

$$R_k = \eta_k \|g_{l(k)}\| + (1 - \eta_k) \|g_k\|, \qquad (1.4)$$

where $\eta_k \in [\eta_{\min}, \eta_{\max}]$, $\eta_{\min} \in [0, 1)$, $\eta_{\max} \in [\eta_{\min}, 1]$,

$$\|g_{l(k)}\| = \max_{0 \leqslant j \leqslant m(k)} \{\|g_{k-j}\|\}, \quad k \in \mathbb{N}_0 := \mathbb{N} \cup \{0\}, \qquad (1.5)$$

and

$$m(k) \in \begin{cases} \{0\}, & \text{if } k = 0, \\ \left[0, \min\{m(k-1)+1, N\}\right], & \text{if } k \geqslant 1, \end{cases} \qquad (1.6)$$

for which $N \geqslant 0$. They presented the adaptive radius technique as follows:

$$\Delta_{k+1} = \begin{cases} \gamma_1 \Delta_k, & \text{if } r_k < \mu_1, \\ \max\{\gamma_2 R_k, \Delta_k\}, & \text{if } r_k \in [\mu_1, \mu_2), \\ R_k, & \text{if } r_k \in [\mu_2, \mu_3), \\ \max\{\gamma_3 R_k, \Delta_k\}, & \text{if } r_k \geqslant \mu_3, \end{cases} \qquad (1.7)$$
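The nonmonotone radius (1.4)-(1.6) can be sketched as follows; here $m(k)$ is fixed to its upper bound $\min\{k, N\}$, and all numbers are hypothetical gradient norms, not data from the paper.

```python
# Sketch of the gradient-based nonmonotone radius R_k of (1.4)-(1.5); the
# memory choice of (1.6) is fixed to m(k) = min(k, N), an assumption made
# here for illustration.

def nonmonotone_radius(grad_norms, N, eta_k):
    """R_k = eta_k * ||g_{l(k)}|| + (1 - eta_k) * ||g_k||."""
    k = len(grad_norms) - 1
    m_k = min(k, N)                      # m(k), cf. (1.6)
    g_l = max(grad_norms[k - m_k:])      # ||g_{l(k)}||, cf. (1.5)
    return eta_k * g_l + (1.0 - eta_k) * grad_norms[-1]

norms = [10.0, 6.0, 8.0, 3.0]   # hypothetical ||g_0||, ..., ||g_3||
R = nonmonotone_radius(norms, N=2, eta_k=0.5)  # window max is 8.0, so R = 5.5
```

The convex combination in (1.4) blends the worst recent gradient norm with the current one, so the radius shrinks more cautiously than a purely monotone rule would.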
where $\gamma_1$, $\gamma_2$, and $\gamma_3$ are trust-region scaling constants. Formula (1.7) is based on a concept that comes from dominance in multicriteria optimization (see [17-19]): $g_{k+1}$ is said to dominate $g_{l(k)}$ whenever $\|g_{k+1}\| \leqslant \|g_{l(k)}\|$. The definition of $g_{l(k)}$ has some drawbacks:

(a) If $\|g_{k+1}\| \geqslant \|g_k\|$, then $\|g_{k+1}\|$ enters the definition of $\|g_{l(k+1)}\|$ even when $g_{k+1}$ does not dominate $g_{l(k)}$. This can produce a very large radius near the optimizer, which increases the computational cost.
(b) If the size of the memory behind $g_{l(k)}$ increases, the computational cost increases as well.
(c) The quantity $g_{l(k)}$ does not use the information of the Hessian.

1.2 Content

In this paper, we develop a new adaptive trust-region algorithm for solving (1.1), based on restarting the nonmonotone formula and using a Hessian approximation in place of the gradient norms in (1.4). We modify the nonmonotone technique and then apply it to generate the new
trust-region radius, which markedly decreases the total number of iterations and function evaluations. In contrast with previously presented adaptive trust-region algorithms, in which determining the radius depends on the costly process of storing a Hessian matrix, our algorithm determines the radius based only on 2-norm information, which is available without any extra computation. In addition, we investigate the global convergence of the new algorithm to first-order stationary points and also establish its quadratic convergence properties. To illustrate the efficiency of the new algorithm in practice, we report some numerical experiments.

The rest of this paper is organized as follows. In Sect. 2, we describe the new idea together with the structure of the algorithm. In Sect. 3, we investigate the global and quadratic convergence properties of the proposed algorithm. Numerical results are provided in Sect. 4 to show the behavior of the proposed approach on unconstrained optimization problems. Finally, some conclusions are given in Sect. 5.
2 The New Adaptive Trust-Region Method

In this section, we first introduce a new adaptive trust-region strategy that takes advantage of the gradient vector to produce the radius, and then the new algorithm is presented. Before defining the new adaptive radius strategy, let us define

$$\Phi_k^\upsilon(i) := \upsilon \Phi_k(i) + (1 - \upsilon) \varphi_k,$$

where $\upsilon \in [\upsilon_{\min}, \upsilon_{\max}]$, $\upsilon_{\min} \in [0, 1)$, $\upsilon_{\max} \in [\upsilon_{\min}, 1]$, and $\Phi_k$ is a data structure which stores pertinent information and removes unsuitable information in each iteration, as follows:

$$\Phi_k := \{\varphi_{k-j}\}_{0 \leqslant j \leqslant m(k)}, \quad k \in \mathbb{N}_0, \qquad (2.1)$$

for which $\varphi_k := \|H_k g_k\|$, $H_k$ is the inverse of $B_k$, and $m(k)$ satisfies (1.6). Note that $\varphi_k$ is computed by means of the compact limited memory Broyden-Fletcher-Goldfarb-Shanno (BFGS) formula in Sect. 4. The $i$th element of the data structure (2.1) is determined by

$$\Phi_k(i) = \begin{cases} \varphi_i, & \text{if } k < N, \\ \varphi_{k-N+1+i}, & \text{if } k \geqslant N, \end{cases} \quad \text{where } i \in \begin{cases} [0, k], & \text{if } k < N, \\ [0, N-1], & \text{if } k \geqslant N. \end{cases}$$
We now propose the adaptive radius based on the following definition:

$$\bar{\Delta}_k := \frac{\sum_{i=0}^{m(k)-1} \eta_k^{m(k)-i} \Phi_k^\upsilon(i) + \varphi_k}{\sum_{i=0}^{m(k)-1} \eta_k^{m(k)-i} + 1}, \qquad (2.2)$$

in which $m(k)$ satisfies (1.6), $\eta_k \in [\eta_{\min}, \eta_{\max}]$, $\eta_{\min} \in [0, 1)$, and $\eta_{\max} \in [\eta_{\min}, 1]$. Before presenting a restarting procedure to decrease the computational cost of the new formula, we need to define
$$N\Phi_{l(k)} := \max \Phi_k, \quad k \in \mathbb{N}_0. \qquad (2.3)$$

Let us propose a restarting procedure, similar to [5], which is applied as follows: in iteration $k+1$, if $\varphi_{k+1}$ dominates $N\Phi_{l(k)}$, then we save $\varphi_{k+1}$ as an element of $\Phi_k$; otherwise we restart $\Phi_k$. As a result, we reset $\Phi_k$ whenever the descent condition $\varphi_{k+1} \leqslant \varphi_k$ is not satisfied, which prevents producing a very small trust-region radius near the optimizer. Because the sequence $N\Phi_{l(k)}$ contains the collection of convergent subsequences, the trust-region radius decreases slowly, and this feature of $\Phi_k$ is the key to avoiding a very large trust-region radius whenever the iterates are not far away from the optimizer. Let us define $I_k$, $k \in \mathbb{N}_0$, as the index set of the subsequence generated in the $k$th iteration by the proposed restarting procedure. These subsequences are defined by a list $\{N\Phi_{l(k)}\}_{k \in I_k}$ such that if $N\Phi_{l(i)}$ belongs to $\{N\Phi_{l(k)}\}_{k \in I_k}$, then $\varphi_{k+1} \leqslant N\Phi_{l(i)}$ for all $i \in I_k$, and this process terminates when $\varphi_{k+1} > N\Phi_{l(i)}$.

In order to analyze the properties of the new sequence $\{\bar{\Delta}_k\}$ and the convergence results of the proposed method in the next section, we consider the following assumptions:

H1 The objective function $f(x)$ is twice continuously differentiable and has a lower bound on the level set $L(x_0) := \{x \in \mathbb{R}^n \mid f(x) \leqslant f(x_0)\}$ for the starting point $x_0$.

H2 The approximation Hessian matrix $B_k$ is uniformly bounded, i.e., there exists a constant $M > 0$ such that $\|B_k\| \leqslant M$ for all $k \in \mathbb{N}_0$.

The restarting procedure leads to decreasing subsequences $\{N\Phi_{l(k)}\}_{k \in I_k}$, which are studied in the next result.

Lemma 2.1 Suppose that the sequence $\{x_k\}$ is generated by the new method and (H1) holds. Then the subsequence $\{N\Phi_{l(k)}\}_{k \in I_k}$ is a decreasing subsequence of $\{N\Phi_{l(k)}\}$.

Proof See [5].

Lemma 2.2 Suppose that the sequence $\{x_k\}$ is generated by the new method and (H1) holds. Then $\{\bar{\Delta}_k\}_{k \in I_k}$ is a decreasing subsequence of $\{\bar{\Delta}_k\}$.

Proof First, we show that $\varphi_k \leqslant \bar{\Delta}_k$. By the definition of $\Phi_k$, we have $\Phi_k(i) \geqslant \varphi_k$, which implies $\Phi_k^\upsilon(i) \geqslant \varphi_k$ and leads to
$$\bar{\Delta}_k = \frac{\sum_{i=0}^{m(k)-1} \eta_k^{m(k)-i} \Phi_k^\upsilon(i) + \varphi_k}{\sum_{i=0}^{m(k)-1} \eta_k^{m(k)-i} + 1} \geqslant \frac{\sum_{i=0}^{m(k)-1} \eta_k^{m(k)-i} \varphi_k + \varphi_k}{\sum_{i=0}^{m(k)-1} \eta_k^{m(k)-i} + 1} = \varphi_k.$$
Now, we prove that the subsequence $\{\bar{\Delta}_k\}_{k \in I_k}$ is decreasing. First, by the removing procedure, we simply obtain $\varphi_{k+1} \leqslant \varphi_k$. This fact, together with the definition of $\Phi_k$, suggests that $\Phi_{k+1}(i) \leqslant \Phi_k(i)$ for all $i = 0, \cdots, m(k)$, and consequently

$$\Phi_{k+1}^\upsilon(i) \leqslant \Phi_k^\upsilon(i), \quad \forall\, i = 0, \cdots, m(k).$$

Based on this information, for $k \in I_k$, we have

$$\begin{aligned}
\bar{\Delta}_{k+1} &= \frac{\sum_{i=0}^{m(k)} \eta_k^{m(k)-i+1} \Phi_{k+1}^\upsilon(i) + \varphi_{k+1}}{\sum_{i=0}^{m(k)} \eta_k^{m(k)-i+1} + 1}
\leqslant \frac{\sum_{i=0}^{m(k)} \eta_k^{m(k)-i+1} \Phi_k^\upsilon(i) + \varphi_{k+1}}{\sum_{i=0}^{m(k)} \eta_k^{m(k)-i+1} + 1} \\
&= \frac{\sum_{i=0}^{m(k)-1} \eta_k^{m(k)-i+1} \Phi_k^\upsilon(i) + \eta_k \Phi_k^\upsilon(m(k)) + \varphi_{k+1}}{\sum_{i=0}^{m(k)} \eta_k^{m(k)-i+1} + 1}
= \frac{\eta_k \left( \sum_{i=0}^{m(k)-1} \eta_k^{m(k)-i} \Phi_k^\upsilon(i) + \varphi_k \right) + \varphi_{k+1}}{\sum_{i=0}^{m(k)} \eta_k^{m(k)-i+1} + 1} \\
&= \frac{\eta_k \bar{\Delta}_k \left( \sum_{i=0}^{m(k)-1} \eta_k^{m(k)-i} + 1 \right) + \varphi_{k+1}}{\sum_{i=0}^{m(k)} \eta_k^{m(k)-i+1} + 1}
= \frac{\bar{\Delta}_k \sum_{i=0}^{m(k)} \eta_k^{m(k)-i+1} + \varphi_{k+1}}{\sum_{i=0}^{m(k)} \eta_k^{m(k)-i+1} + 1} \\
&\leqslant \frac{\bar{\Delta}_k \sum_{i=0}^{m(k)} \eta_k^{m(k)-i+1} + \bar{\Delta}_k}{\sum_{i=0}^{m(k)} \eta_k^{m(k)-i+1} + 1} = \bar{\Delta}_k,
\end{aligned}$$

where the second equality uses $\Phi_k^\upsilon(m(k)) = \varphi_k$, the third uses (2.2), and the last inequality uses $\varphi_{k+1} \leqslant \varphi_k \leqslant \bar{\Delta}_k$.
This inequality shows that $\{\bar{\Delta}_k\}_{k \in I_k}$ is a decreasing subsequence. We define the new adaptive radius by

$$\Delta_{k+1} = \begin{cases} \gamma_2 \Delta_k, & \text{if } r_k \in [\mu_1, \mu_2), \\ \max\{\bar{\Delta}_k, \Delta_k\}, & \text{if } r_k \in [\mu_2, \mu_3), \\ \max\{\gamma_3 \Delta_k, \bar{\Delta}_k\}, & \text{if } r_k \geqslant \mu_3, \end{cases} \qquad (2.4)$$

where $\gamma_2$, $\gamma_3$, $\mu_1$, $\mu_2$, and $\mu_3$ are the constants of Algorithm 1 below. The proposed adaptive radius has several benefits; we list some of them as follows:
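A minimal sketch of the machinery of (2.2)-(2.4): the list `Phi` plays the role of $\Phi_k$, `bar_delta` evaluates (2.2), `restart_or_store` mimics the restarting test built on (2.3), and `update_radius` mirrors (2.4) for accepted steps. Parameter defaults are illustrative, and the inner-cycle shrink of Step 9 of the algorithm is omitted.

```python
# Hedged sketch of the restarting nonmonotone radius. Phi stores the values
# phi_{k-j} = ||H_{k-j} g_{k-j}||; all constants are illustrative.

def bar_delta(Phi, phi_k, eta, upsilon):
    """Weighted average (2.2) with Phi^v(i) = v*Phi[i] + (1 - v)*phi_k."""
    m = len(Phi)                       # plays the role of m(k)
    num = sum(eta ** (m - i) * (upsilon * Phi[i] + (1 - upsilon) * phi_k)
              for i in range(m)) + phi_k
    den = sum(eta ** (m - i) for i in range(m)) + 1.0
    return num / den

def restart_or_store(Phi, phi_next, N):
    """Restart Phi when phi_next exceeds max(Phi), cf. (2.3); else store it."""
    if Phi and phi_next > max(Phi):
        return [phi_next]              # restart: forget the old history
    return (Phi + [phi_next])[-N:]     # keep at most N entries, cf. (1.6)

def update_radius(delta, bar_d, r, mu2=0.2, mu3=0.8, gamma2=0.5, gamma3=2.0):
    """Radius rule (2.4) for an accepted step (r >= mu_1)."""
    if r < mu2:
        return gamma2 * delta
    if r < mu3:
        return max(bar_d, delta)
    return max(gamma3 * delta, bar_d)
```

Note that `bar_delta` always returns a value at least `phi_k` when the stored entries dominate it, in line with Lemma 2.2.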
• In each iteration, a good value of $\varphi_k$ is stored and has the main effect on the sequence $\{\bar{\Delta}_k\}_{k \in I_k}$.
• For $i = 0, \cdots, m(k)-1$, $\Phi_k^\upsilon(i)$ is a convex combination of $\Phi_k(i)$ and $\varphi_k$ with impact factor $\eta_k^{m(k)-i}$; with increasing $i$, the effect of $\Phi_k^\upsilon(i)$ decreases.
• Due to the decreasing subsequences of $\{\bar{\Delta}_k\}$, the radius $\Delta_k$ does not stay too large, which prevents an increase in the number of subproblems to be solved.

Now, we can give our new adaptive trust-region algorithm as follows:

Algorithm 1: ATRN (adaptive trust-region algorithm)
Input: $x_0 \in \mathbb{R}^n$; $B_0, H_0 \in \mathbb{R}^{n \times n}$; $k_{\max}$; $\eta_{\min} \in (0, 1]$, $\eta_{\max} \in [\eta_{\min}, 1]$, $\eta_0 \in [\eta_{\min}, \eta_{\max}]$; $\upsilon_{\min} \in (0, 1]$, $\upsilon_{\max} \in [\upsilon_{\min}, 1]$, $\upsilon \in [\upsilon_{\min}, \upsilon_{\max}]$; $0 < \mu_1 \leqslant \mu_2 \leqslant \mu_3 < 1$; $0 < \gamma_1 \leqslant \gamma_2 < 1$, $\gamma_3 \geqslant 1$; $N > 0$; $\varepsilon > 0$.
Output: $x_b$, $f_b$.
Step 1: begin
Step 2:   $k = 0$; $m(0) = 0$; $\varphi_0 = \|g_0\|$; $\Phi_0 = \{\varphi_0\}$; $\Phi_0^\upsilon = \Phi_0$; $N\Phi_{l(0)} = \varphi_0$; $\bar{\Delta}_0 = N\Phi_{l(0)}$; $\Delta_0 = \bar{\Delta}_0$;
Step 3:   while ($\|g_k\| \geqslant \varepsilon$ and $k \leqslant k_{\max}$) do
Step 4:     $r_k = 0$;
Step 5:     while $r_k < \mu_1$ do
Step 6:       specify the trial step $d_k$ by solving the subproblem (1.2);
Step 7:       determine the trust-region ratio $r_k$ using (1.3);
Step 8:       if $r_k < \mu_1$ then
Step 9:         update the trust-region radius by $\Delta_k = \gamma_1 \Delta_k$;
Step 10:      end
Step 11:    end
Step 12:    $x_{k+1} = x_k + d_k$; update $H_{k+1}$ and $B_{k+1}$; compute $\varphi_{k+1} = \|H_{k+1} g_{k+1}\|$;
Step 13:    if $\varphi_{k+1} > N\Phi_{l(k)}$ then
Step 14:      $m(k+1) = 1$; $N\Phi_{l(k+1)} = \varphi_{k+1}$; $\bar{\Delta}_{k+1} = N\Phi_{l(k+1)}$;
Step 15:    else
Step 16:      find $m(k+1)$ by (1.6) and calculate $N\Phi_{l(k+1)}$ using (2.3);
Step 17:      compute $\bar{\Delta}_{k+1}$ using (2.2) and choose $\eta_{k+1} \in (\eta_{\min}, \eta_{\max})$;
Step 18:    end
Step 19:    update $\Delta_{k+1}$ using (2.4); $k \leftarrow k + 1$;
Step 20:  end
Step 21:  $x_b = x_k$; $f_b = f_k$;
Step 22: end

In Algorithm 1, Steps 13-15 are called the restarting procedure. In addition, the loop from Step 3 to Step 20 is called the outer cycle, and the loop from Step 5 to Step 11 is called the inner cycle.
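The outer/inner cycles of Algorithm 1 can be sketched as follows on a 2-D quadratic. This is structure only: the subproblem is solved merely to Cauchy-point accuracy, and the nonmonotone radius $\bar{\Delta}_k$ is replaced by the stand-in $\|g_k\|$, so it is not a faithful reimplementation of ATRN.

```python
# Structural sketch of Algorithm 1 (ATRN) on a 2-D quadratic; the subproblem
# solver and radius rule are deliberately simplified.
import math

def cauchy_step(g, B, delta):
    """Cauchy point for min g^T d + 0.5 d^T B d, ||d|| <= delta (n = 2)."""
    gnorm = math.hypot(g[0], g[1])
    Bg = [B[0][0]*g[0] + B[0][1]*g[1], B[1][0]*g[0] + B[1][1]*g[1]]
    gBg = g[0]*Bg[0] + g[1]*Bg[1]
    tau = 1.0 if gBg <= 0 else min(1.0, gnorm**3 / (delta * gBg))
    s = -tau * delta / gnorm
    return [s*g[0], s*g[1]]

def atrn_sketch(f, grad, B, x, delta, mu1=1e-4, mu3=0.8,
                gamma1=0.25, gamma3=2.0, tol=1e-8, kmax=500):
    for _ in range(kmax):
        g = grad(x)
        gnorm = math.hypot(g[0], g[1])
        if gnorm <= tol:
            break
        while True:                       # inner cycle: shrink until accepted
            d = cauchy_step(g, B, delta)
            xt = [x[0]+d[0], x[1]+d[1]]
            pred = -(g[0]*d[0] + g[1]*d[1]) - 0.5*(
                d[0]*(B[0][0]*d[0]+B[0][1]*d[1]) +
                d[1]*(B[1][0]*d[0]+B[1][1]*d[1]))
            r = (f(x) - f(xt)) / pred
            if r >= mu1:
                break
            delta *= gamma1               # Step 9 of Algorithm 1
        x = xt
        if r >= mu3:                      # very successful iteration:
            delta = max(gamma3*delta, gnorm)  # ||g_k|| stands in for bar_Delta_k
    return x

f = lambda x: x[0]**2 + 10*x[1]**2
grad = lambda x: [2*x[0], 20*x[1]]
B = [[2.0, 0.0], [0.0, 20.0]]
x_star = atrn_sketch(f, grad, B, [1.0, 1.0], delta=1.0)
```

On this convex quadratic the model is exact, every trial step is accepted, and the iterates converge to the minimizer at the origin.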
3 Convergence Analysis In this section, we give convergence analysis of ATRN.
Remark 3.1 If the objective function $f(x)$ is twice continuously differentiable and the level set $L(x_0)$ is bounded, then (H1) implies that $\nabla^2 f(x)$ is uniformly continuous and bounded above on an open bounded convex set $\Omega$ containing $L(x_0)$; i.e., there exists a constant $L_2 > 0$ such that $\|\nabla^2 f(x)\| \leqslant L_2$ for all $x \in \Omega$. Therefore, using the mean-value theorem, one can conclude that

$$\|g(x) - g(y)\| \leqslant L_2 \|x - y\|, \quad \forall\, x, y \in \Omega,$$

i.e., the gradient $g$ is Lipschitz continuous on the open bounded convex set $\Omega$.

Remark 3.2 To establish the global convergence of ATRN, we assume that the reduction of the model $m_k$ is at least as much as a fraction of that obtained at the Cauchy point, i.e., there exists a constant $\beta \in (0, 1)$ such that, for all $k$,

$$m_k(x_k) - m_k(x_k + d_k) \geqslant \beta \|g_k\| \min\left\{\Delta_k, \frac{\|g_k\|}{\|B_k\|}\right\}. \qquad (3.1)$$

This inequality is called the sufficient reduction condition and has been investigated by many authors in the context of inexact methods for solving the subproblem (1.2); see, e.g., [10,20,21]. Formula (3.1) implies that $d_k \neq 0$ whenever $g_k \neq 0$.

Lemma 3.3 ([5], Lemma 1) Suppose that (H1) and (H2) hold and the sequence $\{x_k\}$ is generated by Algorithm 1. Then we have

$$\left| f(x_k) - f(x_k + d_k) - \left( m_k(x_k) - m_k(x_k + d_k) \right) \right| \leqslant \omega \|d_k\|^2,$$

where $\omega := L_2 + M$.

The following lemma shows that $\{N\Phi_{l(k)}\}_{k \in I_k}$ is a convergent subsequence.

Lemma 3.4 ([5], Lemma 3) Suppose that (H1) and (H2) hold and there exists a constant $\kappa > 0$ such that $\kappa \|g_k\| \geqslant \|d_k\|$. Suppose, furthermore, that the sequence $\{x_k\}$ is generated by ATRN. Then we have

$$\lim_{k \to \infty} N\Phi_{l(k)} = \lim_{k \to \infty} \varphi_k.$$

Corollary 3.5 Suppose that the sequence $\{x_k\}$ is generated by ATRN. Then

$$\lim_{k \to \infty} \bar{\Delta}_k = \lim_{k \to \infty} \varphi_k.$$
Proof From (2.1) and (2.3), we have $\Phi_k^\upsilon(i) \leqslant N\Phi_{l(k)}$ and $\varphi_k \leqslant N\Phi_{l(k)}$ for $k \in I_k$; then

$$\lim_{k \to \infty} \bar{\Delta}_k = \lim_{k \to \infty} \frac{\sum_{i=0}^{m(k)-1} \eta_k^{m(k)-i} \Phi_k^\upsilon(i) + \varphi_k}{\sum_{i=0}^{m(k)-1} \eta_k^{m(k)-i} + 1} \leqslant \lim_{k \to \infty} \frac{\sum_{i=0}^{m(k)-1} \eta_k^{m(k)-i} N\Phi_{l(k)} + N\Phi_{l(k)}}{\sum_{i=0}^{m(k)-1} \eta_k^{m(k)-i} + 1} = \lim_{k \to \infty} N\Phi_{l(k)}$$

for each $k \in I_k$. Therefore, using $\varphi_k \leqslant \bar{\Delta}_k$ for each $k \in I_k$ and Lemma 3.4, we conclude that

$$\lim_{k \to \infty} \bar{\Delta}_k = \lim_{k \to \infty} \varphi_k.$$
If the current iterate is not first-order critical and the trust-region radius is small enough, we now show that the $(k+1)$th iteration must be successful and the trust-region radius must increase.

Lemma 3.6 Suppose that (H1) and (H2) hold. Also, suppose that $g_k \neq 0$ and

$$\Delta_k \leqslant \frac{\beta \|g_k\|}{\omega} (1 - \mu_3). \qquad (3.2)$$

Then iteration $k$ is very successful, i.e., $r_k \geqslant \mu_3$, and $\Delta_{k+1} \geqslant \Delta_k$.

Proof Since $\beta \in (0, 1)$ and $1 - \mu_3 \in (0, 1)$, we have $\beta(1 - \mu_3) \in (0, 1)$. Hence (3.2) implies

$$\Delta_k \leqslant \frac{\|g_k\|}{\omega} \leqslant \frac{\|g_k\|}{\|B_k\|}.$$

This fact, along with (3.1), implies for all $k \in \mathbb{N}_0$ that

$$m_k(x_k) - m_k(x_k + d_k) \geqslant \beta \|g_k\| \Delta_k,$$

and so

$$1 - r_k = \frac{f(x_k + d_k) - m_k(x_k + d_k)}{m_k(x_k) - m_k(x_k + d_k)} \leqslant \frac{\omega \|d_k\|^2}{\beta \|g_k\| \Delta_k} \leqslant \frac{\omega \Delta_k^2}{\beta \|g_k\| \Delta_k} = \frac{\omega \Delta_k}{\beta \|g_k\|} \leqslant 1 - \mu_3.$$

Therefore $r_k \geqslant \mu_3$ and $\Delta_{k+1} = \max\{\gamma_3 \Delta_k, \bar{\Delta}_k\} \geqslant \Delta_k$.
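Lemma 3.6 can be illustrated numerically in one dimension: shrinking the radius drives the ratio $r_k$ toward 1, so the iteration becomes very successful. The test function and constants below are illustrative only.

```python
# Numerical illustration of Lemma 3.6 in 1-D: small radius => r_k close to 1.
import math

f = lambda x: math.exp(x) - 2.0 * x           # smooth, nonquadratic
x_k, g_k, B_k = 0.0, -1.0, 1.0                # f'(0) = -1, f''(0) = 1

def ratio(delta):
    # 1-D model minimizer within |d| <= delta
    d = -math.copysign(min(delta, abs(g_k) / B_k), g_k)
    ared = f(x_k) - f(x_k + d)                # actual reduction
    pred = -(g_k * d + 0.5 * B_k * d * d)     # predicted reduction
    return ared / pred

r_small, r_large = ratio(1e-3), ratio(5.0)
# r_small is much closer to 1 than r_large, as (3.2) predicts.
```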
We first show that the radius cannot become too small as long as a first-order critical point is not approached, which is crucial for the progress of the algorithm. Then we prove the crucial result that, if the gradient norms stay bounded away from zero, the number of successful iterations must be finite.

Lemma 3.7 Suppose that (H1) and (H2) hold. If $\|g_k\| \geqslant \varepsilon$ for all $k$, then
1. $\liminf_{k \to \infty} \Delta_k > 0$;
2. there can only be finitely many successful iterations.

Proof (1) By contradiction, assume that

$$\lim_{k \to \infty} \Delta_k = 0. \qquad (3.3)$$

If there is a subsequence $\{g_k\}_{k \in J}$ such that $\|g_k\| \leqslant M \Delta_k$ for all $k \in J$, then we obtain

$$\Delta_k \geqslant \frac{\|g_k\|}{M} \geqslant \frac{\varepsilon}{M}, \quad \forall\, k \in J,$$

which contradicts (3.3). Hence we may suppose that there is a positive index $k_0$ such that

$$\|g_k\| \geqslant M \Delta_k, \quad \forall\, k \geqslant k_0. \qquad (3.4)$$

Substituting (3.4) into (3.1), for all $k \geqslant k_0$ we get $m_k(x_k) - m_k(x_k + d_k) \geqslant \beta \varepsilon \Delta_k$. Since $f(x_k) = m_k(x_k)$,

$$1 - r_k = \frac{f(x_k + d_k) - m_k(x_k + d_k)}{m_k(x_k) - m_k(x_k + d_k)} \leqslant \frac{\omega \|d_k\|^2}{\beta \varepsilon \Delta_k} \leqslant \frac{\omega \Delta_k^2}{\beta \varepsilon \Delta_k} = \frac{\omega \Delta_k}{\beta \varepsilon}.$$

Now we have the following two cases:

(a) If $\Delta_k \leqslant \frac{\beta \varepsilon}{\omega}(1 - \mu_3)$ for all sufficiently large $k$, then $1 - r_k \leqslant 1 - \mu_3$, i.e., $r_k \geqslant \mu_3$. Hence

$$\Delta_{k+1} = \max\{\gamma_3 \Delta_k, \bar{\Delta}_k\} \geqslant \gamma_3 \Delta_k \geqslant \Delta_k > 0,$$

so $\{\Delta_k\}$ is eventually nondecreasing, which contradicts (3.3).

(b) If $\Delta_k > \frac{\beta \varepsilon}{\omega}(1 - \mu_3)$, the update rules for the radius imply

$$\Delta_{k+1} \geqslant \gamma_1 \Delta_k > \gamma_1 \frac{\beta \varepsilon}{\omega}(1 - \mu_3) > 0,$$

which is a contradiction to (3.3).

(2) Suppose, for the purpose of obtaining a contradiction, that there are infinitely many successful iterations, and let $\mathcal{K} := \{k_i\}$ denote their index set, with $|\mathcal{K}| = \infty$. From item (1) there exists a constant $\tilde{\Delta} > 0$ such that $\Delta_k \geqslant \tilde{\Delta}$ for all $k \in \mathcal{K}$. Hence, for all $k \in \mathcal{K}$, we have

$$f_k - f(x_{k+1}) \geqslant \mu_1 \left( m_k(x_k) - m_k(x_k + d_k) \right) \geqslant \mu_1 \beta \|g_k\| \min\{\|g_k\|/M, \Delta_k\} \geqslant \mu_1 \beta \varepsilon \min\{\varepsilon/M, \tilde{\Delta}\} =: \delta > 0.$$

Hence

$$f_0 - f(x_{k+1}) = \sum_{j=0,\, j \in \mathcal{K}}^{k} (f_j - f_{j+1}) \geqslant \zeta_k \delta,$$

where $\zeta_k := |\{1, 2, \cdots, k\} \cap \mathcal{K}|$. Since $|\mathcal{K}| = \infty$, we have $\lim_{k \to \infty} \zeta_k = \infty$, and therefore $f_0 - f(x_{k+1})$ is unbounded above, which contradicts (H1). Our initial assumption must then be false: the set $\mathcal{K}$ of successful iterations is finite.
We now establish the criticality of the limit point of the sequence of iterates when there are only finitely many successful iterations.

Lemma 3.8 Suppose that (H1) and (H2) hold. If there are only finitely many successful iterations, then $x_k = x_*$ for all sufficiently large $k$ and $g(x_*) = 0$.

Proof Suppose that $m$ is the index of the last successful iterate. Then, for all $i > 0$, we have $x_* = x_{m+1} = x_{m+i}$, so that $r_k < \mu_1$ for all $k > m$. Step 9 of Algorithm 1, applied at all these unsuccessful iterations, implies

$$\lim_{k \to \infty} \Delta_k = 0. \qquad (3.5)$$

For the purpose of establishing a contradiction, assume that $\|g(x_{m+1})\| \geqslant \varepsilon$ for some $\varepsilon > 0$. Item (1) of Lemma 3.7 then implies that (3.5) is impossible, and we deduce that $g(x_{m+1}) = 0$, which gives the result.

It therefore only remains to consider the case where there are infinitely many successful iterations.

Lemma 3.9 Suppose that (H1) and (H2) hold. If ATRN has infinitely many successful iterations, then $\liminf_{k \to \infty} \|g_k\| = 0$.

Proof By contradiction, suppose that $\liminf_{k \to \infty} \|g_k\| = 0$ does not hold. Hence we can assume that there exists a constant $\varepsilon > 0$ such that $\|g_k\| \geqslant \varepsilon$ for all $k$. Let us define $\mathcal{K}_1 := \{k \mid r_k \geqslant \mu_1\}$ with $|\mathcal{K}_1| = \infty$, and $\mathcal{K}_2 := \{k \in \mathcal{K}_1 \mid \Delta_k \geqslant \|g_k\|/M\}$. By (3.1), for $k \in \mathcal{K}_2$,

$$f_k - f_{k+1} \geqslant \mu_1 \left( m_k(x_k) - m_k(x_k + d_k) \right) \geqslant \mu_1 \frac{\beta \|g_k\|^2}{M} \geqslant \sigma_1 \varepsilon^2,$$

where $\sigma_1 := \mu_1 \beta / M$. By (H1), the sequence $\{f_k\}$ is convergent, which implies

$$\sum_{k \in \mathcal{K}_2} \sigma_1 \varepsilon^2 \leqslant \sum_{k \in \mathcal{K}_2} (f_k - f_{k+1}) \leqslant \sum_{k} (f_k - f_{k+1}) < \infty.$$

Therefore $\mathcal{K}_2$ must be finite. As a result, there exists $k_0 \in \mathbb{N}_0$ such that $\Delta_k \leqslant \|g_k\|/M$ for all $k \geqslant k_0$ with $k \in \mathcal{K}_1$. Setting $\mathcal{K}_3 := \{k \geqslant k_0 \mid \Delta_k \leqslant \|g_k\|/M,\ k \in \mathcal{K}_1\}$, for $k \in \mathcal{K}_3$ we have

$$f_k - f_{k+1} \geqslant \mu_1 \left( m_k(x_k) - m_k(x_k + d_k) \right) \geqslant \mu_1 \beta \varepsilon \Delta_k = \sigma_2 \Delta_k,$$

where $\sigma_2 := \mu_1 \beta \varepsilon$. By (H1), the sequence $\{f_k\}$ is convergent, which leads to

$$\sum_{k \in \mathcal{K}_3} \Delta_k \leqslant \frac{1}{\sigma_2} \sum_{k \in \mathcal{K}_3} (f_k - f_{k+1}) \leqslant \frac{1}{\sigma_2} \sum_{k} (f_k - f_{k+1}) < \infty.$$

This leads to

$$\lim_{k \to \infty,\, k \in \mathcal{K}_3} \Delta_k = 0,$$

which contradicts item (1) of Lemma 3.7. So the result is valid.

Let us define $\lambda_{\min}^{H_k}$ and $\lambda_{\max}^{H_k}$ as the smallest and largest eigenvalues of $H_k$, respectively.
Lemma 3.10 Suppose that (H2) holds, the sequence $\{x_k\}$ is generated by ATRN, and $d_k$ is a solution of the subproblem (1.2). Then we have

$$m_k(x_k) - m_k(x_k + d_k) \geqslant \theta \|g_k\|^2, \qquad (3.6)$$

where $\theta := \beta \min\left\{\lambda_{\min}^{H_k}, \frac{1}{M}\right\}$.

Proof We consider the following two cases:

(i) If the restarting procedure is not activated in iteration $k$, then, for each $k \in I_k$, using (H2), (2.4), and $\bar{\Delta}_k \geqslant \varphi_k$, we have

$$m_k(x_k) - m_k(x_k + d_k) \geqslant \beta \|g_k\| \min\left\{\Delta_k, \frac{\|g_k\|}{\|B_k\|}\right\} \geqslant \beta \|g_k\| \min\left\{\varphi_k, \frac{\|g_k\|}{M}\right\} \geqslant \beta \|g_k\| \min\left\{\lambda_{\min}^{H_k} \|g_k\|, \frac{\|g_k\|}{M}\right\} = \theta \|g_k\|^2,$$

since $\varphi_k = \|H_k g_k\| \geqslant \lambda_{\min}^{H_k} \|g_k\|$.

(ii) If the restarting procedure is activated in iteration $k$, then $\bar{\Delta}_k = \varphi_k$, and by (H2) the same chain of inequalities gives

$$m_k(x_k) - m_k(x_k + d_k) \geqslant \beta \|g_k\| \min\left\{\varphi_k, \frac{\|g_k\|}{M}\right\} \geqslant \beta \|g_k\| \min\left\{\lambda_{\min}^{H_k} \|g_k\|, \frac{\|g_k\|}{M}\right\} = \theta \|g_k\|^2,$$

which gives the result.
We now prove the global convergence of the sequence $\{x_k\}$ generated by ATRN.

Theorem 3.11 Suppose that (H1) and (H2) hold and that $\{x_k\}$ is generated by ATRN. Then

$$\lim_{k \to \infty} \|g_k\| = 0.$$

Proof By contradiction, assume there is a particular positive index $l$ with $\|g_l\| = \varepsilon_1 \neq 0$. Define the scalar $\delta_l := \|g_l\|/2$, and let $L_2 > 1$ be the Lipschitz constant of $g(x)$ over $L(x_0)$. Consider the $\ell_2$-ball $B_\rho(x_l)$ of radius $\rho > 0$ around the given point $x_l$, with $\rho = \delta_l / L_2$. Then, for all $x \in B_\rho(x_l) \cap L(x_0)$,

$$\|g(x)\| \geqslant \|g_l\| - \|g(x) - g_l\| \geqslant \|g_l\| - L_2 \|x - x_l\| \geqslant \delta_l.$$

Hence, if $\{x_k\}_{k \geqslant l} \subset B_\rho(x_l) \cap L(x_0)$, then $\|g(x_k)\| \geqslant \delta_l > 0$ for all $k \geqslant l$, which contradicts Lemma 3.9. Therefore the sequence $\{x_k\}$ eventually leaves $B_\rho(x_l) \cap L(x_0)$. Suppose that $x_{m+1}$ ($m \geqslant l$) is the first iterate after $x_l$ outside $B_\rho(x_l) \cap L(x_0)$, and define

$$\mathcal{K} := \{k \mid l \leqslant k < m+1 \ \text{s.t.}\ x_k \neq x_{k+1}\}.$$

It is clear that $\|g_k\| \geqslant \delta_l$ for all $k \in \mathcal{K}$. We therefore get

$$f_l - f_{m+1} = \sum_{k \in \mathcal{K}} (f_k - f_{k+1}) \geqslant \sum_{k \in \mathcal{K}} \mu_1 \left[ m_k(x_k) - m_k(x_k + d_k) \right] \geqslant \mu_1 \theta \sum_{k \in \mathcal{K}} \|g_k\|^2 \geqslant \mu_1 \theta \delta_l^2.$$

Since $\{f_k\}$ is decreasing and bounded below, $f_l - f_{m+1} \to 0$; taking limits on both sides of this inequality as $l \to \infty$ gives

$$\lim_{l \to \infty} \delta_l = \lim_{l \to \infty} \|g_l\|/2 = 0,$$

which implies $\lim_{l \to \infty} \|g_l\| = 0$. This contradicts $\|g_l\| = \varepsilon_1 \neq 0$, and the result follows.
The next theorem shows that ATRN reduces to the Newton method, so that the sequence $\{x_k\}$ generated by this scheme can attain a quadratic convergence rate under suitable conditions.

Theorem 3.12 Suppose that (H1) and (H2) hold, and assume that the sequence $\{x_k\}$ generated by ATRN converges to $x_*$. Then, for sufficiently large $k$, we have $x_{k+1} = x_k + d_k$, where $d_k$ is a solution of (1.2). Furthermore, if $B_k := \nabla^2 f(x_k)$ and $d_k = -B_k^{-1} g_k$, then the sequence $\{x_k\}$ converges to $x_*$ quadratically.

Proof Let $d_k$ be a solution of (1.2); we first show that $x_{k+1} = x_k + d_k$ for sufficiently large $k$. The fact that

$$\varphi_k = \|H_k g_k\| \leqslant \lambda_{\max}^{H_k} \|g_k\|,$$

together with Corollary 3.5 and Theorem 3.11, implies $\bar{\Delta}_k \to 0$ as $k \to \infty$, which leads to

$$\|d_k\| \leqslant \Delta_k \leqslant \max\{\gamma_3 \Delta_{k-1}, \bar{\Delta}_{k-1}\} \to 0, \quad \text{as } k \to \infty, \qquad (3.7)$$

because $d_k$ is feasible for the subproblem (1.2). Since ATRN has not stopped, we have $\|g_k\| \geqslant \varepsilon$. This, Lemma 3.3, Lemma 3.10, and (3.7) imply that

$$|r_k - 1| = \left| \frac{m_k(x_k + d_k) - f(x_k + d_k)}{m_k(x_k) - m_k(x_k + d_k)} \right| \leqslant \frac{\omega \|d_k\|^2}{\theta \|g_k\|^2} \leqslant \frac{\omega \|d_k\|^2}{\theta \varepsilon^2} \to 0, \quad \text{as } k \to \infty.$$

Hence, for sufficiently large $k$, we have $r_k \geqslant \mu_1$, and consequently the trial point $d_k$ is accepted by ATRN, i.e., $x_{k+1} = x_k + d_k$. Finally, with $B_k = \nabla^2 f(x_k)$ and $d_k = -B_k^{-1} g_k$, ATRN reduces to the Newton method, which converges quadratically [21].
4 Preliminary Numerical Experiments In this section, we report numerical results for Algorithm 1 proposed in Sect. 2 for solving unconstrained optimization problems. In our experiments, we compare Algorithm 1 with some state-of-the-art algorithms. In detail, we consider
• TTR: the traditional trust-region algorithm;
• ATRS: the adaptive trust-region algorithm of Shi and Guo [10];
• ATRG: the adaptive trust-region algorithm of Esmaeili and Kimiaei [5].

In the experiments we used 119 test problems from the CUTEst test collection [22], with dimensions from 2 to 10 000. All of the codes are written in MATLAB using the same subroutines, and they were tested on a PC with a 2.7-GHz Pentium(R) Dual-Core CPU and 1 GB of memory under Ubuntu Linux 10.04. The initial points are the standard ones of CUTEst. To solve the quadratic subproblem (1.2) in all algorithms, we use the Steihaug-Toint scheme [23, Chap. 7, p. 205], which is terminated as soon as

$$\|\nabla m_k(x_k + d)\| \leqslant \min\left\{0.01, \|\nabla m_k(x_k)\|^{1/2}\right\} \|\nabla m_k(x_k)\| \quad \text{or} \quad \|d\| = \Delta_k.$$

In our experiments, the algorithms are stopped whenever the total number of iterations exceeds 20 000 or

$$\|g_k\| \leqslant 10^{-6} \sqrt{n}. \qquad (4.1)$$

All of the algorithms failed on CURLY10, DJTL, EIGENALS, NONCVXU2, and PENALTY3, so we do not consider these problems in Table 1. Since the computation of $\varphi_k$ is expensive for large-scale problems, we use the compact limited memory BFGS formula proposed by Byrd et al. [24] to update the matrix $B_k$, and then compute $\varphi_k$ by Algorithm QN in [25]. In all algorithms, the matrix $B_k$ is updated by the compact limited memory BFGS formula

$$B_k = B_k^{(0)} - \begin{bmatrix} Y_k & B_k^{(0)} S_k \end{bmatrix} \begin{bmatrix} -D_k & L_k^T \\ L_k & S_k^T B_k^{(0)} S_k \end{bmatrix}^{-1} \begin{bmatrix} Y_k^T \\ S_k^T B_k^{(0)} \end{bmatrix},$$

where $B_k^{(0)} = \lambda_k I$ for some positive scalar $\lambda_k$, and $S_k$, $Y_k$, $D_k$, and $L_k$ are defined as

$$S_k = [s_{k-m}, \cdots, s_{k-1}], \quad Y_k = [y_{k-m}, \cdots, y_{k-1}],$$
$$D_k = \operatorname{diag}\left(s_{k-m}^T y_{k-m}, \cdots, s_{k-1}^T y_{k-1}\right), \quad (L_k)_{i,j} = \begin{cases} s_{k-m+i-1}^T y_{k-m+j-1}, & \text{if } i > j, \\ 0, & \text{otherwise}, \end{cases}$$

where $s_k = x_{k+1} - x_k$, $y_k = g_{k+1} - g_k$, and $m = \min\{k, m_1\}$. In our implementation, we use

$$\lambda_k := \frac{\|y_{k-1}\|^2}{y_{k-1}^T s_{k-1}},$$

as suggested by Shanno and Phua [26]. However, we do not update $B_k$ whenever the curvature condition $s_{k_i}^T y_{k_i} > 0$, $i = 1, \cdots, m$, does not hold (cf. [24]).
Table 1 Numerical results for adaptive trust-region methods

Problem name   Dim     ATRS Ni/Nf     TTR Ni/Nf      ATRG Ni/Nf     ATRN Ni/Nf
AIRCRFTB       8       43/96          53/89          39/98          31/57
ALLINITU       4       13/25          16/20          16/23          15/20
ARGLINA        200     2/4            2/4            2/4            2/4
ARWHEAD        10000   9/45           9/18           9/18           9/18
BARD           3       29/90          21/31          21/31          23/38
BEALE          2       24/48          17/27          16/26          17/34
BIGGS3         6       111/336        365/507        3455/3670      315/476
BIGGS5         6       111/336        365/507        3455/3670      315/476
BIGGS6         6       47/134         69/97          9803/9956      53/100
BOX2           3       20/47          22/34          240/247        18/24
BOX3           3       20/47          22/34          240/247        18/24
BRKMCC         2       7/20           10/14          10/14          10/14
BROWNBS        2       17/21          20/23          18/22          18/21
BRYBND         10000   32/96          24/33          26/70          24/43
CHAINWOO       10000   420/835        20001/26721    20001/34142    20001/25175
CHNROSNB       50      295/465        310/430        294/451        283/353
CLIFF          2       45/131         43/76          20001/20145    21/134
COSINE         10000   14/25          9/12           9/12           15/25
CRAGGLVY       5000    79/133         20001/20308    91/161         84/128
CUBE           2       46/121         41/65          38/97          39/76
DECONVU        63      869/1369       255/339        227/242        261/319
DENSCHNA       2       11/18          13/27          13/24          13/25
DENSCHNB       2       8/12           6/8            6/8            6/8
DENSCHNC       2       20/42          17/29          18/31          18/29
DENSCHND       3       60/121         62/74          3558/3570      62/74
DENSCHNE       3       36/119         45/61          539/545        54/64
DENSCHNF       2       12/35          14/21          14/25          15/26
DIXMAANA       9000    9/17           8/11           8/11           8/11
DIXMAANB       9000    7/17           7/10           7/10           7/10
DIXMAANC       9000    8/20           11/15          11/15          11/15
DIXMAAND       9000    11/25          8/12           8/12           8/12
DIXMAANE       9000    284/310        320/406        2486/2489      294/315
DIXMAANF       9000    127/167        224/290        1913/1917      204/221
DIXMAANG       9000    192/219        189/242        493/497        181/191
DIXMAANH       9000    142/168        187/253        1843/1849      176/185
DIXMAANI       9000    806/983        1153/1424      13513/13518    941/1017
DIXMAANJ       9000    191/212        178/236        878/893        182/201
DIXMAANK       9000    172/226        151/202        798/807        161/178
DIXMAANL       9000    499/580        720/852        7143/7147      359/390
DIXON3DQ       10000   12603/15931    9386/10890     20001/20653    11353/12179
DQDRTIC        5000    14/44          16/28          16/29          21/40
DQRTIC         5000    76/140         76/90          1302/1316      76/92
EDENSCH        2000    28/55          26/47          25/36          25/43
EIGENBLS       2550    9140/11554     9809/11564     8970/9303      10746/11548
EIGENCLS       2652    2076/2635      2082/2496      2050/2274      1788/1921
EG2            1000    4/27           4/10           4/10           4/10
ENGVAL1        5000    19/42          20001/20278    16/43          17/31
ENGVAL2        3       43/130         47/74          37/95          42/84
ERRINROS       50      2538/5295      5850/8204      2193/3580      2263/3234
EXPFIT         2       16/40          17/36          16/35          14/18
EXTROSNB       1000    35/67          1172/1605      2963/3936      1592/3248
Table 1 continued (columns listed in turn: problem name, dimension, then Ni and Nf for each of ATRS, TTR, ATRG, and ATRN)

Problem name: FMINSRF2, FMINSURF, FREUROTH, GENROSE, GROWTHLS, GULF, HAIRY, HATFLDD, HATFLDE, HEART6LS, HEART8LS, HELIX, HILBERTA, HILBERTB, HIMMELBB, HIMMELBF, HIMMELBG, HIMMELBH, JENSMP, KOWOSB, LIARWHD, LMINSURF, MANCINO, MARATOSB, MEXHAT, MINSURF, MOREBV, MSQRTALS, MSQRTBLS, NCB20, NCB20B, NONDIA, NONDQUAR, OSBORNEA, OSBORNEB, PARKCH, PENALTY1, PENALTY2, POWELLSG, POWER, QUARTC, ROSENBR, RAYBENDL, S308, SCHMVETT, SINEVAL, SINQUAD, SISSER, SNAIL
Dim: 1 024, 1 024, 10 000, 500, 3, 3, 2, 3, 3, 6, 8, 3, 2, 10, 2, 4, 2, 2, 2, 4, 10 000, 1 024, 100, 2, 2, 64, 10 000, 1 024, 1 024, 1 010, 2 000, 10 000, 1 000, 5, 11, 15, 500, 200, 10 000, 10 000, 10 000, 2, 1 026, 2, 1 000, 2, 10 000, 2, 2
ATRS Ni: 375 535 20 001 1 229 361 71 22 22 38 2 553 762 34 5 6 21 101 9 10 26 33 34 371 11 20 001 38 0 12 15 060 17 651 538 2 886 11 510 2 580 269 1 183 492 52 377 85 65 1 525 11 35 120 30 26 12
ATRS Nf: 412 653 24 059 225 2 625 699 185 121 31 87 14 113 1 797 107 6 14 65 455 10 13 84 95 140 412 63 21 401 260 134 1 24 18 655 21 954 894 3 948 58 931 8 440 432 2 576 1 134 208 512 159 191 2 834 24 56 458 116 34 14
TTR Ni: 962 1 196 20 001 1 283 1 63 20 001 37 95 20 001 731 39 5 6 12 20 001 9 9 1 41 45 943 10 2 486 41 0 11 12 049 15 164 632 1 806 9 430 7 640 239 1 176 446 82 376 82 66 2 436 10 34 127 36 24 11
TTR Nf: 986 1 237 20 283 1 706 7 80 20 290 61 144 29 201 1 011 70 6 9 21 20 391 10 11 5 49 67 962 22 3 462 84 1 14 14 946 18 953 853 2 456 20 603 10 770 306 2 251 600 119 480 122 102 3 000 14 39 159 63 27 14
ATRG Ni: 1 913 5 641 22 1 208 1 20 001 55 1 593 387 20 001 743 38 5 6 12 20 001 9 9 1 1 054 36 1 905 10 2 261 32 0 11 14 785 13 957 540 2 006 9 2 743 2 135 304 1 1 582 3 471 68 382 1 342 70 1 620 10 35 109 30 905 8
ATRG Nf: 1 915 5 642 53 2 564 7 20 017 154 1 704 466 33 584 1 433 86 6 9 21 21 029 10 11 5 1 074 72 1 907 22 9 778 100 1 14 16 152 15 414 709 2 484 20 2 776 4 135 374 2 1 601 3 770 103 457 1 357 161 3 044 14 43 232 73 908 12
ATRN Ni: 365 548 22 1 209 1 74 20 001 29 50 20 001 805 38 5 6 12 185 9 8 1 39 35 377 10 20 001 32 0 13 17 105 17 021 503 1 772 9 504 2 294 266 1 165 347 51 369 81 63 1 258 10 35 103 20 001 24 10
ATRN Nf: 379 568 55 1 732 7 97 4 999 818 41 76 25 621 1 227 67 6 9 21 358 10 10 5 60 70 393 22 4 435 705 74 1 16 18 278 18 389 629 2 062 20 607 4 089 324 2 314 548 99 402 119 110 1 557 14 44 189 5 151 322 27 15
Table 1 continued

Problem name   Dim     ATRS Ni/Nf     TTR Ni/Nf      ATRG Ni/Nf     ATRN Ni/Nf
SPARSQUR       10000   44/98          43/52          66/75          43/52
SPMSRTLS       10000   184/206        187/216        181/194        186/203
SROSENBR       10000   15/39          14/20          14/20          14/20
TESTQUAD       10000   5858/7955      8443/11406     6043/8067      5722/6424
TOINTGSS       10000   14/24          14/16          14/16          14/16
TOINTGOR       50      139/191        135/188        137/151        138/160
TOINTPSP       50      130/253        129/177        130/230        134/178
TOINTQOR       50      45/58          43/57          43/58          44/56
TRIDIA         10000   1943/2389      1719/2219      1899/2350      1671/1816
VARDIM         200     64/186         62/88          62/88          62/88
VAREIGVL       50      24/34          22/38          23/31          22/25
WATSON         12      1158/2436      3188/4452      1954/2926      1255/1878
YFITU          3       248/689        424/620        214/388        179/262
ZANGWIL2       2       2/3            2/3            2/3            2/3
code of the compact limited memory BFGS updating formula is rewritten based on the ASTRAL code [27]. The common parameters of the algorithms TTR, ATRG, and ATRN are set to μ1 = 10^{-5}, μ2 = 0.2, μ3 = 0.8, γ1 = 0.25, γ2 = 0.5, γ3 = 2, and m1 = 5 (see [27]). Also, for ATRG and ATRN, we choose η0 = 0.25 and update m(k) by

m(k) := 0, if k = 0;  m(k) := min{m(k − 1) + 1, N}, if k ≥ 1,

for which N = 10. For all algorithms, we set Δ0 = ‖g0‖, and TTR updates the trust-region radius by

Δ_{k+1} := γ1 Δk, if rk < μ1;  γ2 Δk, if rk ∈ [μ1, μ2);  Δk, if rk ∈ [μ2, μ3);  min{γ3 Δk, Δ0}, if rk ≥ μ3.

ATRS uses c = 0.75 and μ = 0.1 similar to [10], where qk = −Hk gk is computed by the Algorithm QN in [25]. For ATRN, we set υ = 0.45 and update ηk by

ηk := (2/3) η_{k−1} + 0.01, if ‖gk‖ ≥ 10^{-2};  ηk := max{0.99 η_{k−1}, 0.5}, otherwise,

where η0 = 0.2. To compare the results appropriately, we use the performance profiles of Dolan and Moré [28], where the measures of performance are the number of iterations (Ni), function evaluations (Nf), and gradient evaluations (Ng). In the considered algorithms the number of iterations and the number of gradient evaluations coincide, so we only report the performance for Ng. Since computing a gradient is commonly assumed to be about as costly as computing three function values, we further consider the measure
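The memory update m(k) and the TTR radius rule described above can be sketched in Python; the function names are illustrative, the cap Δ0 in the last branch follows the printed formula, and the parameter values are those stated in the text:

```python
import numpy as np

# Parameter values as stated in the text.
MU1, MU2, MU3 = 1e-5, 0.2, 0.8
GAMMA1, GAMMA2, GAMMA3 = 0.25, 0.5, 2.0
N_MEMORY = 10  # cap N on the nonmonotone memory m(k)

def memory_length(k, prev_m):
    """Nonmonotone memory m(k): zero at k = 0, then grown by one up to N."""
    return 0 if k == 0 else min(prev_m + 1, N_MEMORY)

def ttr_radius(r_k, delta_k, delta_0):
    """TTR trust-region radius update driven by the agreement ratio r_k."""
    if r_k < MU1:
        return GAMMA1 * delta_k            # very poor agreement: shrink strongly
    if r_k < MU2:
        return GAMMA2 * delta_k            # poor agreement: shrink
    if r_k < MU3:
        return delta_k                     # acceptable agreement: keep the radius
    return min(GAMMA3 * delta_k, delta_0)  # very good agreement: enlarge, capped

# Initial radius Delta_0 = ||g_0|| for some starting gradient g_0.
g0 = np.array([3.0, 4.0])
delta0 = float(np.linalg.norm(g0))
```

The four branches mirror the four intervals of r_k in the update formula, so each iteration needs only the ratio r_k and the current radius.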
N f + 3N g . In detail, the performance of each code is measured by considering the ratio of its computational outcome versus the best numerical outcome of all codes. The obtained results are shown in Table 1, where we report the number of iterations (Ni ) and the number of function evaluations (N f ). The results of Table 1 suggest that the proposed algorithm has promising behavior for small-scale, medium-scale, and large-scale unconstrained optimization problems and it is superior to other considered algorithms in the most cases. To see the results of implementations in detail and demonstrate the overall behavior of the present algorithms and get more insight about the performance of considered codes, we illustrate the results with performance profile in Fig. 1 with N g , N f , and N f + 3N g as measures of performance. In Fig. 1, P(τ ) (y-axis) designates the percentage of problems solved within a factor τ (x-axis) of the best solver. In Fig. 1, subfigure (a) displays the comparisons for N g , where ATRN by 48% significantly outperforms the others. However, all of three other methods win 36% 1.0
Fig. 1 A comparison among TTR, ATRS, ATRG, and ATRN by performance profiles with the measures Ng, Nf, and Nf + 3Ng: (a) Ni and Ng performance profile; (b) Nf performance profile; (c) Nf + 3Ng performance profile with τ ≤ 1.5; (d) Nf + 3Ng performance profile with τ ≤ 3.5
of the scores, but ATRS is better than TTR, and TTR is better than ATRG. Subfigure (b) shows the performance for Nf, where ATRN, at 57%, performs better than the other methods; TTR, at 50%, comes second, followed by ATRG at 31% and ATRS at 16% in third and fourth place, respectively. Subfigures (c) and (d) depict Nf + 3Ng for τ ≤ 1.5 and τ ≤ 3.5. It can be seen that ATRN, at 52%, performs better than the others, while TTR, at 43%, also shows promising performance; ATRG at 32% and ATRS at 21% come next, respectively.
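The Dolan–Moré performance profiles [28] used throughout this comparison can be sketched as follows; the helper name and the toy data are illustrative assumptions, not part of the authors' codes:

```python
import numpy as np

def performance_profile(costs, taus):
    """Dolan-More performance profile.

    costs: (n_problems, n_solvers) array of a cost measure
           (e.g. N_g, N_f, or N_f + 3*N_g); np.inf marks a failure.
    Returns a (len(taus), n_solvers) array whose entry [t, s] is
    P_s(tau) = fraction of problems on which solver s is within a
    factor tau of the best solver for that problem.
    """
    costs = np.asarray(costs, dtype=float)
    best = costs.min(axis=1, keepdims=True)   # best cost per problem
    ratios = costs / best                     # performance ratios r_{p,s}
    return np.array([[np.mean(ratios[:, s] <= tau)
                      for s in range(costs.shape[1])]
                     for tau in taus])

# Hypothetical data: 3 problems, 2 solvers (inf = failure on that problem).
costs = [[10, 12], [20, 18], [5, np.inf]]
profile = performance_profile(costs, taus=[1.0, 1.2])
```

Plotting each solver's column of `profile` against τ reproduces curves of the kind shown in Fig. 1: the value at τ = 1 is the fraction of wins, and the height as τ grows measures robustness.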
5 Concluding Remarks We have presented an iterative scheme for solving unconstrained optimization problems based on the trust-region framework, in which an adaptive radius is determined according to a nonmonotone technique. The new adaptive procedure increases the trust-region radius so as to search for the optimizer in a larger region; consequently, it decreases the total number of iterations and hence the total number of subproblems to be solved. From the theoretical point of view, the proposed algorithm inherits the global convergence of traditional trust-region algorithms to first-order critical points under classical assumptions, and the quadratic convergence rate is established under suitable conditions. Finally, our preliminary numerical experiments on CUTEst unconstrained test problems indicate that the proposed algorithm is remarkably efficient for solving unconstrained optimization problems.
References
[1] Powell, M.J.D.: On the global convergence of trust region algorithms for unconstrained optimization. Math. Program. 29, 297–303 (1984)
[2] Ahookhosh, M., Amini, K.: An efficient nonmonotone trust-region method for unconstrained optimization. Numer. Algorithms 59, 523–540 (2012)
[3] Ahookhosh, M., Amini, K.: A nonmonotone trust region method with adaptive radius for unconstrained optimization problems. Comput. Math. Appl. 60, 411–422 (2010)
[4] Ahookhosh, M., Ghaderi, S.: Two globally convergent nonmonotone trust-region methods for unconstrained optimization. J. Appl. Math. Comput. (2015). doi:10.1007/s12190-015-0883-9
[5] Esmaeili, H., Kimiaei, M.: An improved adaptive trust-region method for unconstrained optimization. Math. Model. Anal. 19(4), 469–490 (2014)
[6] Gould, N.I.M., Orban, D., Sartenaer, A., Toint, Ph.L.: Sensitivity of trust region algorithms to their parameters. Q. J. Oper. Res. 3, 227–241 (2005)
[7] Nocedal, J., Yuan, Y.: Combining trust region and line search techniques. In: Yuan, Y. (ed.) Advances in Nonlinear Programming, pp. 153–175. Kluwer Academic Publishers, Dordrecht (1996)
[8] Powell, M.J.D.: A new algorithm for unconstrained optimization. In: Rosen, J.B., Mangasarian, O.L., Ritter, K. (eds.) Nonlinear Programming, pp. 311–765. Academic Press, London (1970)
[9] Sartenaer, A.: Automatic determination of an initial trust region in nonlinear programming. SIAM J. Sci. Comput. 18, 1788–1803 (1997)
[10] Shi, Z.J., Guo, J.H.: A new trust region method with adaptive radius. Comput. Optim. Appl. 213, 509–520 (2008)
[11] Zhang, X.S., Zhang, J.L., Liao, L.Z.: An adaptive trust region method and its convergence. Sci. China 45, 620–631 (2002)
[12] Ahookhosh, M., Esmaeili, H., Kimiaei, M.: An effective trust-region-based approach for symmetric nonlinear systems. Int. J. Comput. Math. 90, 671–690 (2013)
[13] Esmaeili, H., Kimiaei, M.: An efficient adaptive trust-region method for systems of nonlinear equations. Int. J. Comput. Math. 92(1), 151–166 (2015)
[14] Esmaeili, H., Kimiaei, M.: A new adaptive trust-region method for system of nonlinear equations. Appl. Math. Model. 38(11–12), 3003–3015 (2014)
[15] Grippo, L., Lampariello, F., Lucidi, S.: A nonmonotone line search technique for Newton's method. SIAM J. Numer. Anal. 23, 707–716 (1986)
[16] Zhang, H.C., Hager, W.W.: A nonmonotone line search technique for unconstrained optimization. SIAM J. Optim. 14(4), 1043–1056 (2004)
[17] Fatemi, M., Mahdavi-Amiri, N.: A filter trust-region algorithm for unconstrained optimization with strong global convergence properties. Comput. Optim. Appl. 52(1), 239–266 (2012)
[18] Fletcher, R., Leyffer, S.: Nonlinear programming without a penalty function. Math. Program. Ser. A 91(2), 239–269 (2002)
[19] Gould, N.I.M., Leyffer, S., Toint, Ph.L.: A multidimensional filter algorithm for nonlinear equations and nonlinear least-squares. SIAM J. Optim. 15(1), 17–38 (2004)
[20] Conn, A.R., Gould, N.I.M., Toint, Ph.L.: Trust-Region Methods. SIAM, Philadelphia (2000)
[21] Schultz, G.A., Schnabel, R.B., Byrd, R.H.: A family of trust-region-based algorithms for unconstrained minimization with strong global convergence. SIAM J. Numer. Anal. 22, 47–67 (1985)
[22] Gould, N.I.M., Orban, D., Toint, Ph.L.: CUTEst: a constrained and unconstrained testing environment with safe threads. Technical Report RAL-TR-2013-005 (2013)
[23] Steihaug, T.: The conjugate gradient method and trust regions in large scale optimization. SIAM J. Numer. Anal. 20(3), 626–637 (1983)
[24] Byrd, R., Nocedal, J., Schnabel, R.: Representation of quasi-Newton matrices and their use in limited memory methods. Math. Program. 63, 129–156 (1994)
[25] Kaufman, L.: Reduced storage, quasi-Newton trust region approaches to function optimization. SIAM J. Optim. 10(1), 56–69 (1999)
[26] Shanno, D.F., Phua, K.H.: Matrix conditioning and non-linear optimization. Math. Program. 14, 149–160 (1978)
[27] Xu, L., Burke, J.V.: An active set ℓ∞-trust region algorithm for box constrained optimization. Technical report preprint, University of Washington (2007). http://www.optimization-online.org/DB_HTML/2007/07/1717.html
[28] Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Program. 91, 201–213 (2002)