Abstract
This paper proposes a modified BFGS formula using a trust region model for solving nonsmooth convex minimizations, based on the Moreau-Yosida regularization (smoothing) approach and a new secant equation with a BFGS update formula. Our algorithm uses both function values and gradient values to construct the approximate Hessian. The Hessian matrix is updated by the BFGS formula rather than computed from second-order information of the function, which decreases the computational workload and time. Under suitable conditions, the algorithm converges globally to an optimal solution. Numerical results show that this algorithm can successfully solve nonsmooth unconstrained convex problems.
Citation: Cui Z, Yuan G, Sheng Z, Liu W, Wang X, Duan X (2015) A Modified BFGS Formula Using a Trust Region Model for Nonsmooth Convex Minimizations. PLoS ONE 10(10): e0140606. https://doi.org/10.1371/journal.pone.0140606
Editor: Lixiang Li, Beijing University of Posts and Telecommunications, CHINA
Received: April 8, 2015; Accepted: September 27, 2015; Published: October 26, 2015
Copyright: © 2015 Cui et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All data are available and they are listed in the paper.
Funding: This work is supported by the Program for Excellent Talents in Guangxi Higher Education Institutions (Grant No. 201261), Guangxi NSF (Grant No. 2012GXNSFAA053002), China NSF (Grant No. 11261006 and 11161003), the Guangxi Science Fund for Distinguished Young Scholars (No. 2015GXNSFGA139001), NSFC No. 61232016, NSFC No. U1405254, and PAPD issue of Jiangsu advantages discipline.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Consider the following convex problem:
minx ∈ ℝn f(x), (1)
where f : ℝn → ℝ is a possibly nonsmooth convex function. This problem has been well studied for several decades in the case where f is continuously differentiable, and a number of different methods have been developed for solving Eq (1) (for example, numerical optimization methods [1–3] and heuristic algorithms [4–6]). However, when f is a nondifferentiable function, the difficulty of solving this problem increases. Recently, such problems have arisen in many medical imaging, image restoration and optimal control applications (see [7–13]). Some authors have previously studied nonsmooth convex problems (see [14–18]).
Let F : ℝn → ℝ be the so-called Moreau-Yosida regularization of f, which is defined by
(2)
where λ is a positive parameter and ‖ ⋅ ‖ denotes the Euclidean norm. The problem Eq (1) is equivalent to the following problem
minx ∈ ℝn F(x). (3)
It is well known that problems Eqs (1) and (3) have the same solution set. One of the most effective methods for problem Eq (3) is the trust region method.
The trust region method plays an important role in the area of nonlinear optimization, and it has been proven to be a very efficient method. Levenberg [19] and Marquardt [20] first applied this method to nonlinear least-squares problems, and Powell [21] established a convergence result for this method for unconstrained problems. Fletcher [22] first proposed a trust region method for composite nondifferentiable optimization problems. Over the past decades, many authors have studied the trust region algorithm to minimize nonsmooth objective function problems. For example, Sampaio, Yuan and Sun [23] used the trust region algorithm for nonsmooth optimization problems; Sun, Sampaio and Yuan [24] proposed a quasi-Newton trust region algorithm for nonsmooth least-squares problems; Zhang [25] used a new trust region algorithm for nonsmooth convex minimization; and Yuan, Wei and Wang [26] proposed a gradient trust region algorithm with a limited memory BFGS update for nonsmooth convex minimization problems. For other references on trust region methods, see [27–35], among others. In particular, the trust region method could be very efficient for the problem addressed in this study if the exact Hessian were available. However, computing the Hessian at every iteration is difficult and increases the computational workload and time.
The purpose of this paper is to present an efficient trust region algorithm to solve Eq (3). With the use of the Moreau-Yosida regularization (smoothing) and the new quasi-Newton equation, the given method has the following good properties: (i) the Hessian makes use of not only the gradient value but also the function value and (ii) the subproblem of the proposed method, which possesses the form of an unconstrained trust region subproblem, can be solved using existing methods.
The remainder of this paper is organized as follows. In the next section, we briefly review some basic results in convex analysis and nonsmooth analysis and state a new quasi-Newton secant equation. In Section 3, we present a new algorithm for solving problem Eq (3). In Section 4, we prove the global convergence of the proposed method. In Section 5, we report numerical results and present some comparisons with existing methods for solving problem Eq (1). We conclude our paper in Section 6.
Throughout this paper, unless otherwise specified, ‖ ⋅ ‖ denotes the Euclidean norm of vectors or matrices.
Initial results
In this section, we first state some basic results in convex analysis and nonsmooth analysis. Let
and denote p(x) := argminz ∈ ℝn θ(z, x). Then, p(x) is well defined and unique, as θ(z, x) is strongly convex. By Eq (2), F can be rewritten as
In the following, we denote g(x) = ∇F(x). Some important properties of F are given as follows:
- F is finite-valued, convex and everywhere differentiable with
(4)
- The gradient mapping g : ℝn → ℝn is globally Lipschitz continuous with modulus λ, i.e.,
(5)
- x solves Eq (1) if and only if ∇F(x) = 0, namely, p(x) = x.
It is obvious that F(x) and g(x) can be obtained through the optimal solution of argminz ∈ ℝn θ(z, x). However, the minimizer p(x) of θ(z, x) is difficult or even impossible to compute exactly. Thus, we cannot compute the exact value of p(x) needed to define F(x) and g(x). Fortunately, for each x ∈ ℝn and any ϵ > 0, there exists a vector pα(x, ϵ) ∈ ℝn such that
(6)
Thus, when ϵ is small, we can use pα(x, ϵ) to define the following approximations of F(x) and g(x):
(7)
and
(8)
The papers [36, 37] describe some algorithms to calculate pα(x, ϵ). The following remarkable features of Fα(x, ϵ) and gα(x, ϵ) are obtained from [38].
Proposition 2.1 Let pα(x, ϵ) be a vector satisfying Eq (6), and let Fα(x, ϵ) and gα(x, ϵ) be defined by Eqs (7) and (8), respectively. Then, we obtain
(9)
(10)
and
(11)
The relations Eqs (9), (10) and (11) imply that Fα(x, ϵ) and gα(x, ϵ) may be made arbitrarily close to F(x) and g(x), respectively, by choosing the parameter ϵ to be small enough.
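A minimal Python sketch of this approximation scheme, assuming the standard Moreau-Yosida convention F(x) = minz {f(z) + ‖z − x‖²/(2λ)} (so that g(x) = (x − p(x))/λ) and using a derivative-free inner solver (Nelder-Mead, analogous to the MATLAB fminsearch used in the experiments below) to obtain pα(x, ϵ) approximately; the function names are illustrative only.

```python
import numpy as np
from scipy.optimize import minimize

def moreau_yosida_approx(f, x, lam=1.0, eps=1e-8):
    """Approximate F_a(x, eps) and g_a(x, eps) of Eqs (7) and (8).

    Assumes the standard convention F(x) = min_z { f(z) + ||z - x||^2/(2*lam) },
    so that g(x) = (x - p(x))/lam.  The inner problem min_z theta(z, x) is
    solved only approximately (Nelder-Mead here), giving p_a(x, eps) of Eq (6).
    """
    theta = lambda z: f(z) + np.dot(z - x, z - x) / (2.0 * lam)
    res = minimize(theta, x, method="Nelder-Mead",
                   options={"xatol": eps, "fatol": eps, "maxiter": 20000})
    p_approx = res.x                    # p_a(x, eps), approximate proximal point
    F_approx = theta(p_approx)          # F_a(x, eps), cf. Eq (7)
    g_approx = (x - p_approx) / lam     # g_a(x, eps), cf. Eq (8)
    return F_approx, g_approx, p_approx

# Example with the nonsmooth convex function f(x) = |x_1| + |x_2|
if __name__ == "__main__":
    f = lambda z: np.abs(z).sum()
    F, g, p = moreau_yosida_approx(f, np.array([3.0, -2.0]), lam=1.0)
    print(F, g, p)   # p is close to (2, -1), g is close to (1, -1)
```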
Second, recall that when f is smooth, the quasi-Newton secant method can be used to solve problem Eq (1). The iterate xk satisfies ∇fk + Bk(xk+1 − xk) = 0, where ∇fk = ∇f(xk), Bk is an approximation of the Hessian of f at xk, and the matrix sequence {Bk} satisfies the following secant equation:
Bk+1 sk = yk, (12)
where yk = ∇fk+1 − ∇fk and sk = xk+1 − xk. However, the function values are not exploited in Eq (12); the update uses only gradient information. Motivated by this observation, we hope to develop a method that uses both gradient and function information. This problem has been studied by several authors. In particular, Wei, Li and Qi [39] proposed an important modified secant equation that uses not only the gradient values but also the function values; the modified secant equation is defined as
Bk+1 sk = νk, (13)
where νk = yk + βk sk, fk = f(xk), ∇fk = ∇f(xk), and βk is a scalar computed from the function values fk, fk+1 and the gradients ∇fk, ∇fk+1 (see [39]). When f is twice continuously differentiable and Bk+1 is updated by the BFGS formula [40–43] (with B0 = I, the identity matrix), the secant Eq (13) approximates the curvature skT ∇2 f(xk+1) sk more accurately than the standard secant Eq (12). This property holds for all k; based on Theorem 2.1 of [39], Eq (13) therefore has an advantage over Eq (12) in this approximate relation.
The new model
In this section, we present a modified BFGS formula using a trust region model for solving Eq (1), motivated by the Moreau-Yosida regularization (smoothing), the general trust region method and the new secant Eq (13). First, we describe the trust region method. In each iteration, a trial step dk is generated by solving an adaptive trust region subproblem, in which the approximate gradient of F at xk and the secant Eq (13) are used:
mind ∈ ℝn gα(xk, ϵk)T d + (1/2) dT Bk d, s.t. ‖d‖ ≤ Δk, (14)
where ϵk > 0 is a scalar and Δk > 0 is the trust region radius.
Let dk be the optimal solution of Eq (14). The actual reduction is defined by
Aredk = Fα(xk, ϵk) − Fα(xk + dk, ϵk+1), (15)
and the predicted reduction is defined by
Predk = −(gα(xk, ϵk)T dk + (1/2) dkT Bk dk). (16)
Then, we define rk to be the ratio between Aredk and Predk:
rk = Aredk / Predk. (17)
Based on the new secant Eq (13), with Bk+1 updated by the BFGS formula, we propose a modified BFGS formula. The matrix Bk+1 is defined by
Bk+1 = Bk − (Bk sk skT Bk)/(skT Bk sk) + (νk νkT)/(νkT sk), (18)
where sk = xk+1 − xk, yk = gα(xk+1, ϵk+1) − gα(xk, ϵk), νk = yk + βk sk, and βk is the scalar of the secant Eq (13) computed with the approximations Fα and gα in place of f and ∇f; if k = 0, then B0 = I, the identity matrix.
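A minimal sketch of the update Eq (18), assuming the standard BFGS form with νk in place of yk and the Wei-Li-Qi scalar βk = [2(Fk − Fk+1) + (gk + gk+1)T sk]/‖sk‖² evaluated with the approximations Fα and gα; the safeguard follows Lemma 1 below.

```python
import numpy as np

def modified_bfgs_update(B, s, y, F_old, F_new, g_old, g_new):
    """One modified BFGS update in the spirit of Eq (18).

    Assumed formulas (stated in the lead-in, not taken verbatim from the paper):
      beta = (2*(F_k - F_{k+1}) + (g_k + g_{k+1})^T s_k) / ||s_k||^2
      nu   = y_k + beta * s_k
      B_{k+1} = B_k - (B_k s s^T B_k)/(s^T B_k s) + (nu nu^T)/(nu^T s)
    The update is skipped unless s^T nu > 0, the condition under which
    Lemma 1 guarantees that positive definiteness is inherited.
    """
    beta = (2.0 * (F_old - F_new) + (g_old + g_new) @ s) / (s @ s)
    nu = y + beta * s
    if s @ nu <= 1e-12:                 # safeguard: keep B_k unchanged
        return B
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(nu, nu) / (nu @ s)
```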
We now list the steps of the modified trust region algorithm as follows.
Algorithm 1.
Step 0. Choose x0 ∈ ℝn, 0 < σ1 < σ2 < 1, 0 < η1 < 1 < η2, λ > 0, 0 ≤ ɛ ≪ 1, and Δmax ≥ Δ0 > 0, where Δmax is the maximum trust region radius; set B0 = I, the identity matrix. Let k := 0.
Step 1. Choose a scalar ϵk+1 satisfying 0 < ϵk+1 < ϵk, and calculate pα(xk, ϵk) and gα(xk, ϵk). If xk satisfies the termination criterion ‖gα(xk, ϵk)‖ ≤ ɛ, then stop. Otherwise, go to Step 2.
Step 2. Solve the trust region subproblem Eq (14) to obtain dk.
Step 3. Compute Aredk, Predk and rk using Eqs (15), (16) and (17).
Step 4. Regulate the trust region radius. Let
Step 5. If the condition rk ≥ σ1 holds, then let xk+1 = xk + dk, update Bk+1 by Eq (18), and let k := k + 1; go back to Step 1. Otherwise, let xk+1 := xk and k := k + 1; return to Step 2.
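The sketch below assembles the steps of Algorithm 1 in Python, reusing the moreau_yosida_approx and modified_bfgs_update helpers sketched earlier. It is an illustrative sketch under stated assumptions rather than the authors' implementation: the subproblem Eq (14) is solved only approximately by a Cauchy step, and the Step 4 radius rule is taken to be the usual one with the factors η1, η2 and the cap Δmax.

```python
import numpy as np

def cauchy_step(g, B, delta):
    """Cauchy point of the model q(d) = g^T d + 0.5 d^T B d on ||d|| <= delta."""
    gBg = g @ B @ g
    gnorm = np.linalg.norm(g)
    tau = 1.0 if gBg <= 0 else min(gnorm**3 / (delta * gBg), 1.0)
    return -tau * (delta / gnorm) * g

def algorithm1(f, x0, lam=1.0, sigma1=0.45, sigma2=0.75, eta1=0.5, eta2=4.0,
               delta0=0.5, delta_max=100.0, tol=1e-6, max_iter=500):
    """Illustrative sketch of Algorithm 1 (trust region + modified BFGS)."""
    x = np.asarray(x0, dtype=float)
    B, delta, eps = np.eye(x.size), delta0, 1e-4
    F, g, _ = moreau_yosida_approx(f, x, lam, eps)
    for k in range(max_iter):
        if np.linalg.norm(g) <= tol:                 # Step 1: termination test
            break
        d = cauchy_step(g, B, delta)                 # Step 2: (approximate) subproblem
        pred = -(g @ d + 0.5 * d @ B @ d)            # Step 3: Pred_k
        eps_new = 0.5 * eps                          # epsilon_{k+1} < epsilon_k
        F_new, g_new, _ = moreau_yosida_approx(f, x + d, lam, eps_new)
        r = (F - F_new) / pred                       # r_k = Ared_k / Pred_k
        if r < sigma1:                               # Step 4: assumed standard rule
            delta = eta1 * delta
        elif r >= sigma2:
            delta = min(eta2 * delta, delta_max)
        if r >= sigma1:                              # Step 5: accept the trial step
            B = modified_bfgs_update(B, d, g_new - g, F, F_new, g, g_new)
            x, F, g, eps = x + d, F_new, g_new, eps_new
    return x
```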
Similar to Dennis and Moré [44] or Yuan and Sun [45], we have the following result.
Lemma 1 Bk+1 inherits the positive definiteness of Bk if and only if the condition skT νk > 0 holds.
Proof “ ⇒ ” If Bk+1 is symmetric and positive definite, then
“⇐” For the proof of the converse, suppose that
and Bk is symmetric and positive definite for all k ≥ 0. We shall prove that xT Bk+1 x > 0 holds for arbitrary x ≠ 0 and x ∈ ℝn by induction. It is easy to see that B0 = I is symmetric and positive definite. Thus, we have
(19)
Because Bk is symmetric and positive definite for all k ≥ 0, there exists a symmetric and positive definite matrix
such that
. Thus, by using the Cauchy-Schwarz inequality, we obtain
(20)
It is not difficult to prove that the above inequality holds true if and only if there exists a real number γk ≠ 0 such that
, namely, x = γk sk.
Hence, if the inequality Eq (20) is strict (and note that skT νk > 0), then from Eq (19), we have
Otherwise,
; then, there exists γk such that x = γk sk. Thus,
Therefore, for each 0 ≠ x ∈ ℝn, we have xT Bk+1 x > 0. This completes the proof.
Lemma 1 states that if skT νk > 0 holds for all k, then the matrix sequence {Bk}, updated by the BFGS formula Eq (18), is symmetric and positive definite.
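A small numerical illustration of Lemma 1, assuming the update form sketched above for Eq (18): when skT νk > 0 and Bk is symmetric and positive definite, the updated matrix stays symmetric and positive definite, which can be verified through its eigenvalues.

```python
import numpy as np

# Random data with s^T nu forced positive; B_k is symmetric positive definite.
rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
B = A @ A.T + n * np.eye(n)                 # B_k: symmetric positive definite
s = rng.standard_normal(n)
nu = s + 0.1 * rng.standard_normal(n)       # close to s, so s^T nu > 0
assert s @ nu > 0
# Update of the assumed form of Eq (18)
B_next = B - np.outer(B @ s, B @ s) / (s @ B @ s) + np.outer(nu, nu) / (nu @ s)
print(np.allclose(B_next, B_next.T), np.linalg.eigvalsh(B_next).min() > 0)  # True True
```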
Convergence analysis
In this section, the global convergence of Algorithm 1 is established under the following assumptions.
Assumption A.
- Let the level set Ω = {x ∈ ℝn : F(x) ≤ F(x0)} be bounded.
- F is bounded from below.
- The matrix sequence {Bk} is bounded on Ω; that is, there exists a positive constant M such that ‖Bk‖ ≤ M for all k.
- The sequence {ϵk} converges to zero.
Now, we present the following lemma.
Lemma 2 If dk is the solution of Eq (14), then
Predk ≥ (1/2)‖gα(xk, ϵk)‖ min{Δk, ‖gα(xk, ϵk)‖/‖Bk‖}. (21)
Proof Similar to the proof of Lemma 7(6.2) in Ma [46]. Note that the matrix sequence {Bk} is symmetric and positive definite; then, we present
to be a Cauchy point at iteration point xk, which is defined by
where
. It is easy to verify that the Cauchy point is a feasible point, i.e.,
.
If , then
and
Thus, we obtain
Otherwise, we have
. Thus, we obtain
Let dk be the solution of Eq (14). Because , we have
This completes the proof.
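A quick numerical check of the bound Eq (21) under the quadratic model assumed for Eq (14): the Cauchy step already attains this sufficient decrease, and the exact subproblem solution dk can only do better.

```python
import numpy as np

rng = np.random.default_rng(1)
n, delta = 6, 0.7
A = rng.standard_normal((n, n))
B = A @ A.T + np.eye(n)                     # symmetric positive definite B_k
g = rng.standard_normal(n)                  # stands in for g_a(x_k, eps_k)

gnorm = np.linalg.norm(g)
tau = min(gnorm**3 / (delta * (g @ B @ g)), 1.0)   # B is positive definite
d_c = -tau * (delta / gnorm) * g            # Cauchy point
pred = -(g @ d_c + 0.5 * d_c @ B @ d_c)     # predicted reduction of the model
bound = 0.5 * gnorm * min(delta, gnorm / np.linalg.norm(B, 2))
print(pred >= bound)                        # True: the bound Eq (21) is met
```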
Lemma 3 Let Assumption A hold and the sequence {xk} be generated by Algorithm 1. If dk is the solution of Eq (14), then
(22)
Proof Let dk be the solution of Eq (14). By using a Taylor expansion, Fα(xk + dk, ϵk+1) can be expressed as
(23)
Note that with the definitions of Aredk and Predk and by using Eq (23), we have
The proof is complete.
Lemma 4 Let Assumption A hold. Then, Algorithm 1 does not cycle infinitely in the inner loop.
Proof Suppose, for contradiction, that Algorithm 1 cycles between Steps 2 and 5 infinitely often at the iteration point xk, i.e., rk < σ1 for all such inner iterations, and that there exists a scalar ρ > 0 such that ‖gα(xk, ϵk)‖ ≥ ρ. Then, noting that 0 < η1 < 1, we have
By using the result Eq (22) of Lemma 3 and the definition of rk, we obtain
which means that we must have rk ≥ σ1; this contradicts the assumption that rk < σ1, and the proof is complete.
Based on the above lemmas, we can now demonstrate the global convergence of Algorithm 1 under suitable conditions.
Theorem 1 (Global Convergence). Suppose that Assumption A holds and that the sequence {xk} is generated by Algorithm 1. Let dk be the solution of Eq (14). Then, the relation Eq (29) below holds, and any accumulation point of {xk} is an optimal solution of Eq (1).
Proof We first prove that
(24)
Suppose that gα(xk, ϵk) ≠ 0. Without loss of generality, by the definition of rk, we have
(25)
Using Taylor expansion, we obtain
When Δk > 0 is small enough, we have
(26)
Suppose, for contradiction, that there exists ω0 > 0 such that ‖gα(xk, ϵk)‖ ≥ ω0 for all k. Using Eqs (25) and (26) and Lemma 2, we have
(27)
which means that, for Δk sufficiently small, we have ∣rk − 1∣ < 1 − σ2 for each k, i.e., rk > σ2. Then, according to Algorithm 1, we have Δk+1 ≥ Δk.
Thus, there exist a positive integer k0 and a constant ρ0 > 0 such that, for all k ≥ k0,
Δk ≥ ρ0. (28)
On the other hand, because F is bounded from below, suppose that there exist infinitely many k such that rk > σ1; then, by the definition of rk and Lemma 2, for each k ≥ k0,
which means that Δk → 0 for k → ∞; this is a contradiction to Eq (28).
Moreover, suppose that rk < σ1 for all sufficiently large k. Then, the trust region radius is reduced at every such iteration, and we can see that Δk → 0 as k → ∞; this is also a contradiction to Eq (28). This contradiction shows that Eq (24) holds.
We now show that the corresponding limit holds for g(xk). By using Eq (11), we have
Together with Assumption A(iv), this implies that
(29)
Finally, we prove the last assertion. Let x* be an accumulation point of {xk}. Then, without loss of generality, there exists a subsequence {xk}K satisfying
(30)
From the properties of F, we have
Thus, by using Eqs (29) and (30), we have x* = p(x*). Therefore, x* is an optimal solution of Eq (1). The proof is complete.
Similar to Theorem 3.7 in [25], we can show that the rate of convergence of Algorithm 1 is Q-superlinear. We omit this proof here (the proof of the Q-superlinear convergence can be found in [25]).
Theorem 2 (Q-superlinear Convergence) [25] Suppose that Assumption A(ii) holds, that the sequence {xk} is generated by Algorithm 1, which has a limit point x*, and that g is BD-regular and semismooth at x*. Furthermore, suppose that ϵk = o(‖g(xk)‖2). Then,
- x* is the unique solution of Eq (1);
- the entire sequence {xk} converges to x* Q-superlinearly, i.e.,
Results
In this section, we test our modified BFGS formula using a trust region model on nonsmooth problems. The nonsmooth test problems listed in Table 1 can be found in [47–53]. The problem dimensions and optimal function values are listed in Table 1, where “No.” is the number of the test problem, “Dim” is the dimension of the test problem, “Problem” is the name of the test problem, “x0” is the initial point, and “fops(x)” is the optimal function value. The modified algorithm was implemented in MATLAB 7.0.4, and all numerical experiments were run on a PC with an Intel Core 2 Duo T6600 2.20 GHz CPU, 2.00 GB of RAM and the Windows 7 operating system.
To test the performance of the given algorithm on the problems listed in Table 1, we compared our method with the bundle trust (BT) method of [15], the proximal bundle method (PBL) of [17] and the gradient trust region algorithm with a limited memory BFGS update (LGTR) of [26]. The parameters were chosen as follows: σ1 = 0.45, σ2 = 0.75, η1 = 0.5, η2 = 4, λ = 1, Δ0 = 0.5 < Δmax = 100, and ϵk a decreasing sequence (where k is the iteration number). We stopped the algorithm when the condition ‖gα(x, ϵ)‖ ≤ 10−6 was satisfied. Following the idea of [26], we used the MATLAB function fminsearch to solve min θ(z, x); this yields the approximate solution p(x), from which gα(x, ϵ) is computed using Eq (8). The results of PBL, LGTR, BT and our modified algorithm are listed in Table 2. The numerical results of PBL and BT can be found in [17], and the numerical results of LGTR can be found in [26]. The following notations are used in Table 2: “NI” is the number of iterations; “NF” is the number of function evaluations; “f(x)” is the function value at the final iteration; “——” indicates that the algorithm fails to solve the problem; and “Total” denotes the sum of the NI/NF values.
The numerical results show that the performance of our algorithm is superior to that of the other methods in Table 2. It can be seen clearly that the totals of NI and NF for our algorithm are smaller than those of the other three algorithms. The performance profiles of [54] provide a tool for analyzing the efficiency of these four algorithms. Figs 1 and 2 show the performance of the four methods with respect to NI and NF of Table 2, respectively. These two figures show that Algorithm 1 performs well on all the problems tested compared to PBL, LGTR and BT. In sum, the preliminary numerical results indicate that the modified method is efficient for solving nonsmooth convex minimizations.
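For readers who wish to reproduce such comparisons, the performance profiles of [54] can be computed with a few lines of Python. The sketch below assumes a table of per-problem measures (such as NI or NF from Table 2) with failures marked as infinity; the usage data shown are purely hypothetical, not the values from Table 2.

```python
import numpy as np

def performance_profile(T):
    """Dolan-More performance profile [54].

    T is an (n_problems x n_solvers) array of a performance measure,
    with np.inf marking a failure.  Returns the breakpoints tau and, for
    each solver s, rho_s(tau) = fraction of problems whose performance
    ratio r_{p,s} = T[p, s] / min_s T[p, s] is at most tau.
    """
    T = np.asarray(T, dtype=float)
    ratios = T / T.min(axis=1, keepdims=True)
    taus = np.unique(ratios[np.isfinite(ratios)])
    rho = np.array([[np.mean(ratios[:, s] <= t) for s in range(T.shape[1])]
                    for t in taus])
    return taus, rho

# Hypothetical usage with made-up numbers (three problems, three solvers):
taus, rho = performance_profile([[12, 15, np.inf], [30, 22, 41], [8, 8, 10]])
```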
Conclusion
The trust region method is one of the most efficient optimization methods. In this paper, by using the Moreau-Yosida regularization (smoothing) and a new secant equation with the BFGS formula, we present a modified BFGS formula using a trust region model for solving nonsmooth convex minimizations. Our algorithm does not compute the Hessian of the objective function at every iteration, which decreases the computational workload and time, and it uses both function and gradient information. Under suitable conditions, global convergence is established, and the rate of convergence of the algorithm is Q-superlinear. Numerical results show that this algorithm is efficient. We believe that this algorithm can be applied in the future to solve nonsmooth convex minimizations.
Acknowledgments
This work is supported by China NSF (Grant No. 11261006 and 11161003), the Guangxi Science Fund for Distinguished Young Scholars (No. 2015GXNSFGA139001), NSFC No. 61232016, NSFC No. U1405254, and PAPD issue of Jiangsu advantages discipline. The authors wish to thank the editor and the referees for their useful suggestions and comments, which greatly improved this paper.
Author Contributions
Conceived and designed the experiments: ZC GY ZS. Performed the experiments: ZC GY ZS. Analyzed the data: ZC GY ZS WL XW XD. Contributed reagents/materials/analysis tools: ZC GY ZS WL XW XD. Wrote the paper: ZC GY ZS.
References
- 1. Steihaug T. The conjugate gradient method and trust regions in large scale optimization, SIAM Journal on Numerical Analysis, 20, 626–637 (1983)
- 2. Dai Y, Yuan Y. A nonlinear conjugate gradient method with a strong global convergence property, SIAM Journal on Optimization, 10, 177–182 (2000)
- 3. Wei Z, Li G and Qi L. New nonlinear conjugate gradient formulas for large-scale unconstrained optimization problems, Applied Mathematics and Computation, 179, 407–430 (2006)
- 4. Li L, Peng H, Kurths J, Yang Y and Schellnhuber H.J. Chaos-order transition in foraging behavior of ants, PNAS, 111, 8392–8397 (2014) pmid:24912159
- 5. Peng H, Li L, Yang Y and Liu F. Parameter estimation of dynamical systems via a chaotic ant swarm, Physical Review E, 81, 016207, (2010)
- 6. Wan M, Li L, Xiao J, Wang C and Yang Y. Data clustering using bacterial foraging optimization, Journal of Intelligent Information Systems, 38, 321–341 (2012)
- 7. Chan C, Katsaggelos A.K and Sahakian A.V. Image sequence filtering in quantum noise with applications to low-dose fluoroscopy, IEEE Transactions on Medical Imaging, 12, 610–621 (1993) pmid:18218455
- 8. Banham M.R, Katsaggelos A.K. Digital image restoration, IEEE Signal Processing Magazine, 14, 24–41 (1997)
- 9. Gu B, Sheng V. Feasibility and finite convergence analysis for accurate on-line v-support vector learning, IEEE Transactions on Neural Networks and Learning Systems, 24, 1304–1315 (2013)
- 10. Li J, Li X, Yang B and Sun X. Segmentation-based image copy-move forgery detection scheme, IEEE Transactions on Information Forensics and Security, 10, 507–518 (2015)
- 11. Wen X, Shao L, Fang W, and Xue Y. Efficient feature selection and classification for vehicle detection, IEEE Transactions on Circuits and Systems for Video Technology, (2015)
- 12. Zhang H, Wu J, Nguyen T and Sun M. Synthetic aperture radar image segmentation by modified student’s t-mixture model, IEEE Transaction on Geoscience and Remote Sensing, 52, 4391–4403 (2014)
- 13. Fu Z. Achieving efficient cloud search services: multi-keyword ranked search over encrypted cloud data supporting parallel computing, IEICE Transactions on Communications, E98-B, 190–200 (2015)
- 14. Yuan G, Wei Z and Li G. A modified Polak-Ribière-Polyak conjugate gradient algorithm for nonsmooth convex programs, Journal of Computational and Applied Mathematics, 255, 86–96 (2014)
- 15. Schramm H, Zowe J. A version of the bundle idea for minimizing a nonsmooth function: conceptual idea, convergence analysis, numerical results, SIAM Journal on Optimization, 2, 121–152 (1992)
- 16. Haarala M, Miettinen K and Mäkelä M.M. New limited memory bundle method for large-scale nonsmooth optimization, Optimization Methods and Software, 19, 673–692 (2004)
- 17. Lukšan L, Vlček J. A bundle-Newton method for nonsmooth unconstrained minimization, Mathematical Programming, 83, 373–391 (1998)
- 18. Wei Z, Qi L and Birge J.R. A new method for nonsmooth convex optimization, Journal of Inequalities and Applications, 2, 157–179 (1998)
- 19. Levenberg K. A method for the solution of certain non-linear problems in least squares, Quarterly of Applied Mathematics, 2, 164–166 (1944)
- 20. Martinet B. Régularisation d’inéquations variationelles par approximations successives, Rev. Fr. Inform. Rech. Oper, 4, 154–159 (1970)
- 21. Powell M.J.D. Convergence properties of a class of minimization algorithms. In: Mangasarian O.L., Meyer R.R., Robinson S.M. (eds.) Nonlinear Programming, vol. 2. Academic Press, New York (1975)
- 22. Fletcher R. A model algorithm for composite nondifferentiable optimization problems, Math. Program. Stud, 17, 67–76 (1982)
- 23. Sampaio R.J.B, Yuan J and Sun W. Trust region algorithm for nonsmooth optimization, Applied Mathematics and Computation, 85, 109–116 (1997)
- 24. Sun W, Sampaio R.J.B and Yuan J. Quasi-Newton trust region algorithm for non-smooth least squares problems, Applied Mathematics and Computation, 105, 183–194 (1999)
- 25. Zhang L. A new trust region algorithm for nonsmooth convex minimization, Applied Mathematics and Computation, 193, 135–142 (2007)
- 26. Yuan G, Wei Z and Wang Z. Gradient trust region algorithm with limited memory BFGS update for nonsmooth convex minimization, Computational Optimization and Applications, 54, 45–64 (2013)
- 27. Yuan G, Lu X and Wei Z. BFGS trust-region method for symmetric nonlinear equations, Journal of Computational and Applied Mathematics, 230, 44–58 (2009)
- 28. Qi L, Sun J. A trust region algorithm for minimization of locally Lipschitzian functions, Mathematical Programming, 66, 25–43 (1994)
- 29. Bellavia S, Macconi M and Morini B. An affine scaling trust-region approach to bound-constrained nonlinear systems, Applied Numerical Mathematics, 44, 257–280 (2003)
- 30. Akbari Z, Yousefpour R and Reza Peyghami M. A new nonsmooth trust region algorithm for locally Lipschitz unconstrained optimization problems, Journal of Optimization Theory and Applications, 164, 733–754 (2015)
- 31. Bannert T. A trust region algorithm for nonsmooth optimization, Mathematical Programming, 67, 247–264 (1994)
- 32. Amini K, Ahookhosh M. A hybrid of adjustable trust-region and nonmonotone algorithms for unconstrained optimization, Applied Mathematical Modelling, 38, 2601–2612 (2014)
- 33. Zhou Q, Hang D. Nonmonotone adaptive trust region method with line search based on new diagonal updating, Applied Numerical Mathematics, 91, 75–88 (2015)
- 34. Yuan G, Wei Z and Lu X. A BFGS trust-region method for nonlinear equations, Computing, 92, 317–333 (2011)
- 35. Lu S, Wei Z and Li L. A trust region algorithm with adaptive cubic regularization methods for nonsmooth convex minimization, Computational Optimization and Applications, 51, 551–573 (2012)
- 36. Correa R, Lemaréchal C. Convergence of some algorithms for convex minimization, Mathematical Programming, 62, 261–273 (1993)
- 37. Dennis J.E. Jr, Li S.B and Tapia R.A. A unified approach to global convergence of trust region methods for nonsmooth optimization, Mathematical Programming, 68, 319–346 (1995)
- 38. Fukushima M, Qi L. A global and superlinearly convergent algorithm for nonsmooth convex minimization, SIAM Journal on Optimization, 6, 1106–1120 (1996)
- 39. Wei Z, Li G and Qi L. New quasi-Newton methods for unconstrained optimization problems, Appiled Mathematics and Computation, 175, 1156–1188 (2006)
- 40. Broyden C.G. The convergence of a class of double rank minimization algorithms: the new algorithm, Journal of the Institute of Mathematics and its Applications, 6, 222–231 (1970)
- 41. Fletcher R. A new approach to variable metric algorithms, Computer Journal, 13, 317–322 (1970)
- 42. Goldfarb D. A family of variable metric methods derived by variational means, Mathematics of Computation, 24, 23–26 (1970)
- 43. Shanno D.F. Conditioning of quasi-Newton methods for function minimization, Mathematics of Computation, 24, 647–650 (1970)
- 44. Dennis J.E, Moré J.J. A characterization of superlinear convergence and its application to quasi-Newton methods, Mathematics of Computation, 28, 549–560 (1974)
- 45. Yuan Y, Sun W. Optimization theory and methods, Science Press, Beijing (1997)
- 46. Ma C. Optimization method and the Matlab programming, Science Press, Beijing (2010)
- 47. Mäkelä M.M, Neittaanmäki P. Nonsmooth Optimization. World Scientific, London (1992)
- 48. Charalambous C, Conn A.R. An efficient method to solve the minimax problem directly, SIAM Journal on Numerical Analysis, 15, 162–187 (1978)
- 49. Demyanov V.F, Malozemov V.N. Introduction to Minimax. Wiley, New York (1974)
- 50. Womersley J. Numerical methods for structured problems in nonsmooth optimization. Ph.D. thesis. Mathematics Department, University of Dundee, Dundee, Scotland (1981)
- 51. Gupta N. A higher than first order algorithm for nonsmooth constrained optimization. Ph.D. thesis, Washington State University, Pullman, WA (1985)
- 52. Shor N.Z. Minimization methods for non-differentiable functions. Springer, Berlin (1985)
- 53. Kiwiel K.C. An ellipsoid trust region bundle method for nonsmooth convex minimization, SIAM Journal on Control and Optimization, 27, 737–757 (1989)
- 54. Dolan E.D, Moré J.J. Benchmarking optimization software with performance profiles, Mathematical Programming, 91, 201–213 (2002)