Stochastic three points method for unconstrained smooth minimization

El Houcine Bergou, Eduard Gorbunov, Peter Richtarik

Research output: Contribution to journal, Article, peer-review

15 Scopus citations


In this paper we consider the unconstrained minimization problem of a smooth function in R^n in a setting where only function evaluations are possible. We design a novel randomized derivative-free algorithm, the stochastic three points (STP) method, and analyze its iteration complexity. At each iteration, STP generates a random search direction according to a certain fixed probability law. Our assumptions on this law are very mild: roughly speaking, all laws which do not concentrate all measure on any halfspace passing through the origin will work. For instance, we allow for the uniform distribution on the sphere and also distributions that concentrate all measure on a positive spanning set. Although our approach is designed to not explicitly use derivatives, it covers some first-order methods. For instance, if the probability law is chosen to be the Dirac distribution concentrated on the sign of the gradient, then STP recovers the signed gradient descent method. If the probability law is the uniform distribution on the coordinates of the gradient, then STP recovers the randomized coordinate descent method. The complexity of STP depends on the probability law via a simple characteristic closely related to the cosine measure which is used in the analysis of deterministic direct search (DDS) methods. Unlike in DDS, where O(n) function evaluations (n is the dimension of x) must be performed in each iteration in the worst case, our method only requires two new function evaluations per iteration. Consequently, while the complexity of DDS depends quadratically on n, our method depends linearly on n.
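The iteration described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' reference implementation. It assumes that each step compares the current point with the two trial points x + alpha*s and x - alpha*s and keeps the best of the three, that directions are drawn uniformly from the unit sphere (one of the laws the abstract allows), and that the step size decays as step_size / sqrt(k + 1). The function name `stp_minimize`, its parameters, and the step-size schedule are illustrative assumptions, not taken from the abstract.

```python
import numpy as np


def stp_minimize(f, x0, step_size=0.1, num_iters=1000, seed=0):
    """Sketch of a three-point derivative-free search (names are hypothetical)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    fx = f(x)  # value at the current point, reused across iterations

    for k in range(num_iters):
        # Assumed law: uniform direction on the unit sphere,
        # obtained by normalizing a standard Gaussian vector.
        s = rng.standard_normal(x.shape)
        s /= np.linalg.norm(s)

        # Assumed schedule: decaying step size.
        alpha = step_size / np.sqrt(k + 1)

        # Two new function evaluations per iteration.
        x_plus, x_minus = x + alpha * s, x - alpha * s
        f_plus, f_minus = f(x_plus), f(x_minus)

        # Keep the best of the three points; stay put if neither trial improves.
        if f_plus < fx and f_plus <= f_minus:
            x, fx = x_plus, f_plus
        elif f_minus < fx:
            x, fx = x_minus, f_minus

    return x, fx


def sphere(x):
    """Simple smooth test function: squared Euclidean norm."""
    return float(np.sum(x ** 2))


x_best, f_best = stp_minimize(sphere, np.ones(5))
print(f_best)
```

Because the current point is always one of the three candidates, the recorded function value never increases from one iteration to the next, and only function values (no gradients) are used.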
Original language: English (US)
Pages (from-to): 2726-2749
Number of pages: 24
Journal: SIAM Journal on Optimization
Issue number: 4
State: Published - Oct 1 2020

