Leveraging Non-uniformity in First-order Non-convex Optimization

Mei, Jincheng; Gao, Yue; Dai, Bo; Szepesvari, Csaba; Schuurmans, Dale

arXiv:2105.06072 (cs)

[Submitted on 13 May 2021 (v1), last revised 2 Jun 2022 (this version, v3)]

Abstract: Classical global convergence results for first-order methods rely on uniform smoothness and the Łojasiewicz inequality. Motivated by properties of objective functions that arise in machine learning, we propose a non-uniform refinement of these notions, leading to Non-uniform Smoothness (NS) and Non-uniform Łojasiewicz inequality (NŁ). The new definitions inspire new geometry-aware first-order methods that are able to converge to global optimality faster than the classical $\Omega(1/t^2)$ lower bounds. To illustrate the power of these geometry-aware methods and their corresponding non-uniform analysis, we consider two important problems in machine learning: policy gradient optimization in reinforcement learning (PG), and generalized linear model training in supervised learning (GLM). For PG, we find that normalizing the gradient ascent method can accelerate convergence to $O(e^{-t})$ while incurring less overhead than existing algorithms. For GLM, we show that geometry-aware normalized gradient descent can also achieve a linear convergence rate, which significantly improves the best known results. We additionally show that the proposed geometry-aware descent methods escape landscape plateaus faster than standard gradient descent. Experimental results are used to illustrate and complement the theoretical findings.
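The abstract's claim that normalizing gradient ascent accelerates policy gradient can be illustrated with a small sketch. What follows is not the authors' implementation. It assumes a one-state (bandit) problem with a softmax policy, a made-up reward vector `r`, a shared step size `eta = 0.4`, and 50 iterations, all chosen only for illustration. The update `theta + eta * g / ||g||` is a plain normalized gradient ascent step; the helper names `softmax`, `expected_reward_grad`, and `run` are hypothetical.

```python
import numpy as np

def softmax(theta):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(theta - np.max(theta))
    return e / e.sum()

def expected_reward_grad(theta, r):
    # Gradient of pi(theta) @ r for a softmax policy: pi_i * (r_i - pi @ r).
    pi = softmax(theta)
    return pi * (r - pi @ r)

def run(r, eta, steps, normalize):
    # Gradient ascent from a uniform policy (theta = 0).
    # Returns the sub-optimality gap max(r) - pi @ r after `steps` updates.
    theta = np.zeros_like(r)
    for _ in range(steps):
        g = expected_reward_grad(theta, r)
        if normalize:
            norm = np.linalg.norm(g)
            if norm == 0.0:  # guard against division by zero
                break
            g = g / norm  # normalized step: fixed length eta per update
        theta = theta + eta * g
    return r.max() - softmax(theta) @ r

# Assumed toy problem: three actions with illustrative rewards.
r = np.array([1.0, 0.9, 0.1])
gap_pg = run(r, eta=0.4, steps=50, normalize=False)
gap_npg = run(r, eta=0.4, steps=50, normalize=True)
print(f"standard PG gap:   {gap_pg:.3e}")
print(f"normalized PG gap: {gap_npg:.3e}")
```

On this toy problem the normalized variant ends with a much smaller gap after the same number of steps. This is consistent with the qualitative picture in the abstract, but it is a single illustrative run and does not reproduce the paper's experiments.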

Comments:	48 pages, 10 figures. Accepted at ICML 2021
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2105.06072 [cs.LG]
	(or arXiv:2105.06072v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2105.06072

Submission history

From: Jincheng Mei [view email]
[v1] Thu, 13 May 2021 04:23:07 UTC (1,271 KB)
[v2] Fri, 17 Sep 2021 21:13:47 UTC (2,615 KB)
[v3] Thu, 2 Jun 2022 06:44:29 UTC (2,617 KB)
