TY - GEN
T1 - Convergence analysis of gradient descent for eigenvector computation
AU - Xu, Zhiqiang
AU - Cao, Xin
AU - Gao, Xin
N1 - Publisher Copyright:
© 2018 International Joint Conferences on Artificial Intelligence. All rights reserved.
PY - 2018
Y1 - 2018
N2 - We present a novel, simple and systematic convergence analysis of gradient descent for eigenvector computation. As a popular, practical, and provable approach to numerous machine learning problems, gradient descent has found successful applications to eigenvector computation as well. However, surprisingly, it lacks a thorough theoretical analysis for the underlying geodesically non-convex problem. In this work, the convergence of the gradient descent solver for the leading eigenvector computation is shown to be at a global rate of O(min{(λ1/∆p)² log(1/ε), 1/ε}), where ∆p = λp − λp+1 > 0 denotes the generalized positive eigengap, which always exists without loss of generality, with λi being the i-th largest eigenvalue of the given real symmetric matrix and p being the multiplicity of λ1. The rate is linear at O((λ1/∆p)² log(1/ε)) if (λ1/∆p)² = O(1), and otherwise sub-linear at O(1/ε). We also show that the convergence depends only logarithmically, instead of quadratically, on the initial iterate. In particular, this is the first time that linear convergence for the case where the conventionally considered eigengap ∆1 = λ1 − λ2 = 0 but the generalized eigengap ∆p satisfies (λ1/∆p)² = O(1), as well as the logarithmic dependence on the initial iterate, has been established for the gradient descent solver. We are also the first to leverage the log principal angle between the iterate and the space of globally optimal solutions for the analysis. Theoretical properties are verified in experiments.
AB - We present a novel, simple and systematic convergence analysis of gradient descent for eigenvector computation. As a popular, practical, and provable approach to numerous machine learning problems, gradient descent has found successful applications to eigenvector computation as well. However, surprisingly, it lacks a thorough theoretical analysis for the underlying geodesically non-convex problem. In this work, the convergence of the gradient descent solver for the leading eigenvector computation is shown to be at a global rate of O(min{(λ1/∆p)² log(1/ε), 1/ε}), where ∆p = λp − λp+1 > 0 denotes the generalized positive eigengap, which always exists without loss of generality, with λi being the i-th largest eigenvalue of the given real symmetric matrix and p being the multiplicity of λ1. The rate is linear at O((λ1/∆p)² log(1/ε)) if (λ1/∆p)² = O(1), and otherwise sub-linear at O(1/ε). We also show that the convergence depends only logarithmically, instead of quadratically, on the initial iterate. In particular, this is the first time that linear convergence for the case where the conventionally considered eigengap ∆1 = λ1 − λ2 = 0 but the generalized eigengap ∆p satisfies (λ1/∆p)² = O(1), as well as the logarithmic dependence on the initial iterate, has been established for the gradient descent solver. We are also the first to leverage the log principal angle between the iterate and the space of globally optimal solutions for the analysis. Theoretical properties are verified in experiments.
UR - http://www.scopus.com/inward/record.url?scp=85055715842&partnerID=8YFLogxK
U2 - 10.24963/ijcai.2018/407
DO - 10.24963/ijcai.2018/407
M3 - Conference contribution
AN - SCOPUS:85055715842
T3 - IJCAI International Joint Conference on Artificial Intelligence
SP - 2933
EP - 2939
BT - Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI 2018
A2 - Lang, Jerome
PB - International Joint Conferences on Artificial Intelligence
T2 - 27th International Joint Conference on Artificial Intelligence, IJCAI 2018
Y2 - 13 July 2018 through 19 July 2018
ER -