Optimisation on Manifolds — Lifting Iterative Algorithms from Euclidean Space to Manifolds
A natural question to ask in the theory of Optimisation on Manifolds is how the Newton method can be generalised to an iterative method on a manifold. Traditionally, the standard way was formulaic: because the Newton method involves a gradient, a Hessian, and vector space addition, generalise these concepts first then apply Newton’s formula. This led to the Riemannian Newton method. Gradients, Hessians and vector addition are not intrinsic though; they are just one realisation of an iterative method that uses the first and second order derivatives to converge locally quadratically to a (non-degenerate) critical point. The standard Newton formula in Euclidean space is designed to be optimal for quadratic cost functions, but a simple change of coordinates will change it to being optimal for another class of cost functions instead.
It turns out that changes of coordinates play a fundamental role in understanding how the Newton method, or any (memoryless) iterative method for that matter, can be lifted to manifolds. In A Framework for Generalising the Newton Method and Other Iterative Methods from Euclidean Space to Manifolds, it was essentially demonstrated that all generalised Newton methods can be understood as applying a different change of coordinates at each iteration, and local quadratic convergence can be assured if and only if the Newton method in Euclidean space is “robust” to these changes, meaning its radius of convergence can be uniformly bounded under all coordinate changes of interest. The Newton method is indeed very robust, leading to a tremendous variety of possibilities for lifting it to manifolds which go well beyond the traditional class of Riemannian Newton methods.
It was also pointed out there is generally no reason to insist that the lift is uniquely defined globally. It suffices to lift locally in response to where the iteration is currently at. Visually, one way to think of this is to re-centre at each step of the iteration: rather than move a point around a sphere searching for a critical point, keep applying rotations to the sphere endeavouring to bring a critical point to the North pole. More generally, arbitrary transformations of the manifold, or of the ambient space containing the manifold, can be used for re-centring at a distinguished point, and the Newton method need only be lifted to a region containing the distinguished point. (A special case is “rolling without slipping”, but there a Riemannian metric is present, which results in the local lifts fitting together globally. This need not be the case for more general re-centring techniques, and may possibly lead to more efficient algorithms that use simpler lifts.)
Many competing factors come into play when designing a Newton method for a particular problem. Hopefully the framework will prove useful, but it is far from being a panacea. It raises more questions than it answers.