Optimization is a central component of Machine Learning: to get the best results, one needs fast and scalable methods before one can fully appreciate a learning model. Such methods involve minimizing a class of functions \(f(\mathbf{x})\) that usually have no closed-form solution, or whose closed-form solution is expensive to compute in both memory and time. This is where iterative methods become easy and handy; a small sketch follows below. Analyzing such algorithms requires mathematical analysis of both the function being optimized and the algorithm itself. This post is a summary and survey of the theoretical understanding of Large Scale Optimization, drawing on talks, papers, and lectures that I have come across recently. I hope these insights into how these optimization algorithms work will help the reader appreciate the rich literature on large scale optimization methods.
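To make "iterative method" concrete, here is a minimal sketch (not taken from the post itself) using plain gradient descent on a least-squares objective; the specific function, step size, and stopping rule are illustrative assumptions, not part of the original discussion.

```python
import numpy as np

def gradient_descent(grad, x0, step_size=0.05, max_iters=1000, tol=1e-8):
    """Iteratively apply x_{k+1} = x_k - step_size * grad(x_k)."""
    x = x0
    for _ in range(max_iters):
        g = grad(x)
        if np.linalg.norm(g) < tol:  # stop once the gradient is (nearly) zero
            break
        x = x - step_size * g
    return x

# Illustrative problem: f(x) = 1/2 ||Ax - b||^2. A closed-form solution exists,
# but for large A it is expensive; the iterative solver only needs
# matrix-vector products with A.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 0.0])
grad_f = lambda x: A.T @ (A @ x - b)
x_star = gradient_descent(grad_f, x0=np.zeros(2))
```

Each iteration is cheap, and the analysis of how fast the iterates approach a minimizer is exactly the kind of question the rest of the post surveys.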
The complete PDF post can be viewed here.
Please note that this article is a compilation of popular and interesting results, and is not intended for publication in any form.