Random Reshuffling converges to a smaller neighborhood than SGD
Posted on:
This article is on the recent work by Ying et. al. 2018, in which the authors show that SGD with Random Reshuffling outperforms independent sampling with replacement by showing that the MSE of the iterates at the end of each epoch is of the order \(O(\eta^2)\) for constant step-size \(\eta\). This is a significant improvement compared to the traditional SGD with i.i.d. sampling where the same quantity is of the order \(O(\eta)\).
The complete PDF post can be viewed here.