Algorithmic Convexity in the Discrete World and the Resolution of a 30-Years-Old Conjecture Continue Reading
Implicit regularization for constant step-size stochastic gradient descent: why it works for non-smooth, non-convex, low-rank models Continue Reading