Nonlinear Physics seminar: "Effective Theory for Online Learning in Complex Models"

Date: 
Wed, 20/12/2023, 12:00-13:30
Location: 
Danciger B Building, Seminar room
Lecturer: Dr. Inbar Serousi
Abstract:
Stochastic gradient descent (SGD) is a fundamental optimization technique in modern machine learning, yet a comprehensive understanding of its exceptional performance remains a challenge. Drawing on the rich history of this problem in statistical physics, which has provided insights into simple neural networks with isotropic Gaussian data, this talk reviews existing results and introduces a theory for SGD in high dimensions. Our theory extends to a broader class of models, accommodating data with general covariance structure as well as general loss functions. We present limiting deterministic dynamics governed by low-dimensional order parameters, applicable to a spectrum of optimization problems including linear and logistic regression as well as two-layer neural networks. This framework also reveals the implicit bias of SGD. For each problem, the deterministic equivalent of SGD allows us to derive an equation for the generalization error. Moreover, we establish explicit conditions on the step size that ensure the convergence and stability of SGD.
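To make the setting concrete, here is a minimal sketch, not taken from the talk or the underlying papers, of online SGD for noiseless linear regression with isotropic Gaussian inputs, compared against its deterministic high-dimensional limit. The dimension, step size, and the exact per-step contraction factor used for the "theory" curve are illustrative assumptions valid only in this simplest isotropic case, not the general result presented in the seminar.

```python
# Sketch (illustrative assumptions): online SGD for noiseless linear regression
# with isotropic Gaussian data, compared against its deterministic limit.
import numpy as np

rng = np.random.default_rng(0)

d = 500                      # ambient dimension
gamma = 0.5                  # scaled step size; stability here requires 0 < gamma < 2
eta = gamma / d              # per-step learning rate
steps = 20 * d               # number of one-pass (online) SGD steps

w_star = rng.standard_normal(d) / np.sqrt(d)   # teacher weights
w = np.zeros(d)                                # student initialization

err_sgd = np.empty(steps + 1)
err_sgd[0] = np.sum((w - w_star) ** 2)         # squared distance to the teacher

for t in range(steps):
    x = rng.standard_normal(d)                 # fresh sample each step (online setting)
    grad = (x @ w - x @ w_star) * x            # gradient of 0.5*(x.w - y)^2 with y = x.w_star
    w -= eta * grad
    err_sgd[t + 1] = np.sum((w - w_star) ** 2)

# Deterministic equivalent in this isotropic case: the expected squared distance
# contracts by the exact factor (1 - 2*eta + eta**2 * (d + 2)) per step, so the
# limiting curve is a simple geometric decay in the number of samples.
decay = 1.0 - 2.0 * eta + eta ** 2 * (d + 2)
err_theory = err_sgd[0] * decay ** np.arange(steps + 1)

print(f"step size stable  : {decay < 1}")      # equivalent to eta < 2/(d + 2)
print(f"final SGD error   : {err_sgd[-1]:.3e}")
print(f"final theory error: {err_theory[-1]:.3e}")
```

In this toy case the order parameter is the single scalar ||w - w_star||^2, and the stability condition on the step size (decay factor below one) plays the role of the explicit convergence conditions mentioned in the abstract; the seminar's framework treats general covariance, general losses, and multi-layer models, where the order parameters and their limiting equations are correspondingly richer.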