Nonlinear Physics seminar: "Statistical mechanics of wide but finite artificial neural networks"

Date: 
Wed, 06/12/2023, 12:00-13:30
Location: 
Danciger B Building, Seminar room
Lecturer: Mr. Gadi Naveh (HUJI)
Abstract:
Over the past decade, deep neural networks (DNNs) have brought significant advances to many fields of science and technology. Key to their success is feature learning: the ability to extract useful representations from raw data. However, DNNs remain difficult to interpret and lack a comprehensive analytical understanding. Recent studies have shown that DNNs in the infinite-width limit are equivalent to Gaussian Processes (GPs), which are well understood analytically but lack feature learning capabilities, since their features are fixed in advance.
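For context, the infinite-width GP correspondence mentioned above is usually stated through a layerwise kernel recursion (Neal, 1996; Lee et al., 2018). The formulation below uses the standard fully connected parametrization and is illustrative rather than the specific one developed in the thesis:

\[
K^{(0)}(x,x') = \sigma_b^2 + \frac{\sigma_w^2}{d}\, x \cdot x',
\qquad
K^{(\ell+1)}(x,x') = \sigma_b^2 + \sigma_w^2\, \mathbb{E}_{(u,v)\sim\mathcal{N}(0,\,K^{(\ell)})}\!\left[\phi(u)\,\phi(v)\right],
\]

where \(\phi\) is the activation function and \(d\) the input dimension. At infinite width, every connected cumulant of the network outputs beyond the second vanishes, so the prior over outputs is exactly the GP with kernel \(K^{(L)}\); at finite width \(n\), the leading deviation enters through the connected four-point function, which scales as \(1/n\).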
In this thesis, we aim to bridge the gap between GPs and finite DNNs using approaches inspired by theoretical physics. Employing statistical field theory, we treat the DNN output as a random object drawn from a statistical ensemble, and we perform extensive computer simulations of DNNs trained in both stylized and real-world settings. On the analytical side, we first employ a perturbative approach to obtain the leading correction to the GP limit at order 1/width. We then develop a self-consistent, mean-field-type theory that provides a GP description of finite DNNs with a data-dependent kernel, which embodies their feature learning properties.
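As a concrete numerical illustration of the GP baseline (the zeroth order of the expansion described above), here is a minimal NumPy sketch, not the speaker's code, that evaluates the ReLU NNGP kernel via the arc-cosine closed form (Cho & Saul, 2009) and compares it to the empirical output covariance of randomly initialized finite-width networks. All names and parameter values are illustrative:

import numpy as np

def relu_nngp_kernel(X, depth, sigma_w2=2.0, sigma_b2=0.0):
    # NNGP kernel of a fully connected ReLU network with `depth` hidden layers.
    # The Gaussian expectation over a ReLU pair has a closed arc-cosine form.
    d = X.shape[1]
    K = sigma_b2 + sigma_w2 * (X @ X.T) / d   # first-layer pre-activation kernel
    for _ in range(depth):
        std = np.sqrt(np.diag(K))
        cos_t = np.clip(K / np.outer(std, std), -1.0, 1.0)
        theta = np.arccos(cos_t)
        # E[relu(u) relu(v)] for (u, v) ~ N(0, K):
        J = (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2.0 * np.pi)
        K = sigma_b2 + sigma_w2 * np.outer(std, std) * J
    return K

# Empirical check: the output covariance of random finite-width ReLU networks
# converges to the NNGP kernel as the width grows.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 10))              # 4 inputs in 10 dimensions
depth, width, n_nets = 2, 1024, 200

outs = np.empty((n_nets, X.shape[0]))
for i in range(n_nets):
    h = X
    for _ in range(depth):
        W = rng.normal(size=(h.shape[1], width)) * np.sqrt(2.0 / h.shape[1])
        h = np.maximum(h @ W, 0.0)        # ReLU hidden layer
    a = rng.normal(size=width) * np.sqrt(2.0 / width)
    outs[i] = h @ a                       # scalar readout per input

K_theory = relu_nngp_kernel(X, depth)
K_mc = outs.T @ outs / n_nets
print(np.abs(K_theory - K_mc).max())      # shrinks as width and n_nets grow

The residual gap between K_mc and K_theory at any fixed width is precisely the kind of finite-width effect that the 1/width expansion and the self-consistent kernel theory described in the abstract set out to characterize.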