Date:
Mon, 02/12/2019, 12:00–13:30
Location:
Levin building, Lecture Hall No. 8
Lecturer:
Dr. Daniel Soudry
Abstract:
Deep learning relies on Artificial Neural Networks (ANNs) with deep architectures – machine learning models that have reached unparalleled performance in many domains, such as machine translation, autonomous vehicles, computer vision, text generation, and speech understanding. However, this impressive performance typically requires large datasets and massive ANN models. Gathering the data and training the models can take a long time and incur prohibitive costs. Significant research efforts are being invested in improving ANN training efficiency, i.e., the amount of time, data, and resources required to train these models. Examples include changing the model (e.g., architecture, numerical precision) or the training algorithm (e.g., parallelization). However, such modifications often cause an unexplained degradation in the generalization performance of the ANN to unseen data. Recent findings suggest that this degradation is caused by changes to the hidden algorithmic bias of the training algorithm and model. This bias determines which solution is selected from among all solutions that fit the data. I will discuss how understanding and controlling such algorithmic bias can be the key to unlocking the true potential of deep learning.