Geometry of energy landscapes and the optimizability of deep neural networks
We analyze the energy landscape of a spin glass model of deep neural networks using random matrix theory and algebraic geometry. We show analytically that the multilayered structure makes the network easier to optimise: with the number of parameters fixed, increasing network depth decreases the number of stationary points of the loss function, clusters minima more closely in parameter space, and softens the tradeoff between the depth and width of minima.
Validating the Validation: Reanalyzing a large-scale comparison of Deep Learning and Machine Learning models for bioactivity prediction
We reanalyze the data generated by a recently published large-scale comparison of machine learning models for bioactivity prediction to highlight subtleties in model comparison. Our study reveals that "older" methods are non-inferior to "new" deep learning methods.
Achieving Robustness to Aleatoric Uncertainty with Heteroscedastic Bayesian Optimisation
Typical optimisation problems seek the maximum or minimum of an objective function. In many materials discovery applications, however, the objective function is noisy, and we often want solutions that are reliable rather than sensitive to experimental noise. We formulate this problem as heteroscedastic Bayesian optimisation and show that an appropriate choice of acquisition function leads to robust solutions.
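As a rough illustration of the idea, one way to make an acquisition function noise-aware is to penalise expected improvement by a model of the aleatoric noise at each candidate point. The sketch below is not the paper's exact method: the RBF kernel, the GP fit to squared residuals as a noise proxy, and the penalty weight `beta` are all illustrative assumptions.

```python
import numpy as np
from math import erf, sqrt, pi

def rbf(X1, X2, ls=1.0):
    # Squared-exponential kernel between two sets of points
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2 * X1 @ X2.T
    return np.exp(-0.5 * d2 / ls**2)

def gp_predict(Xtr, ytr, Xte, jitter=1e-4):
    # Standard GP regression posterior mean and variance
    K = rbf(Xtr, Xtr) + jitter * np.eye(len(Xtr))
    Ks = rbf(Xtr, Xte)
    mu = Ks.T @ np.linalg.solve(K, ytr)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mu, np.maximum(var, 1e-12)

def noise_penalised_ei(Xtr, ytr, r2, Xte, beta=1.0):
    # Expected improvement on the objective, minus a penalty from a
    # second GP fit to squared residuals (a proxy for aleatoric noise)
    mu, var = gp_predict(Xtr, ytr, Xte)
    noise_mu, _ = gp_predict(Xtr, r2, Xte)
    best = ytr.max()
    z = (mu - best) / np.sqrt(var)
    Phi = 0.5 * (1 + np.vectorize(erf)(z / sqrt(2)))   # Gaussian CDF
    phi = np.exp(-0.5 * z**2) / sqrt(2 * pi)           # Gaussian PDF
    ei = (mu - best) * Phi + np.sqrt(var) * phi
    return ei - beta * np.maximum(noise_mu, 0.0)

# Toy usage: noisy 1-D objective, acquisition evaluated on a grid
rng = np.random.default_rng(0)
Xtr = rng.uniform(-2, 2, (8, 1))
ytr = np.sin(Xtr[:, 0]) + 0.1 * rng.standard_normal(8)
r2 = (0.1 * rng.standard_normal(8)) ** 2  # stand-in for observed squared residuals
Xte = np.linspace(-2, 2, 50)[:, None]
acq = noise_penalised_ei(Xtr, ytr, r2, Xte)
```

The next point to evaluate would be `Xte[np.argmax(acq)]`; in a noise-free setting this reduces to ordinary expected improvement as `beta` goes to zero.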
Energy–entropy competition and the effectiveness of stochastic gradient descent in machine learning
We use statistical physics to understand the effectiveness of the stochastic gradient descent algorithm. We show that the anisotropic noise in the algorithm plays a key role in facilitating convergence to generalisable basins of attraction.