Together with the participants of the Oberwolfach Seminar: Mathematics of Deep Learning, I wrote a (not entirely serious) paper called "The oracle of DLPhi", proving that Deep Learning techniques can perform accurate classifications on test data that is entirely uncorrelated with the training data. This, however, requires a couple of non-standard assumptions, such as uncountably many data points and the axiom of choice. In a sense, this shows that mathematical results on machine learning need to be approached with a bit of scepticism.
Endre Süli and I submitted a new preprint on the Gamma-convergence of a shearlet-based Ginzburg–Landau energy to the arXiv. We analyse what happens if one replaces the elasticity term of a standard Ginzburg–Landau energy by a shearlet-based energy. It turns out that, under suitable conditions, the shearlet-based energy Gamma-converges to an anisotropic perimeter functional. Moreover, the anisotropy can be controlled by introducing direction-dependent weights into the definition of the shearlet-based energy.
Felix and I submitted our preprint "Equivalence of approximation by convolutional neural networks and fully-connected networks" to the arXiv. In this note, we establish approximation-theoretic results, i.e., lower and upper bounds on the approximation fidelity in terms of the number of parameters, for convolutional neural networks. In practice, convolutional neural networks are used to a much greater extent than standard neural networks, while, traditionally, mathematical analysis mostly dealt with standard neural networks. We now show that all classical approximation results for standard neural networks imply very similar approximation results for convolutional neural networks.
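As a toy numerical illustration of one direction of this relationship (my own sketch, not a computation from the preprint): a convolutional layer with circular padding computes the same map as a fully-connected layer whose weight matrix is circulant, so convolutional layers form a structured subclass of fully-connected layers.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(8)   # 1-D input signal of length 8
k = rng.standard_normal(3)   # convolution kernel of length 3

# Circular convolution, as a convolutional layer with circular padding computes it.
conv_out = np.array([sum(k[j] * x[(i - j) % 8] for j in range(3)) for i in range(8)])

# The same linear map, written as a fully-connected layer with a circulant weight matrix.
W = np.zeros((8, 8))
for i in range(8):
    for j in range(3):
        W[i, (i - j) % 8] = k[j]
fc_out = W @ x

assert np.allclose(conv_out, fc_out)
```

The converse direction, emulating fully-connected networks by convolutional ones, is the more delicate step and is what the preprint addresses.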
I will give a mini-lecture on applied harmonic analysis at the PDE-CDT Summer School 2018 at Ripon College. The lecture notes can be found here.
Felix, Mones, and I just uploaded a preprint on topological properties of sets of functions that are representable by neural networks of fixed size. In this work, we analyse simple topological properties, such as convexity, closedness, and density, of the set of networks with a fixed architecture. Quite surprisingly, we found that the topology of this set is not particularly convenient for optimisation. Indeed, for all commonly-used activation functions, the sets of networks of fixed size are non-convex (not even weakly), nowhere dense, cannot be stably parametrised, and are not closed with respect to L^p norms. For all commonly-used activation functions except the parametric ReLU, the non-closedness extends to the uniform norm. In fact, for the parametric ReLU the associated spaces are closed with respect to the supremum norm if the architecture has only one hidden layer. When training a network, these properties can lead to many local minima of the minimisation problem, exploding coefficients, and very slow convergence.
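The non-convexity can already be seen on the smallest possible example (a sketch of my own, not a computation from the paper): ReLU(x) and ReLU(-x) are both one-neuron ReLU networks, but their midpoint |x|/2 has sup-norm distance at least 1/4 from every one-neuron network on the points {-1, 0, 1}, since any function a·ReLU(wx + b) + c is monotone. A short numpy grid search confirms this:

```python
import numpy as np

# Sample points and target: the midpoint of ReLU(x) and ReLU(-x), i.e. |x|/2.
xs = np.array([-1.0, 0.0, 1.0])
target = np.abs(xs) / 2.0

def neuron(x, a, w, b, c):
    """One-hidden-neuron ReLU network: a * ReLU(w*x + b) + c."""
    return a * np.maximum(w * x + b, 0.0) + c

# Both endpoints of the convex combination are one-neuron networks ...
lhs = 0.5 * neuron(xs, 1.0, 1.0, 0.0, 0.0) + 0.5 * neuron(xs, 1.0, -1.0, 0.0, 0.0)
assert np.allclose(lhs, target)

# ... but their midpoint is not: grid-search over all one-neuron networks
# with parameters in [-2, 2] (the grid includes the optimal 0.25 and 0.5).
grid = np.arange(-2.0, 2.001, 0.125)
a, w, b, c = np.meshgrid(grid, grid, grid, grid, indexing="ij")
vals = a[..., None] * np.maximum(w[..., None] * xs + b[..., None], 0.0) + c[..., None]
best = np.max(np.abs(vals - target), axis=-1).min()
print(best)  # 0.25: no one-neuron network gets closer in sup norm on these points
```

So the set of one-neuron ReLU networks already fails to contain a convex combination of two of its elements; the paper shows that this phenomenon persists for every fixed architecture.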
I created this webpage after I moved from TU Berlin to U Oxford.