What can a sufficiently large network with one hidden layer learn?

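It has been shown that such a network can approximate any continuous function on a closed, bounded input range to any level of accuracy, provided the hidden layer is wide enough and uses a suitable nonlinear activation! This result is known as the universal approximation theorem. It only says that such a network exists, though: the hidden layer may need to be impractically wide, and nothing guarantees that training will actually find the right weights. In practice, adding more layers often achieves good performance with far fewer units. More layers means "deep", and this is how we get deep learning!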
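
To make the approximation claim concrete, here is a minimal sketch (not a trained model) that builds a one-hidden-layer ReLU network by hand so that it matches sin(x) at evenly spaced points on [0, 2π]. The target function, the interval, and the helper names `build_one_hidden_layer_net`, `forward`, and `max_error` are assumptions chosen for illustration. The weights are computed directly rather than learned by gradient descent, so this shows only that good weights exist, not that training finds them.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def build_one_hidden_layer_net(f, a, b, n_hidden):
    """Hand-construct weights so the net interpolates f at n_hidden + 1 knots on [a, b]."""
    knots = np.linspace(a, b, n_hidden + 1)
    values = f(knots)
    slopes = np.diff(values) / np.diff(knots)  # slope of each linear segment
    w1 = np.ones(n_hidden)                     # input-to-hidden weights
    b1 = -knots[:-1]                           # hidden unit i switches on at knot i
    w2 = np.diff(slopes, prepend=0.0)          # change in slope at each knot
    b2 = values[0]                             # output bias: f at the left endpoint
    return w1, b1, w2, b2


def forward(params, x):
    """One hidden layer: output = relu(x * w1 + b1) @ w2 + b2."""
    w1, b1, w2, b2 = params
    hidden = relu(np.outer(x, w1) + b1)        # shape (len(x), n_hidden)
    return hidden @ w2 + b2


def max_error(n_hidden):
    """Largest absolute error against sin(x) on a dense grid over [0, 2*pi]."""
    x = np.linspace(0.0, 2.0 * np.pi, 2001)
    params = build_one_hidden_layer_net(np.sin, 0.0, 2.0 * np.pi, n_hidden)
    return float(np.max(np.abs(forward(params, x) - np.sin(x))))


if __name__ == "__main__":
    # Wider hidden layer -> smaller worst-case error.
    for n in (4, 16, 64, 256):
        print(n, max_error(n))
```

Running it prints the worst-case error for each width, which should shrink as the hidden layer gets wider. Note how many units a single layer needs even for a smooth one-dimensional curve; this is part of why deeper networks tend to be preferred in practice.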