**Abstract:** Deep learning is a rapidly developing area of machine learning that uses artificial neural networks to perform learning tasks. Although the mathematical description of neural networks is simple, a theoretical justification for the spectacular performance of deep learning remains elusive. Even the most basic questions remain open. For example, how many different functions can a neural network compute? Jointly with Pierre Baldi (UCI CS), we discovered a general capacity formula for all fully connected networks with the threshold activation function. The formula predicts, counterintuitively, that shallow networks have greater capacity than deep ones. So the mystery remains.
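
To make the counting question concrete, here is a minimal Python sketch for the smallest possible case: a single threshold neuron on binary inputs. It is an illustration only and not the capacity formula from the talk. The function name `count_threshold_functions`, the restriction to inputs in {0,1}^n, and the small grids of candidate weights and biases are all assumptions made for this example.

```python
from itertools import product


def count_threshold_functions(n, weights=(-1, 0, 1), biases=(-1.5, -0.5, 0.5, 1.5)):
    """Count distinct Boolean functions on {0,1}^n computed by one threshold neuron.

    Assumption: weights and bias are drawn from the small grids given above,
    so for larger n the result is only a lower bound on the true count.
    """
    inputs = list(product((0, 1), repeat=n))
    functions = set()
    for w in product(weights, repeat=n):
        for b in biases:
            # Threshold activation: output 1 when the weighted sum plus bias is positive.
            truth_table = tuple(
                int(sum(wi * xi for wi, xi in zip(w, x)) + b > 0) for x in inputs
            )
            functions.add(truth_table)
    return len(functions)


n = 2
found = count_threshold_functions(n)
total = 2 ** (2 ** n)  # number of all Boolean functions of n binary inputs
print(found, "of", total)  # prints: 14 of 16
```

For two inputs, the neuron computes 14 of the 16 Boolean functions; the two missing ones are XOR and XNOR, which are not linearly separable. Brute-force enumeration of this kind becomes infeasible almost immediately as networks grow, which is why a general formula is of interest.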