Description
Deep neural nets with a large number of parameters are very powerful machine learning
systems. However, overfitting is a serious problem in such networks. Large networks are also
slow to use, making it difficult to deal with overfitting by combining the predictions of many
different large neural nets at test time. Dropout is a technique for addressing this problem.
The key idea is to randomly drop units (along with their connections) from the neural
network during training. This prevents units from co-adapting too much. During training,
dropout samples from an exponential number of different "thinned" networks. At test time,
it is easy to approximate the effect of averaging the predictions of all these thinned networks
by simply using a single unthinned network that has smaller weights. This significantly
reduces overfitting and gives major improvements over other regularization methods. We
show that dropout improves the performance of neural networks on supervised learning
tasks in vision, speech recognition, document classification and computational biology,
obtaining state-of-the-art results on many benchmark data sets.
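To make the training-time and test-time behaviour described above concrete, here is a minimal sketch in Python with NumPy (my own illustration; it is not code from the talk or the paper). The function name dropout_forward and the parameter p_keep are assumptions chosen for clarity: during training each unit is kept with probability p_keep, and at test time the full ("unthinned") network is used with its activations scaled by p_keep, approximating the average over the thinned networks.

import numpy as np

def dropout_forward(activations, p_keep, train=True, rng=None):
    # Illustrative sketch of dropout, not an official implementation.
    if train:
        rng = rng if rng is not None else np.random.default_rng()
        # Sample a binary mask: each unit survives with probability p_keep,
        # which randomly "thins" the network for this training pass.
        mask = rng.random(activations.shape) < p_keep
        return activations * mask
    # Test time: keep every unit but scale activations by p_keep, so the
    # expected output matches training and approximates averaging over the
    # exponentially many thinned networks.
    return activations * p_keep

Equivalently, the scaling can be folded into the weights once after training (multiplying outgoing weights by p_keep), which is the "single unthinned network with smaller weights" view mentioned above.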