21–24 Feb 2018
Bonn
Europe/Zurich timezone

Distilling the Knowledge in a Neural Network

Not scheduled
15m
50 (Bonn)

Machine Learning

Speakers

Mr Oriol Vinyals (Google), Mr Geoffrey Hinton (Google)

Description

A very simple way to improve the performance of almost any machine learning algorithm is to train many different models on the same data and then to average their predictions [3]. Unfortunately, making predictions using a whole ensemble of models is cumbersome and may be too computationally expensive to allow deployment to a large number of users, especially if the individual models are large neural nets. Caruana and his collaborators [1] have shown that it is possible to compress the knowledge in an ensemble into a single model which is much easier to deploy, and we develop this approach further using a different compression technique. We achieve some surprising results on MNIST and we show that we can significantly improve the acoustic model of a heavily used commercial system by distilling the knowledge in an ensemble of models into a single model.
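The compression described above trains a small "student" model to match the softened output distribution of the larger teacher (e.g. an ensemble). Below is a minimal sketch of such a distillation loss in PyTorch, not the authors' code: the function name, the temperature T, and the mixing weight alpha are illustrative assumptions.

```python
# Minimal sketch of a distillation loss: the student is trained to match the
# teacher's class probabilities softened by a temperature T, mixed with the
# usual cross-entropy on the hard labels. Names and hyperparameters are
# illustrative, not taken from the paper's experiments.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Softened probability distributions at temperature T.
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_soft_student = F.log_softmax(student_logits / T, dim=-1)
    # KL divergence between teacher and student soft targets; the T**2 factor
    # keeps the gradient scale comparable as T changes.
    soft_loss = F.kl_div(log_soft_student, soft_teacher,
                         reduction="batchmean") * (T ** 2)
    # Standard cross-entropy on the true (hard) labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```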

Authors

Mr Oriol Vinyals (Google), Mr Geoffrey Hinton (Google)

Presentation materials

There are no materials yet.