21–24 Feb 2018
Bonn
Europe/Zurich timezone


Large-scale Video Classification with Convolutional Neural Networks

Not scheduled
15m
50 (Bonn)

Machine Learning

Speaker

Mr Andrej Karpathy (Google)

Description

Convolutional Neural Networks (CNNs) have been established as a powerful class of models for image recognition problems. Encouraged by these results, we provide an extensive empirical evaluation of CNNs on large-scale video classification using a new dataset of 1 million YouTube videos belonging to 487 classes. We study multiple approaches for extending the connectivity of a CNN in the time domain to take advantage of local spatio-temporal information, and suggest a multiresolution, foveated architecture as a promising way of speeding up training. Our best spatio-temporal networks display significant performance improvements compared to strong feature-based baselines (55.3% to 63.9%), but only a surprisingly modest improvement compared to single-frame models (59.3% to 60.9%). We further study the generalization performance of our best model by retraining the top layers on the UCF-101 Action Recognition dataset and observe significant performance improvements compared to the UCF-101 baseline model (63.3%, up from 43.9%).
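The abstract does not spell out how the multiresolution, foveated input is built. The sketch below shows one plausible reading, under stated assumptions: each frame is split into a low-resolution "context" stream covering the whole frame and a full-resolution "fovea" stream cropped from the centre. The function names, the 2x2 average downsampling, and the use of grayscale frames stored as nested lists are all illustrative choices, not details taken from the talk.

```python
def center_crop(frame, out_h, out_w):
    """Return the central out_h x out_w region of a 2-D frame (list of rows)."""
    h, w = len(frame), len(frame[0])
    top, left = (h - out_h) // 2, (w - out_w) // 2
    return [row[left:left + out_w] for row in frame[top:top + out_h]]


def downsample_2x(frame):
    """Halve the resolution by averaging each non-overlapping 2x2 block.

    A trailing odd row or column is dropped.
    """
    h, w = len(frame) // 2 * 2, len(frame[0]) // 2 * 2
    return [
        [
            (frame[y][x] + frame[y][x + 1]
             + frame[y + 1][x] + frame[y + 1][x + 1]) / 4.0
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def foveated_streams(frame):
    """Split one frame into (context, fovea) inputs of equal size.

    Assumption: context is the whole frame at half resolution, and fovea
    is the centre region at full resolution.
    """
    h, w = len(frame), len(frame[0])
    context = downsample_2x(frame)
    fovea = center_crop(frame, h // 2, w // 2)
    return context, fovea


if __name__ == "__main__":
    # Hypothetical 4x4 frame with pixel values 0..15.
    frame = [[4 * r + c for c in range(4)] for r in range(4)]
    context, fovea = foveated_streams(frame)
    print(context)
    print(fovea)
```

In this sketch each stream holds a quarter of the original pixels, so the two streams together carry half the input of the full-resolution frame. That reduction is one way such a design could speed up training.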

Author

Mr Andrej Karpathy (Google)