On-Device Machine Intelligence with Neural Projections

Deep neural networks and other machine learning models have been transformative for building intelligent systems capable of visual recognition, speech, and language understanding. While recent advances have led to progress for machine intelligence applications running in the cloud, it is often infeasible to use typical machine learning models on devices like mobile phones or smart watches because of computation and memory constraints: the models are too large to fit into the limited memory available on such devices. While these devices could make use of models running in high-performance data centers with CPUs or GPUs, this is not feasible for many applications and scenarios where inference needs to be performed directly “on” device. This requires re-thinking existing machine learning algorithms and coming up with new models that are directly optimized for on-device machine intelligence rather than relying on post-hoc model compression.

In this talk, I will introduce a novel “projection-based” machine learning system for training compact neural networks. The approach uses a joint optimization framework to simultaneously train a “full” deep network, such as a feed-forward or recursive neural network, and a lightweight “projection” network. Unlike the full deep network, the projection network uses random projection operations that are efficient to compute and operates in bit space, yielding a low memory footprint. The system is trained end-to-end using backpropagation. We show that the approach is flexible and easily extensible to other machine learning paradigms; for example, we learn graph-based projection models using label propagation.

The trained “projection” models are used directly for inference and achieve significant reductions in model size and gains in efficiency on several visual and language tasks while providing competitive performance. We have used these networks to power machine intelligence applications on devices such as mobile phones and smart watches, for example, a fully on-device Smart Reply model that runs on Android smart watches.
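
The abstract does not spell out the projection operation itself. As a rough illustration only, the sketch below assumes a sign-based random projection in the style of locality-sensitive hashing: each fixed random hyperplane contributes one bit, so an input vector is reduced to a short bit string. The function names, the 16-bit width, and the example input are hypothetical and are not taken from the talk.

```python
import random


def make_projections(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes of dimension dim.

    Assumption: the hyperplanes are fixed by the seed and are not trained,
    so they can be regenerated instead of stored.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def project_to_bits(x, planes):
    """Map a float vector to bits: one sign bit per hyperplane."""
    bits = []
    for plane in planes:
        dot = sum(w * v for w, v in zip(plane, x))
        bits.append(1 if dot >= 0.0 else 0)
    return bits


def pack_bits(bits):
    """Pack a list of 0/1 values into a single integer, first bit most significant."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


# Hypothetical 8-dimensional input reduced to a 16-bit code.
planes = make_projections(num_bits=16, dim=8, seed=42)
x = [0.5, -1.2, 0.3, 0.0, 2.1, -0.7, 0.9, 1.4]
bits = project_to_bits(x, planes)
code = pack_bits(bits)
```

Under this assumption, the representation of an input costs only as many bits as there are hyperplanes, which is one way to read the "bit space" and "low memory footprint" claims above. How the bits feed into the trained layers of the projection network is not covered by this sketch.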

Date:
Speakers:
Sujith Ravi
Affiliation:
Google