Efficient Learning from Diverse Sources of Information

Although machine learning has witnessed rapid progress in the last decade, many current learning algorithms remain inefficient in terms of the amount of data they use and the time needed to train a model. Humans, on the other hand, excel at many learning tasks with very limited data. Why are machines so inefficient, and why can humans learn so well? The key to the answer lies in the fact that humans can learn from diverse sources of information and can apply past knowledge in new domains. In this talk, I will study learning from diverse sources of information to make ML algorithms more efficient. In the first part, I will discuss how to incorporate diverse forms of questions into the learning process. In particular, I will look at the problem of utilizing preference information for learning a regression function and show an interesting connection to nearest neighbors and isotonic regression. In the second part, I will talk about multitask and transfer learning from different domains for natural language understanding. I will explain a sample-reweighting scheme that uses language models to automatically weight external-domain samples according to how much they help the target task.
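As background for the first part of the talk, here is a minimal sketch of isotonic regression via the classic pool-adjacent-violators (PAV) algorithm: given a sequence of noisy values, it finds the nondecreasing sequence closest to them in least squares. This is a generic illustration of the technique the abstract names, not the specific method presented in the talk; the function name is illustrative.

```python
def pav(y):
    """Pool Adjacent Violators: least-squares fit of a nondecreasing
    sequence to the values in y. Illustrative helper, not from the talk."""
    # Maintain a stack of blocks, each stored as [sum, count];
    # a block's fitted value is its mean sum/count.
    blocks = []
    for v in y:
        blocks.append([v, 1])
        # Merge adjacent blocks while they violate monotonicity,
        # i.e. the previous block's mean exceeds the current one's.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    # Expand each block back to its constituent points.
    out = []
    for s, c in blocks:
        out.extend([s / c] * c)
    return out
```

For example, `pav([3, 1, 2])` returns `[2.0, 2.0, 2.0]`: the first two values violate monotonicity and are pooled, and the resulting mean is then pooled with the third.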

[Slides]

Date:
Speaker:
Matthew Hausknecht, Yichong Xu
Affiliation:
Microsoft Research, Carnegie Mellon University