vast amount of unlabelled data
• We need more labels but cannot label everything (budget constraints, etc.)
• How about labelling only the informative data points?
• Basic Concepts • Active Learning? • Bayesian Neural Networks? • Siddhant & Lipton, 2018 • Learning Resources
[Image]: DataCamp (Machine learning context)
increase training data (but in a different way)
• Choose an informative data point (not yet labelled), then ask an oracle (e.g., a human annotator) for its label
• Add the newly labelled data point to the training data
[Image]: DataCamp (Machine learning context)
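The choose, ask, and add steps above can be sketched as a pool-based loop. This is a minimal illustration under stated assumptions, not a reference implementation: `active_learning_loop`, `fit_threshold`, `closeness_to_boundary`, and `human_annotator` are hypothetical names, and the "model" is a toy 1-D threshold rather than a neural network.

```python
from typing import Callable, List, Tuple

Example = Tuple[float, int]  # (feature, label); a toy 1-D setting


def active_learning_loop(
    labelled: List[Example],
    pool: List[float],
    fit: Callable[[List[Example]], float],
    score: Callable[[float, float], float],
    oracle: Callable[[float], int],
    budget: int,
) -> Tuple[float, List[Example], List[float]]:
    """Pool-based loop: label the highest-scoring pool point, then retrain."""
    labelled, pool = list(labelled), list(pool)  # do not mutate the inputs
    model = fit(labelled)
    for _ in range(min(budget, len(pool))):
        # Choose the unlabelled point the acquisition score ranks highest.
        best = max(range(len(pool)), key=lambda i: score(model, pool[i]))
        x = pool.pop(best)
        labelled.append((x, oracle(x)))  # ask the oracle for its label
        model = fit(labelled)            # retrain on the enlarged training set
    return model, labelled, pool


# Hypothetical toy pieces, for illustration only.
def fit_threshold(data: List[Example]) -> float:
    """'Model' = midpoint between the largest class-0 and smallest class-1 point."""
    lo = max(x for x, y in data if y == 0)
    hi = min(x for x, y in data if y == 1)
    return (lo + hi) / 2


def closeness_to_boundary(threshold: float, x: float) -> float:
    """Higher score for points nearer the decision threshold."""
    return -abs(x - threshold)


def human_annotator(x: float) -> int:
    """Stand-in oracle that knows the true rule (label 1 from 0.5 upwards)."""
    return int(x >= 0.5)


model, labelled, remaining = active_learning_loop(
    [(0.0, 0), (1.0, 1)], [0.1, 0.45, 0.9, 0.6],
    fit_threshold, closeness_to_boundary, human_annotator, budget=2,
)
```

With a budget of two labels, the loop spends both on the points nearest the current threshold (0.45, then 0.6) and leaves 0.1 and 0.9 unlabelled.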
to choose informative data points from unlabelled data
• E.g., if an ML model outputs a probability for each prediction, one way is to choose the data points whose predicted label has the lowest probability, i.e., the most uncertain ones (the Least Confidence method)
• Such methods are called Acquisition Functions
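As a rough sketch of the Least Confidence idea above: score each unlabelled point by one minus the probability of its most likely class, then pick the highest scores. The function names and the example probabilities are assumptions made for illustration.

```python
from typing import List, Sequence


def least_confidence(probs: Sequence[float]) -> float:
    """Return 1 - max class probability; larger means less confident."""
    return 1.0 - max(probs)


def select_least_confident(pool_probs: List[Sequence[float]], k: int = 1) -> List[int]:
    """Return indices of the k pool points the model is least confident about."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: least_confidence(pool_probs[i]),
        reverse=True,  # most uncertain first
    )
    return ranked[:k]


# Hypothetical predicted class probabilities for three unlabelled points.
pool_probs = [[0.9, 0.1], [0.55, 0.45], [0.7, 0.3]]
chosen = select_least_confident(pool_probs, k=2)
```

Here the second and third points are chosen, because their top-class probabilities (0.55 and 0.7) are the lowest in the pool.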
weights as point estimates
• Bayesian NNs learn the distribution of the weights
[Image]: Blundell et al. (2015)
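To make the contrast between a point estimate and a weight distribution concrete, here is a minimal sketch for a single weight. It assumes a Gaussian over the weight, parameterised as in Bayes by Backprop (Blundell et al., 2015) by a mean `mu` and a standard deviation `softplus(rho)`; the function names and the one-weight "network" `w * x` are illustrative assumptions only.

```python
import math
import random
from typing import List


def softplus(x: float) -> float:
    """log(1 + exp(x)); keeps the standard deviation positive."""
    return math.log1p(math.exp(x))


def sample_weight(mu: float, rho: float, rng: random.Random) -> float:
    """Draw w = mu + softplus(rho) * eps with eps ~ N(0, 1)."""
    return mu + softplus(rho) * rng.gauss(0.0, 1.0)


def predictive_samples(x: float, mu: float, rho: float, n: int, seed: int = 0) -> List[float]:
    """Outputs of a one-weight 'network' w * x under n sampled weights."""
    rng = random.Random(seed)
    return [sample_weight(mu, rho, rng) * x for _ in range(n)]


# Almost no weight uncertainty: behaves like a point estimate, w ~= 1.0.
sharp = predictive_samples(x=2.0, mu=1.0, rho=-30.0, n=5)

# Noticeable weight uncertainty: the predictions spread around 2.0.
spread = predictive_samples(x=2.0, mu=1.0, rho=0.0, n=2000)
mean_prediction = sum(spread) / len(spread)
```

Averaging the sampled predictions gives the model-averaged output, while their spread reflects the uncertainty carried by the weight distribution.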
to model uncertainty in the prediction (Blundell et al., 2015)
1. regularisation via a compression cost on the weights,
2. richer representations and predictions from cheap model averaging, and
3. exploration in simple reinforcement learning problems such as contextual bandits.
• Another way to avoid over-fitting, alongside early stopping, weight decay, dropout, etc.
aleatoric but not epistemic uncertainty (Kendall and Gal, 2017)
  ‣ Least confidence is just a heuristic
  ‣ Bayesian acquisition functions can make use of epistemic (model) uncertainty
• Now, we also have a Bayesian framework for deep neural networks
• Bayesian active learning + Bayesian neural networks
dataset with an arbitrary architecture, without peeking at the labels to perform hyperparameter tuning?’
Houlsby et al. (2011)
• Bayesian Deep Learning
  ‣ Monte Carlo Dropout (MC Dropout): Gal and Ghahramani (2016)
  ‣ Bayes by Backprop: Blundell et al. (2015)
• BALD + MC Dropout vs. BALD + Bayes by Backprop
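A minimal sketch of how a BALD score might be computed from stochastic forward passes, such as T passes with dropout left on (MC Dropout) or T weight samples (Bayes by Backprop). BALD scores an input by the mutual information between its prediction and the model parameters, approximated here as the entropy of the averaged prediction minus the average entropy of the individual predictions. The function names and the example probabilities are assumptions for illustration, not taken from any of the cited papers' code.

```python
import math
from typing import List, Sequence


def entropy(p: Sequence[float]) -> float:
    """Shannon entropy in nats; zero-probability classes contribute nothing."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def bald_score(mc_probs: List[Sequence[float]]) -> float:
    """mc_probs holds T class-probability vectors for ONE input,
    one per stochastic forward pass."""
    t = len(mc_probs)
    n_classes = len(mc_probs[0])
    mean = [sum(p[c] for p in mc_probs) / t for c in range(n_classes)]
    expected_entropy = sum(entropy(p) for p in mc_probs) / t
    return entropy(mean) - expected_entropy


# Every pass says "50/50": the data are ambiguous, but the passes agree.
agree = bald_score([[0.5, 0.5], [0.5, 0.5]])

# Passes are individually confident but contradict each other.
disagree = bald_score([[1.0, 0.0], [0.0, 1.0]])
```

Both inputs have the same averaged prediction (0.5, 0.5), so Least Confidence cannot tell them apart. The BALD score is zero for the first and log 2 for the second, which singles out the input on which the sampled models disagree.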