Bayesian inference and big data: are we there yet? by Jose Luis Hidalgo at Big Data Spain 2017

Bayesian statistics and big data: are we there yet?
Jose Luis Hidalgo
Big Data Spain 2017

Clarification of some concepts
Bayesian
- "Bayes rule"
- "Bayesian statistics" (vs. frequentist statistics)
- "Reverse probability", Fisher definition
- "Bayesian models"!
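For reference, Bayes' rule can be written as below. The symbols are notation chosen here, not taken from the slides: \theta stands for the model parameters and D for the observed data.

```latex
p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)},
\qquad
p(D) = \int p(D \mid \theta)\, p(\theta)\, d\theta
```

Read left to right: the posterior equals the likelihood times the prior, divided by the evidence. The evidence integral is the part that is rarely available in closed form.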

Clarification of some concepts
Inference
- In classic statistics: "inferential" vs "descriptive"
- In machine learning: "inference" vs "training"
- In Bayesian statistics: estimation of parameters from data
- … to make predictions
- … to validate the model
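A minimal sketch of inference in the Bayesian sense, estimating a parameter from data and then using it to predict. The model (a coin with unknown bias and a uniform Beta(1, 1) prior), the data (7 successes, 3 failures) and the function names are all hypothetical and chosen only for illustration; they do not come from the talk.

```python
from fractions import Fraction


def beta_binomial_update(alpha, beta, successes, failures):
    """Return the Beta posterior parameters after observing Bernoulli outcomes.

    With a Beta(alpha, beta) prior on the success probability, the posterior
    is Beta(alpha + successes, beta + failures).
    """
    return alpha + successes, beta + failures


def posterior_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution, kept exact with Fraction."""
    return Fraction(alpha, alpha + beta)


# Hypothetical data: 7 successes and 3 failures, uniform Beta(1, 1) prior.
alpha_post, beta_post = beta_binomial_update(1, 1, successes=7, failures=3)

# For this model the predictive probability that the next trial succeeds
# equals the posterior mean.
p_next = posterior_mean(alpha_post, beta_post)

print(alpha_post, beta_post, p_next)  # 8 4 2/3
```

This case is tractable only because the Beta prior is conjugate to the Bernoulli likelihood, so the posterior has a closed form.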

Clarification of some concepts
Big Data
- As many definitions as there are vendors interested in selling you something!
- Incremental vs. something new
- In our case: "big data" as the fact that we use increasingly large amounts of data to get to some information/insight (we manage to extract weaker signals from oceans of noise)

A bit of history
Early Bayesian models
- Treated analytically
- Limited to what can be treated analytically... duh!
Nineties: MCMC
- Offers (the promise of) generic inference algorithms
- Very hard and computationally expensive
- Variational inference as an (even harder) alternative
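To make the idea of a generic inference algorithm concrete, here is a minimal random-walk Metropolis sampler, one of the simplest MCMC methods. It is an illustrative sketch, not the method used in the talk: the target (posterior of a coin bias after a hypothetical 7 successes and 3 failures, with a uniform prior), the step size, the burn-in length and the function names are all assumptions.

```python
import math
import random


def log_posterior(theta, successes=7, failures=3):
    """Unnormalized log posterior of a coin bias under a uniform prior."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero probability outside the unit interval
    return successes * math.log(theta) + failures * math.log(1.0 - theta)


def metropolis(log_target, start, n_samples, step=0.2, seed=0):
    """Random-walk Metropolis: needs only an unnormalized log density."""
    rng = random.Random(seed)
    current = start
    current_lp = log_target(current)
    samples = []
    for _ in range(n_samples):
        # Symmetric Gaussian proposal centred on the current state.
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


samples = metropolis(log_posterior, start=0.5, n_samples=20000)
kept = samples[2000:]  # discard an (assumed) burn-in period
estimate = sum(kept) / len(kept)
print(f"posterior mean estimate: {estimate:.3f}")
```

The sampler never uses the normalizing constant, which is why the same loop works for models with no analytic solution. For this toy target the exact posterior is Beta(8, 4) with mean 2/3, so the estimate can be checked against it.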

A bit of history
Oughties: Probabilistic programming
- Standard ways to explain probabilistic models to a computer
- Bayesian models are a subset of probabilistic models
- JAGS, BUGS, Stan...
- Further developments: HMC, Gibbs sampling...
- Becomes quite popular in academic circles

A bit of history
Tens: “Practical” Probabilistic programming
- Further advances in inference: NUTS, ADVI...
- New technologies to speed up computations
  - GPU parallelization
  - Automatic differentiation
- "Tall" datasets (very large number of cases)
- "Wide" datasets (very large number of features)

Some sample applications
From cognitive science
- Exactly the opposite of what our NN friends are trying to do!
- Models of human memory, of language understanding, etc.
- Bayesian models are very well suited for this kind of study
From fin-tech
- Large copula models become tractable using (Bayesian) inference algorithms

Some sample applications
From AI
- Generative image recognition systems
From business operations
- The inventory information problem
- Probabilistic model of inventory
- Enables operational optimization

Conclusions
If you are a data science practitioner
- Familiarize yourself with these kinds of models
- Learn about tools and libraries: Stan, PyMC3, Edward, etc.
If you are responsible for technical infrastructure
- Leveraging big data will require big compute...
- ... and not only for neural networks!
If you are responsible for a business
- Ask for more
- Then ask again!

Thank you!
