Data Science 101:
insight, not numbers
Ronojoy Adhikari
The Institute of Mathematical Sciences
Chennai, India
Orangescape
Chennai, India
Wednesday, 30 September 15
The purpose of computing is insight, not numbers.
Richard Hamming
What is the purpose of data science?
Insight, not numbers!
Data science
Data
Domain knowledge
Data curation
Mathematical model
A/B testing
Machine learning
Machine inference
Value from data
1. Problem or question?
Let the data speak for themselves!
Ronald Fisher
The data cannot speak for themselves; and they never have, in any real problem of inference.
Edwin Jaynes
Classification: predict class, given attributes
Regression: predict values, given other values
Clustering: group similar things together
Dimensionality reduction: keeping only the relevant variables
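
As a rough illustration of the four task types listed above, here is a minimal Python sketch using only the standard library. The toy data, the function names, and the specific methods (nearest class mean, least-squares line, two-cluster k-means, dropping constant columns) are assumptions chosen for brevity; they are not taken from the talk, and each is only one of many ways to approach its task.

```python
from statistics import mean, pvariance

# Hypothetical toy data: (height_cm, weight_kg, constant_flag) per person.
rows = [
    (150.0, 50.0, 1.0),
    (152.0, 52.0, 1.0),
    (180.0, 80.0, 1.0),
    (182.0, 82.0, 1.0),
]
labels = ["short", "short", "tall", "tall"]


def classify(height):
    """Classification: predict a class from an attribute (nearest class mean)."""
    centroids = {
        c: mean(r[0] for r, l in zip(rows, labels) if l == c)
        for c in set(labels)
    }
    return min(centroids, key=lambda c: abs(centroids[c] - height))


def fit_line(xs, ys):
    """Regression: least-squares slope and intercept for y = a*x + b."""
    mx, my = mean(xs), mean(ys)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return a, my - a * mx


def two_means(values, iterations=10):
    """Clustering: 1-D k-means with k=2, started from the min and max.

    Assumes the values are not all identical.
    """
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        near_lo = [v for v in values if abs(v - lo) <= abs(v - hi)]
        near_hi = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo, hi = mean(near_lo), mean(near_hi)
    return sorted(near_lo), sorted(near_hi)


def informative_columns(data, min_variance=1e-9):
    """Dimensionality reduction (crudest form): drop columns that never vary."""
    columns = list(zip(*data))
    return [i for i, col in enumerate(columns) if pvariance(col) > min_variance]
```

In this sketch, `classify(155.0)` picks the class whose mean height is closest, `fit_line` recovers the line relating weight to height, `two_means` splits the heights into two groups without using the labels, and `informative_columns(rows)` discards the third column because it carries no variation.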
3. Frame a hypothesis
(mathematical models)
Bayesian: probability is a state of knowledge
Frequentist: probability is a frequency
Blackbox: ML is a toolbox for processing data
Causal: ML is learning generative models of data
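
To make the contrast between the Bayesian and frequentist readings of probability concrete, here is a small sketch for estimating the bias of a coin. It is an illustration under stated assumptions, not the speaker's method: the function names are hypothetical, and the Bayesian estimate assumes a uniform Beta(1, 1) prior, for which the posterior mean after `heads` successes in `flips` trials is (heads + 1) / (flips + 2).

```python
from fractions import Fraction


def frequentist_estimate(heads, flips):
    """Probability as a frequency: the observed fraction of heads."""
    return Fraction(heads, flips)


def bayesian_posterior_mean(heads, flips, prior_a=1, prior_b=1):
    """Probability as a state of knowledge: mean of the Beta posterior.

    Assumes a Beta(prior_a, prior_b) prior; the default is uniform.
    """
    return Fraction(heads + prior_a, flips + prior_a + prior_b)


# Three heads in three flips.
print(frequentist_estimate(3, 3))     # 1
print(bayesian_posterior_mean(3, 3))  # 4/5
```

With only three flips, the frequency estimate says the coin always lands heads, while the posterior mean stays short of certainty because the prior still carries weight. As the number of flips grows, the two estimates converge.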
We are building a causal learning and inference engine that will beat the current state of the art!
Thank you for your attention!