Galaxy Zoo 2: detailed morphological classifications for
304,122 galaxies from the Sloan Digital Sky Survey
K. Willett et al 2013
• ~300e3 galaxies (m_r < 17)
• ~16e6 classifications
• ~50 classifications / galaxy
100s of 1000s of volunteers!
Slide 7
Slide 7 text
Classification Decision Tree
37 different outcomes
Slide 8
Slide 8 text
galaxy image (3 x 96 x 96 = 27648 numbers) → (n_1, ..., n_37)^T (37 numbers)
People (even “pro astronomers”) disagree, so
there is no single classification per galaxy.
The distribution of votes is the “classification”.
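As a minimal sketch of what "the distribution of votes" means, the snippet below turns raw volunteer answers for a single question into vote fractions. The answer labels and the vote counts are illustrative assumptions, not data from the survey, and the real 37 numbers span the whole decision tree rather than one question.

```python
from collections import Counter


def vote_fractions(votes, answers):
    """Return the fraction of votes given to each possible answer."""
    if not votes:
        raise ValueError("need at least one vote")
    counts = Counter(votes)
    total = len(votes)
    return {answer: counts[answer] / total for answer in answers}


# Hypothetical answers and votes for one galaxy and one question.
answers = ["smooth", "features or disk", "star or artifact"]
votes = ["smooth"] * 30 + ["features or disk"] * 18 + ["star or artifact"] * 2

fractions = vote_fractions(votes, answers)
for answer in answers:
    print(f"{answer}: {fractions[answer]:.2f}")
```

With ~50 classifications per galaxy, each fraction is resolved only to about 0.02, so the target is inherently noisy.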
Slide 9
Slide 9 text
galaxy image (3 x 96 x 96 = 27648 numbers) → (n_1, ..., n_37)^T (37 numbers)
Thanks for the labels, volunteers!
Now time for some
supervised machine learning.
some algorithm
Slide 10
Slide 10 text
kaggle
a website for machine learning competitions
Slide 11
Slide 11 text
galaxy image (3 x 96 x 96 = 27648 numbers) → (n_1, ..., n_37)^T (37 numbers)
What algorithm?
some algorithm
Slide 12
Slide 12 text
galaxy image (3 x 96 x 96 = 27648 numbers) → (n_1, ..., n_37)^T (37 numbers)
An early try: PCA + RF
PCA → Random Forest Regressor
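A PCA + random forest pipeline of this kind could be sketched with scikit-learn as below. This is not the code behind the slides: the data are random stand-ins, the images are shrunk to 3 x 16 x 16 so the example runs in seconds, and the number of components and trees are arbitrary assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Stand-in data: flattened "images" and 37 vote fractions per galaxy.
n_galaxies, n_pixels, n_outputs = 200, 3 * 16 * 16, 37
X = rng.random((n_galaxies, n_pixels))
Y = rng.random((n_galaxies, n_outputs))

# PCA compresses each image to a few components; the forest then
# regresses all 37 outputs at once (multi-output regression).
model = make_pipeline(
    PCA(n_components=20, random_state=0),
    RandomForestRegressor(n_estimators=20, random_state=0),
)
model.fit(X[:150], Y[:150])

pred = model.predict(X[150:])
print(pred.shape)  # (50, 37)
```

Because a forest averages training targets, its predictions stay inside the range of the training labels, which suits targets that are fractions between 0 and 1.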
Slide 13
Slide 13 text
No content
Slide 14
Slide 14 text
PCA + RF:
Quick to run, good performance.
Slide 15
Slide 15 text
galaxy image (3 x 96 x 96 = 27648 numbers) → (n_1, ..., n_37)^T (37 numbers)
What algorithm?
some algorithm
Slide 16
Slide 16 text
“Analyzing complex image data?
Use a deep, convolutional neural network.”
etc.
lots of buzz over the past 2 years
Slide 17
Slide 17 text
What can they do?
classification
localization
…building block for machine vision pipelines
Russakovsky et al 2014
Karpathy et al 2014
Slide 18
Slide 18 text
What is a
deep convolutional neural net?
Slide 19
Slide 19 text
What is a
deep convolutional neural net?
a (non-linear) function from R^m to R^n
Slide 20
Slide 20 text
What is a
deep convolutional neural net?
a (non-linear) function from R^m to R^n
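To make "a non-linear function from R^m to R^n" concrete, here is a minimal NumPy sketch with m = 3 x 96 x 96 and n = 37. The single hidden layer, its width of 64, and the random weights are assumptions for illustration only; a real network would be deeper, convolutional, and trained.

```python
import numpy as np

rng = np.random.default_rng(0)
m, hidden, n = 3 * 96 * 96, 64, 37  # hidden width is an arbitrary choice

# Random (untrained) weights, scaled to keep activations of order one.
W1 = rng.standard_normal((hidden, m)) / np.sqrt(m)
b1 = np.zeros(hidden)
W2 = rng.standard_normal((n, hidden)) / np.sqrt(hidden)
b2 = np.zeros(n)


def f(x):
    """Map a flattened image in R^m to 37 numbers in R^n."""
    h = np.maximum(0.0, W1 @ x + b1)  # ReLU makes the map non-linear
    return W2 @ h + b2


x = rng.standard_normal(m)
y = rng.standard_normal(m)

out = f(x)
print(out.shape)  # (37,)

# A linear map would satisfy f(x + y) == f(x) + f(y); this one does not.
is_additive = np.allclose(f(x + y), f(x) + f(y))
print(is_additive)
```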
Slide 21
Slide 21 text
What is a
deep convolutional neural net?
Self-learn a set of multi-channel convolutional kernels.
(tiles, filter bank)
Take advantage of approximate translational invariance.
Fewer parameters to learn.
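The "fewer parameters" point can be checked with simple arithmetic. The sketch below compares one fully connected layer with one convolutional layer on a 3 x 96 x 96 input; the layer sizes (1000 dense units, 32 kernels of 5 x 5) are assumed for illustration.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv_params(in_channels, kernel_size, n_kernels):
    """Weights plus biases of a conv layer with square kernels.

    Each kernel spans all input channels and is reused at every
    image position, so the count does not depend on image size.
    """
    return n_kernels * (in_channels * kernel_size * kernel_size + 1)


n_in = 3 * 96 * 96  # 27648 input numbers per image

dense = dense_params(n_in, 1000)  # hypothetical 1000-unit dense layer
conv = conv_params(3, 5, 32)      # hypothetical 32 kernels of size 5 x 5

print(dense)  # 27649000
print(conv)   # 2432
```

Sharing one small kernel across all positions is what encodes the translational assumption: the same feature detector is applied everywhere in the image.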
Slide 22
Slide 22 text
What is a
deep convolutional neural net?
“Lots” of layers
hierarchical / compositional
Slide 23
Slide 23 text
A Hierarchy of Features
from “Visualizing and Understanding Convolutional Networks”, Zeiler & Fergus 2013
Slide 24
Slide 24 text
Why now?
Slide 25
Slide 25 text
Cons
Lots of hyper-parameters
• number of layers
• width of layers
• size of conv. kernels
• pooling stride
• choice of non-linear fn’s.
• dropout
• learning rates
• …
Convnets might be overkill for a domain as “limited” as astronomical images.
Computationally expensive to train (GPU-days or CPU-weeks)
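A rough sketch of why many hyper-parameters and expensive training compound each other: even a small grid over the knobs listed on this slide yields hundreds of configurations. All candidate values below, and the cost of one GPU-day per training run, are hypothetical.

```python
from itertools import product

# Hypothetical candidate values for each hyper-parameter.
grid = {
    "n_layers": [4, 6, 8],
    "layer_width": [32, 64, 128],
    "kernel_size": [3, 5, 7],
    "pooling_stride": [2, 3],
    "nonlinearity": ["relu", "tanh"],
    "dropout": [0.0, 0.25, 0.5],
    "learning_rate": [1e-2, 1e-3, 1e-4],
}

# Every combination of one value per hyper-parameter.
configs = [dict(zip(grid, values)) for values in product(*grid.values())]
n_configs = len(configs)

gpu_days_per_run = 1  # assumed cost of training one configuration
total_gpu_days = n_configs * gpu_days_per_run

print(n_configs)       # 972
print(total_gpu_days)  # 972
```

An exhaustive search is therefore rarely practical; in this toy setting even sampling a few dozen configurations costs weeks of GPU time.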
Slide 26
Slide 26 text
results
…300+ teams
Nearly all the top-ranked entries, including the winner, used convolutional neural nets.
Astro. Applications?
• objective (non-human) morphology
• finding strong gravitational lenses
• image-based redshifts (as opposed to photo-z)
• spectrograms or other 2D data
• extending any of the above to LSST-like scale
• ?
Slide 29
Slide 29 text
“Smooth Galaxies” movie!
1000 nearby spirals
15 frames (galaxies) per second
Subsequent frames chosen to be close in
morphology space and image space.
http://vimeo.com/86254924