The Unconventional Introduction to Deep Learning @ PyData Florence

Deep Learning in Python: The Unconventional Introduction
Valerio Maggio (@leriomaggio)
Data Scientist and Researcher, Fondazione Bruno Kessler (FBK), Trento, Italy

Hint:  Doctor Who?

Quite Common Deep Learning Intro

Neural Network Introduction
A multi-layer feed-forward neural network starts with a fully connected input layer, which is followed by multiple hidden layers of non-linear transformations.

Neural Networks Machinery

Neural Networks Machinery: ReLU | sigmoid | tanh

Neural Networks Machinery Repeat for each layer…

Neural Networks Machinery Image Classification Task

Neural Networks Machinery: Summary
A Neural Network is:
• Built from layers, each of which is:
  • a matrix multiplication,
  • then add bias,
  • then apply non-linearity
• Learn values for the parameters W and b (for each layer, using Back-Propagation)
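The summary above can be sketched in a few lines of NumPy. This is only an illustration of the forward pass, not the code used in the talk: the function names `dense_forward` and `relu`, the layer sizes, and the random inputs are all assumptions made for the example, and learning W and b with back-propagation is left out.

```python
import numpy as np

def relu(z):
    """ReLU non-linearity: element-wise max(z, 0)."""
    return np.maximum(z, 0.0)

def dense_forward(x, W, b, activation=np.tanh):
    """One layer: matrix multiplication, then add bias, then apply non-linearity."""
    return activation(x @ W + b)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))                      # made-up batch: 4 inputs, 3 features
W1, b1 = rng.normal(size=(3, 5)), np.zeros(5)    # hypothetical hidden layer parameters
W2, b2 = rng.normal(size=(5, 2)), np.zeros(2)    # hypothetical output layer parameters

h = dense_forward(x, W1, b1, activation=relu)      # first layer
out = dense_forward(h, W2, b2, activation=np.tanh) # repeat for each layer
print(out.shape)
```

Stacking more calls to `dense_forward` is what makes the network deeper.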

Deep Networks

Deep Learning is quite like the Tardis…

Deeply Wobbly, Timey Wimey… stuff

So, what about Python?

Machine (+Deep) Learning libraries zoo All the same?

Machine (+Deep) Learning libraries zoo

Deep Learning Hardware

VGG16: Fully Convolutional Neural Network (FCNN)
[Figure panels: Vanilla VGG16; VGG16]

Source: Learning Semantic Image Representations at a Large Scale

GEneralised Matrix-to-Matrix Multiplication
GEMM (BLAS Level 3) is at the heart of Deep Learning.
The difference is in the SCALE: a single layer in a typical network may require multiplying a 256x1,152 matrix by a 1,152x192 matrix, giving a 256x192 result. Naively, that requires about 57 million (256 x 1,152 x 192) floating point operations, and there can be dozens of these layers in a modern architecture.
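To make the scale concrete, the multiplication quoted above can be reproduced with NumPy. The matrices here are filled with ones purely as placeholder data; in typical NumPy builds the `@` operator is handed to a BLAS GEMM routine, though that depends on how NumPy was installed.

```python
import numpy as np

m, k, n = 256, 1152, 192                 # sizes quoted on the slide
A = np.ones((m, k), dtype=np.float32)    # placeholder data
B = np.ones((k, n), dtype=np.float32)

C = A @ B                                # one GEMM call: (256x1152) times (1152x192)
multiply_adds = m * k * n                # naive count of multiply-add steps

print(C.shape)                           # (256, 192)
print(multiply_adds)                     # 56623104, i.e. about 57 million
```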

Fully Connected (FC) Layer

Convolutional (Conv) Layer

GEMM for Conv

Conv Implementation for GEMM
C = [matrix shown on slide]
Conv = I * C
DeConv = I * C^T
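One common way to turn a convolution into a single GEMM is to unroll the input patches into rows (often called im2col). The sketch below assumes that reading of "Conv = I * C": `I` holds the unrolled patches and `C` holds the flattened kernels. The helper names `im2col` and `conv2d_gemm`, the stride of 1, and the absence of padding are choices made for this example, not details taken from the slides.

```python
import numpy as np

def im2col(image, kh, kw):
    """Stack every kh x kw patch of a 2-D image as one row (stride 1, no padding)."""
    H, W = image.shape
    out_h, out_w = H - kh + 1, W - kw + 1
    rows = [image[i:i + kh, j:j + kw].ravel()
            for i in range(out_h) for j in range(out_w)]
    return np.array(rows), (out_h, out_w)

def conv2d_gemm(image, kernels):
    """Apply kernels of shape (n, kh, kw) to a 2-D image with one matrix product."""
    n, kh, kw = kernels.shape
    I, (out_h, out_w) = im2col(image, kh, kw)   # shape (out_h*out_w, kh*kw)
    C = kernels.reshape(n, kh * kw).T           # shape (kh*kw, n)
    return (I @ C).T.reshape(n, out_h, out_w)   # Conv = I * C

image = np.arange(16.0).reshape(4, 4)
kernels = np.stack([np.ones((3, 3)), np.eye(3)])   # two made-up 3x3 kernels
result = conv2d_gemm(image, kernels)
print(result.shape)                                # (2, 2, 2)
```

As in most deep learning libraries, the kernel is not flipped here, so strictly speaking this computes a cross-correlation.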

Deep Learning Frameworks

Deep Learning Frameworks
Model specification: configuration file (e.g. Caffe, CNTK) vs. programmatic generation (e.g. Torch, Theano, TensorFlow)
From a programmatic perspective: Lua (Torch) vs. Python (Theano, TensorFlow)

import tensorflow as tf vs. import theano as th
Theano is a deep learning library with a Python wrapper (inspiration for TensorFlow)
TensorFlow is a deep learning library recently open sourced by Google
th ~= tf: tf has better support for distributed systems

What does TensorFlow provide?
TensorFlow provides primitives for defining functions on tensors and automatically computing their derivatives

Tensor?

Tensor? An intuitive way to represent a tensor is a 
multidimensional array
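As a quick illustration of the "multidimensional array" view, NumPy arrays of different ranks can stand in for tensors. The shapes below, including the batch of 28x28 RGB images, are made up for the example.

```python
import numpy as np

scalar = np.array(3.0)                 # rank 0: a single number
vector = np.array([1.0, 2.0, 3.0])     # rank 1
matrix = np.zeros((2, 2))              # rank 2
images = np.zeros((32, 28, 28, 3))     # rank 4: hypothetical batch of RGB images

ranks = [t.ndim for t in (scalar, vector, matrix, images)]
print(ranks)                           # [0, 1, 2, 4]
```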

numpy vs tensorflow

tf requires explicit evaluation
In [1]: import numpy as np
In [2]: import tensorflow as tf
In [3]: a = np.zeros((2, 2))
In [4]: ta = tf.zeros((2, 2))
In [5]: print(a)
[[ 0. 0.]
 [ 0. 0.]]
In [6]: print(ta)
Tensor("zeros_1:0", shape=(2, 2), dtype=float32)
In [7]: print(ta.eval())  # needs a default session, e.g. tf.InteractiveSession()
[[ 0. 0.]
 [ 0. 0.]]

tf.Graph (IDEA)
A Machine Learning application is the result of the repeated computation of complex mathematical expressions, thus we could describe this computation by using a Data Flow Graph.
Data Flow Graph:
• each Node represents the instance of a mathematical operation: multiply, add, divide
• each Edge is a multi-dimensional data set (tensors) on which the operations are performed.
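The idea can be mimicked with a toy data flow graph in plain Python. This is a hedged sketch of the concept only: the `Node` class and the `const` helper are invented for this example and say nothing about how TensorFlow implements its graph. Building the expression computes nothing; the value only appears when `eval()` is called, which mirrors the explicit evaluation step shown earlier.

```python
class Node:
    """A node in a toy data flow graph: an operation plus the nodes feeding it."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = tuple(inputs)
        self.value = value

    def eval(self):
        """Evaluate the inputs first, then apply this node's operation."""
        if self.op == "const":
            return self.value
        args = [node.eval() for node in self.inputs]
        if self.op == "add":
            return args[0] + args[1]
        if self.op == "multiply":
            return args[0] * args[1]
        if self.op == "divide":
            return args[0] / args[1]
        raise ValueError(f"unknown op: {self.op}")

def const(value):
    """Hypothetical helper: a node that just holds a value."""
    return Node("const", value=value)

# Build the graph for ((2 + 3) * 4) / 10; nothing is computed yet.
total = Node("add", [const(2.0), const(3.0)])
scaled = Node("multiply", [total, const(4.0)])
expr = Node("divide", [scaled, const(10.0)])

result = expr.eval()    # explicit evaluation walks the graph
print(result)           # 2.0
```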

tf.Graph
Node: instantiation of an operation w/ inputs (>= 0), outputs (>= 0).
Data Edges: carry tensors, where an output of one operation (from one node) becomes the input for another operation.
Dependency Edges: control dependency between two nodes (i.e. "happens before" relationship).
[Figure: before and after graph transformation for partial execution]

Logistic Regression

tf.Graph w/ multi-layers
source: https://github.com/tensorflow/fold

Logistic Neuron

Logistic Neuron
In [1]: import tensorflow as tf
In [2]: # tf Graph Input
        x = tf.placeholder("float", [None, 784])
        y = tf.placeholder("float", [None, 10])
In [3]: # Set model weights
        W = tf.Variable(tf.zeros([784, 10]))
        b = tf.Variable(tf.zeros([10]))
In [4]: # Construct model
        activation = tf.nn.softmax(tf.matmul(x, W) + b)  # Softmax
In [5]: # Minimize error using cross entropy
        cross_entropy = y * tf.log(activation)
        cost = tf.reduce_mean(-tf.reduce_sum(cross_entropy, reduction_indices=1))
Repeat this for each layer you want to add
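For readers without a TensorFlow 1.x install, the same softmax activation and cross-entropy cost can be written in NumPy. This is a sketch that mirrors the shapes above (784 inputs, 10 classes); the helper names `softmax` and `cross_entropy_cost`, the all-zero inputs, and the three example labels are assumptions made for illustration.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax; subtracting the row max keeps the exponentials stable."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy_cost(activation, y):
    """Mean over the batch of -sum(y * log(activation)) per example."""
    return np.mean(-np.sum(y * np.log(activation), axis=1))

x = np.zeros((3, 784))            # made-up batch of 3 flattened 28x28 inputs
y = np.eye(10)[[0, 3, 7]]         # made-up one-hot labels
W = np.zeros((784, 10))           # weights start at zero, as on the slide
b = np.zeros(10)

activation = softmax(x @ W + b)
cost = cross_entropy_cost(activation, y)
print(cost)                       # log(10), since every class gets probability 0.1
```

With all-zero weights every class receives probability 0.1, so the starting cost is log(10), about 2.30.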

Logistic Neuron
In [6]: learning_rate = 0.01
In [7]: # Set the optimizer
        optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
In [9]: # Initialize the variables
        init = tf.global_variables_initializer()
        # Assumption: the session is created and initialized here;
        # mnist, training_epochs and batch_size come from a cell not shown
        sess = tf.Session()
        sess.run(init)
In [10]: for epoch in range(training_epochs):
             avg_cost = 0.
             total_batch = int(mnist.train.num_examples / batch_size)
             # Loop over all batches
             for i in range(total_batch):
                 batch_xs, batch_ys = mnist.train.next_batch(batch_size)
                 # Fit training using batch data
                 sess.run(optimizer, feed_dict={x: batch_xs, y: batch_ys})
                 # Compute average loss
                 avg_cost += sess.run(cost, feed_dict={x: batch_xs, y: batch_ys}) / total_batch
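The training loop above relies on TensorFlow to compute gradients. As a rough NumPy counterpart, the sketch below runs mini-batch gradient descent for softmax regression with the gradient written out by hand. Everything about the data is an assumption: it uses small synthetic inputs instead of MNIST, and the sizes, learning rate, and number of epochs were picked for the example. Unlike the loop above, it records the cost of each batch before the parameter update.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Synthetic stand-in for a dataset (assumption: not MNIST)
rng = np.random.default_rng(0)
n_features, n_classes = 20, 3
X = rng.normal(size=(600, n_features))
labels = (X @ rng.normal(size=(n_features, n_classes))).argmax(axis=1)
Y = np.eye(n_classes)[labels]                    # one-hot targets

W = np.zeros((n_features, n_classes))
b = np.zeros(n_classes)
learning_rate, training_epochs, batch_size = 0.1, 20, 50
total_batch = X.shape[0] // batch_size

history = []
for epoch in range(training_epochs):
    avg_cost = 0.0
    for i in range(total_batch):                 # loop over all batches
        xs = X[i * batch_size:(i + 1) * batch_size]
        ys = Y[i * batch_size:(i + 1) * batch_size]
        p = softmax(xs @ W + b)
        avg_cost += np.mean(-np.sum(ys * np.log(p + 1e-12), axis=1)) / total_batch
        grad = (p - ys) / batch_size             # gradient of the cost w.r.t. the logits
        W -= learning_rate * (xs.T @ grad)       # gradient descent step
        b -= learning_rate * grad.sum(axis=0)
    history.append(avg_cost)

accuracy = (softmax(X @ W + b).argmax(axis=1) == labels).mean()
print(history[0], history[-1], accuracy)
```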

Deep Learning: the Keras way @pydatait @pycon8

Thank you! @leriomaggio
