
Python AI - All you need to know about Machine Learning and Deep Learning

Alejandro Saucedo (Founder @ Exponential Technologies) and Donald Whyte (Software Engineer @ Engineers Gate) @ Moscow Python Conf 2017
"There is a lot of hype about deep learning and everything AI, however behind all the noise there is a set of solid concepts and algorithms that have massive potential if used in the right way and with the right data. In this talk we will provide you with the key concepts you will need to build a solid understanding around the core of Machine Learning. We will also cover key Deep Learning concepts and examples using Tensorflow that will help you understand the real potential of Deep Learning in practical applications. This talk will provide a theoretical overview that will then be put in practice in the Deep Learning workshop".
Video: https://conf.python.ru/python-ai-all-you-need-know-about-machine-learning-and-deep-learning/

Moscow Python Meetup

October 20, 2017

Transcript

  1. DEEP LEARNING WITH RECURRENT NEURAL NETWORKS IN PYTHON / /

    Donald Whyte @donald_whyte Alejandro Saucedo @AxSaucedo
  2. CREATE AN AI AUTHOR. Create a neural network that can

    write novels. Using 34,000 English novels to train the network.
  3. THE OUTPUT Gradually drawing away from the rest, two combatants

    are striving; each devoting every nerve, every energy, to the overthrow of the other. But each attack is met by counter attack, each terrible swinging stroke by the crash of equally hard pain or the dull slap of tough hard shield opposed in parry. More men are down. Even numbers of men on each side, these two combatants strive on.
  4. Less than 100 lines of Tensorflow code!

    # ONE
    import tensorflow as tf
    from tensorflow.contrib import layers, rnn
    import os
    import time
    import math
    import numpy as np
    tf.set_random_seed(0)

    # model parameters
    SEQLEN = 30
    BATCHSIZE = 200
    ALPHASIZE = 89
    INTERNALSIZE = 512
    NLAYERS = 3
    learning_rate = 0.001
    dropout_pkeep = 0.8

    codetext, valitext, bookranges = load_data()

    # the model
    lr = tf.placeholder(tf.float32, name='lr')        # learning rate
    pkeep = tf.placeholder(tf.float32, name='pkeep')  # dropout parameter
    batchsize = tf.placeholder(tf.int32, name='batchsize')

    # inputs
    X = tf.placeholder(tf.uint8, [None, None], name='X')
    Xo = tf.one_hot(X, ALPHASIZE, 1.0, 0.0)
    # expected outputs
    Y_ = tf.placeholder(tf.uint8, [None, None], name='Y_')
    Yo_ = tf.one_hot(Y_, ALPHASIZE, 1.0, 0.0)
    # input state
    Hin = tf.placeholder(tf.float32, [None, INTERNALSIZE * NLAYERS], name='Hin')

    # hidden layers
    cells = [rnn.GRUCell(INTERNALSIZE) for _ in range(NLAYERS)]
    multicell = rnn.MultiRNNCell(cells, state_is_tuple=False)

    # TWO
    Yr, H = tf.nn.dynamic_rnn(multicell, Xo, dtype=tf.float32, initial_state=Hin)
    H = tf.identity(H, name='H')

    # Softmax layer implementation
    Yflat = tf.reshape(Yr, [-1, INTERNALSIZE])
    Ylogits = layers.linear(Yflat, ALPHASIZE)
    Yflat_ = tf.reshape(Yo_, [-1, ALPHASIZE])
    loss = tf.nn.softmax_cross_entropy_with_logits(logits=Ylogits, labels=Yflat_)
    loss = tf.reshape(loss, [batchsize, -1])
    Yo = tf.nn.softmax(Ylogits, name='Yo')
    Y = tf.argmax(Yo, 1)
    Y = tf.reshape(Y, [batchsize, -1], name="Y")
    train_step = tf.train.AdamOptimizer(lr).minimize(loss)

    # Init for saving models
    if not os.path.exists("checkpoints"):
        os.mkdir("checkpoints")
    saver = tf.train.Saver(max_to_keep=1000)

    # init
    istate = np.zeros([BATCHSIZE, INTERNALSIZE * NLAYERS])
    init = tf.global_variables_initializer()
    sess = tf.Session()
    sess.run(init)
    step = 0

    # train on one minibatch at a time
    for x, y_, epoch in txt.rnn_minibatch_sequencer(codetext, BATCHSIZE, SEQLEN, nb_epochs=...):
        feed_dict = {X: x, Y_: y_, Hin: istate, lr: learning_rate,
                     pkeep: dropout_pkeep, batchsize: BATCHSIZE}
        _, y, ostate = sess.run([train_step, Y, H], feed_dict=feed_dict)
        if step // 10 % _50_BATCHES == 0:
            saved_file = saver.save(sess, 'checkpoints/rnn_train_' + timestamp, global_step=step)
            print("Saved file: " + saved_file)
        istate = ostate
        step += BATCHSIZE * SEQLEN
  5. merge all training documents into one → load as a flat list of chars → convert chars to integers → flat sequence of integers that represents all text in the dataset

    ['h','e','l','l','o',' ', 'm','y',' ','n','a','m','e',' ','i','s', ...]
    [10, 5, 12, 12, 17, 27, 15, 25, 27, 16, 1, 15, 5, 27, 6, 18, ...]
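As a rough sketch of this preprocessing step (illustrative only; names such as `char_to_int` are not from the deck):

    # Merge all training documents into one string, then map each
    # distinct character to a stable integer id.
    corpus = "hello my name is ..."          # all novels concatenated
    alphabet = sorted(set(corpus))
    char_to_int = {ch: i for i, ch in enumerate(alphabet)}
    int_to_char = {i: ch for ch, i in char_to_int.items()}

    # Flat sequence of integers representing all text in the dataset.
    encoded = [char_to_int[ch] for ch in corpus]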
  6. FEATURE SPACE

    A feature is some property that describes the raw input data. Features are represented as vectors in feature space. This abstracts away the complexity of the raw input for easier processing.
  7. CLASSIFICATION

    Training data is used to produce a model. The model divides feature space into segments; each segment corresponds to one output class. f(x̄) = mx̄ + c
  8. A result below zero means the left side of the line. The input shape is a triangle.

    x̄ = (3, 1)
    f(x̄) = 1(3) + (−3.5)(1) + 0 = 3 − 3.5 + 0 = −0.5
    −0.5 < 0, so the input falls on the left side of the line.
  9. No.

  10. GENERATING COHERENT TEXT REQUIRES MEMORY OF WHAT WAS WRITTEN PREVIOUSLY.

    [Diagram: entity annotations linking "Valentin" / "he" to a male person and "beer" / "lagers" to drinks]
    Valentin's favourite drink is beer. He likes lagers the most.
  11. Deep neural nets can learn patterns in complex data,

    like language. We can encode memory into the algorithm.
  12. Just use the raw input data. Our training data is

    the raw text of existing novels. No need for manual feature extraction.
  13. THE MIGHTY PERCEPTRON Equivalent to the straight line equation from

    before Linearly splits feature space Modeled after a neuron in the human brain
  14. THE MIGHTY PERCEPTRON

    Synonymous with our linear function f(x) = mx + c. For n features, the perceptron is defined as:
    y = f(w · x + b)
    where w is the n-dimensional weight vector, x the n-dimensional input vector, b the bias scalar, f the activation function, and y the output.
  15. ACTIVATION FUNCTION Simulates the 'firing' of a physical neuron. Takes

    the weighted sum and squashes it into a smaller range.
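As a rough illustration (not the workshop code), a perceptron with a sigmoid activation fits in a few lines of NumPy; the weights below reuse the worked example from the earlier classification slide:

    import numpy as np

    def sigmoid(s):
        # Squashes the weighted sum into the range (0, 1).
        return 1.0 / (1.0 + np.exp(-s))

    def perceptron(x, w, b):
        # y = f(w . x + b): weighted sum of the inputs plus a bias,
        # passed through the activation function.
        return sigmoid(np.dot(w, x) + b)

    # Feature vector (3, 1) with weights (1, -3.5) and bias 0, as on slide 8.
    print(perceptron(np.array([3.0, 1.0]), np.array([1.0, -3.5]), 0.0))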
  16. PERCEPTRON LEARNING ALGORITHM Algorithm which learns correct weights and bias

    Use training dataset to incrementally train perceptron Guaranteed to create line that divides output classes (if data is linearly separable)
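A minimal sketch of the classic perceptron update rule, assuming labels in {-1, +1} (illustrative; not necessarily the exact variant used in the workshop):

    import numpy as np

    def train_perceptron(X, y, epochs=100, learning_rate=0.1):
        # X: (n_samples, n_features) feature vectors, y: labels in {-1, +1}.
        w = np.zeros(X.shape[1])
        b = 0.0
        for _ in range(epochs):
            for xi, yi in zip(X, y):
                # Only adjust the line when it misclassifies this point.
                if yi * (np.dot(w, xi) + b) <= 0:
                    w += learning_rate * yi * xi
                    b += learning_rate * yi
        return w, b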
  17. REPRESENTING TEXT Make the input layer represent: a single word

    or a single character Use the input word/char to predict the next.
  18. WE WILL USE CHARACTERS AS THE INPUTS.

    [Diagram: input nodes (current char) connected to output nodes (next char), numbered 21 to 25]
  19. Input: b. Predicted char: ?. Current sentence: b?
  20. Input: b. Predicted char: a. Current sentence: ba
  21. Input: a. Predicted char: d. Current sentence: bad
  22. ...

  23. Input: e. Predicted char: d. Current sentence: ball games were often played
  24. PROBLEM Single perceptrons are straight line equations. They produce a single

    output, and hence cannot be used for complex problems like language. Need a network of neurons to output the full one-hot vector.
  25. SOLUTION: NEURAL NETWORKS Uses multi-layer perceptrons to: learn patterns in

    complex data, like language; produce the multiple outputs required for text prediction. Multiple layers provide flexibility in learning.
  26. NEURON CONNECTIVITY Each layer is fully connected to the next

    All nodes in layer l are connected to all nodes in layer l + 1. Every single connection has a weight.
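Since every node-to-node connection carries a weight, a fully connected layer reduces to a matrix multiplication; a small sketch (names are illustrative):

    import numpy as np

    def dense_layer(activations, W, b):
        # activations: outputs of layer l
        # W: weight matrix of shape (size of layer l+1, size of layer l)
        # b: bias vector for layer l+1
        # Returns the activations of layer l+1 (sigmoid activation assumed).
        return 1.0 / (1.0 + np.exp(-(W @ activations + b)))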
  27. LOSS FUNCTION: AN OPTIMIZATION PROBLEM

    Inputs: 1. the real output of the network after each batch, 2. the expected output (from our training data). Output: a number indicating the performance of the network.
  28. GRADIENT DESCENT OPTIMISER We optimise the network by minimising its

    loss. Keep adjusting the weights of each hidden layer... ...until loss is not getting any smaller.
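A toy sketch of that loop for a single weight, purely to illustrate stepping against the gradient until the loss stops shrinking:

    def gradient_descent(grad_fn, w=0.0, learning_rate=0.1, tolerance=1e-6):
        # Step against the gradient until the update is effectively zero.
        while True:
            step = learning_rate * grad_fn(w)
            if abs(step) < tolerance:
                return w
            w -= step

    # Example: minimise loss(w) = (w - 3)^2, whose gradient is 2(w - 3).
    print(gradient_descent(lambda w: 2 * (w - 3)))   # converges to ~3.0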
  29. BACKPROPAGATION Equivalent to gradient descent The training algorithm for neural

    networks For each feature vector in the training dataset, do a: 1. forward pass 2. backward pass
  30. After training the network, we obtain weights which minimise

    prediction error. Predict next character by running the last character through the forward pass step.
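In other words, prediction is one forward pass followed by an argmax over the output distribution. A rough sketch, where `forward_pass`, `char_to_int` and `int_to_char` are illustrative stand-ins for the trained network and the character lookups:

    import numpy as np

    def predict_next_char(last_char, forward_pass, char_to_int, int_to_char):
        # forward_pass runs the trained network and returns a probability
        # distribution over the alphabet for the next character.
        probabilities = forward_pass(char_to_int[last_char])
        return int_to_char[int(np.argmax(probabilities))]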
  31. HOWEVER... Network still has no memory of past characters. Valentin's

    favourite drink is beer. He likes lagers the most.
  32. RECURRENT NETWORKS

    O_0 (layer 0 output), O_1 (layer 1 output), O_2 (layer 2 output). The hidden layer's input includes the output of itself during the last run of the network.
  33. [Diagram: the network unrolled over time steps t=0 to t=5; input characters "B o b _", predicted next characters "o b i _"]
  34. [Diagram: unrolled over time steps t=0 to t=5; input characters "B o b i _", predicted next characters "o b i s _"]
  35. [Diagram: unrolled over time steps t=0 to t=5; input characters "B o b i s _", predicted next characters "o b i s _ _"]
  36. [Diagram: same as the previous slide; input characters "B o b i s _", predicted next characters "o b i s _ _"]
  37. PROBLEM: LONG-TERM DEPENDENCIES

    [Diagram: a long character sequence ("B o b i s _ ... a n m ...") unrolled over many time steps; the plain recurrent network fails to carry the early context that far ✖]
  38. CELL STATES Add extra state to each layer of the

    network. Remembers inputs far into the past. Transforms layer's original output into something that is relevant to the current context.
  39. [Diagram: layer outputs O_0, O_1, O_2, each paired with a cell state H_0, H_1, H_2]
  40. Hidden layer output and cell state are fed into the next

    time step. Gives network ability to handle long-term dependencies in sequences.
  41. [Diagram: the same long character sequence ("B o b i s _ ... a n m ...") unrolled over time; with cell state the network carries the early context across all those steps ✓]
  42. merge all training documents into one → load as a flat list of chars → convert chars to integers → use the integers to generate one-hot inputs for each time step

    ['h','e','l','l','o',' ', 'm','y',' ','n','a','m','e',' ','i','s', ...]
    [10, 5, 12, 12, 17, 27, 15, 25, 27, 16, 1, 15, 5, 27, 6, 18, ...]
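A small NumPy sketch of the one-hot step, assuming the integer encoding above and the deck's alphabet size of 89 (illustrative only):

    import numpy as np

    def one_hot(encoded, alphabet_size):
        # encoded: flat list of character ids. Returns one row per time
        # step with a single 1.0 in the column of that character.
        out = np.zeros((len(encoded), alphabet_size), dtype=np.float32)
        out[np.arange(len(encoded)), encoded] = 1.0
        return out

    print(one_hot([10, 5, 12, 12], alphabet_size=89).shape)   # (4, 89)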
  43. COMMON TRAINING METHODS

    Run backpropagation after: Stochastic: one sequence. Batch: all sequences. Mini-batch: a smaller batch of sequences.
  44. WHY?

    Stochastic: takes a long time to converge on good weights. Batch: consumes lots of memory, gets stuck on "okay" weights. Mini-batch: quick to converge and memory efficient.
  45. Iterate across all batches. Run backpropagation after processing each batch.

    [10, 5, 12, 12, 17, 27, 15, 25, 27, 16, 1, 15, 5, 27, 6, 18, ...]
    split into sequences: [10, 5, 12, 12] [17, 27, 15, 25] [27, 16, 1, 15] [5, 27, 6, 18] ...
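For example, cutting the flat integer sequence into fixed-length training sequences takes only a couple of lines (an illustrative sketch, not the workshop's generator):

    def split_into_sequences(encoded, sequence_length):
        # Drop the tail that doesn't fill a whole sequence.
        usable = len(encoded) - len(encoded) % sequence_length
        return [encoded[i:i + sequence_length]
                for i in range(0, usable, sequence_length)]

    print(split_into_sequences([10, 5, 12, 12, 17, 27, 15, 25, 27], 4))
    # [[10, 5, 12, 12], [17, 27, 15, 25]]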
  46. Building a neural network involves: 1. defining its architecture 2.

    learning the weight matrices for that architecture
  47. SOLUTION Can build very complex networks quickly Easy to extend

    if required Built-in support for RNN memory cells
  48. BUILD A RECURRENT NEURAL NETWORK TO GENERATE STORIES IN TENSORFLOW.

    [Architecture diagram: current char → Input → Hidden Recurrent Layers → Output → predicted next char]
  49. THE COMPUTATION GRAPH

    tf.Tensor: unit of data; vectors or matrices of values (floats, ints, etc.).
    tf.Operation: unit of computation; takes 0+ tf.Tensors as inputs and outputs 0+ tf.Tensors.
    tf.Graph: collection of connected tf.Tensors and tf.Operations; operations are nodes and tensors are edges.
  50. GRAPH THAT TRIPLES NUMBERS AND SUMS THEM.

    # 1. Define Inputs
    # Input is a 2D vector containing the two numbers to triple.
    inputs = tf.placeholder(tf.float32, [2])
    # 2. Define Internal Operations
    tripled_numbers = tf.scalar_mul(3, inputs)
    # 3. Define Final Output
    # Sum the previously tripled inputs.
    output_sum = tf.reduce_sum(tripled_numbers)
    # 4. Run the graph with some inputs to produce the output.
    session = tf.Session()
    result = session.run(output_sum, feed_dict={inputs: [300, 10]})
    print(result)

    Output: 930
  51. DEFINING HYPERPARAMETERS

    # Input Hyperparameters
    SEQUENCE_LEN = 30
    BATCH_SIZE = 200
    ALPHABET_SIZE = 98

    # Hidden Recurrent Layer Hyperparameters
    HIDDEN_LAYER_SIZE = 512
    NUM_HIDDEN_LAYERS = 3
  52. [Architecture diagram: the Input takes the current char]

    # Dimensions: [ BATCH_SIZE, SEQUENCE_LEN ]
    X = tf.placeholder(tf.uint8, [None, None], name='X')
  53. [Architecture diagram: the Input takes the current char]

    # Dimensions: [ BATCH_SIZE, SEQUENCE_LEN, ALPHABET_SIZE ]
    Xo = tf.one_hot(X, ALPHABET_SIZE, 1.0, 0.0)
  54. DEFINING HIDDEN STATE

    from tensorflow.contrib import rnn

    # Cell State
    # [ BATCH_SIZE, HIDDEN_LAYER_SIZE * NUM_HIDDEN_LAYERS ]
    H_in = tf.placeholder(
        tf.float32,
        [None, HIDDEN_LAYER_SIZE * NUM_HIDDEN_LAYERS],
        name='H_in')

    # Create desired number of hidden layers that use the `GRUCell`
    # for managing hidden state.
    cells = [
        rnn.GRUCell(HIDDEN_LAYER_SIZE)
        for _ in range(NUM_HIDDEN_LAYERS)
    ]
    multicell = rnn.MultiRNNCell(cells)
  55. UNROLLING RECURRENT NETWORK LAYERS

    Wrap the recurrent hidden layers in tf.nn.dynamic_rnn. Loops are unrolled when the computation graph runs, SEQUENCE_LEN times.

    Yr, H_out = tf.nn.dynamic_rnn(
        multicell, Xo, dtype=tf.float32, initial_state=H_in)
    # Yr = output of network: probability distribution of the next character.
    # H_out = the altered hidden cell state after processing the last input.
  56. OUTPUT IS PROBABILITY DISTRIBUTION

    from tensorflow.contrib import layers

    # [ BATCH_SIZE x SEQUENCE_LEN, HIDDEN_LAYER_SIZE ]
    Yflat = tf.reshape(Yr, [-1, HIDDEN_LAYER_SIZE])
    # [ BATCH_SIZE x SEQUENCE_LEN, ALPHABET_SIZE ]
    Ylogits = layers.linear(Yflat, ALPHABET_SIZE)
    # [ BATCH_SIZE x SEQUENCE_LEN, ALPHABET_SIZE ]
    Yo = tf.nn.softmax(Ylogits, name='Yo')
  57. PICK MOST PROBABLE CHARACTER

    # [ BATCH_SIZE * SEQUENCE_LEN ]
    Y = tf.argmax(Yo, 1)
    # [ BATCH_SIZE, SEQUENCE_LEN ]
    Y = tf.reshape(Y, [BATCH_SIZE, -1], name="Y")
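Outside the graph, the predicted integer ids can be turned back into text; a sketch assuming the `int_to_char` lookup built during preprocessing:

    def decode(predicted_ids, int_to_char):
        # predicted_ids: the integer character ids produced by Y for one sequence.
        return ''.join(int_to_char[int(i)] for i in predicted_ids)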
  58. LOSS FUNCTION Needs: 1. the real output of the network

    after each batch 2. the expected output (from our training data)
  59. LOSS FUNCTION

    [Architecture diagram: Input (current char) → Hidden Recurrent Layers → Output (predicted next char) → Accuracy/Loss Calculation, compared against the expected next char]
  60. LOSS FUNCTION

    Input the expected next chars into the network:

    # [ BATCH_SIZE, SEQUENCE_LEN ]
    Y_ = tf.placeholder(tf.uint8, [None, None], name='Y_')
    # [ BATCH_SIZE, SEQUENCE_LEN, ALPHABET_SIZE ]
    Yo_ = tf.one_hot(Y_, ALPHABET_SIZE, 1.0, 0.0)
    # [ BATCH_SIZE x SEQUENCE_LEN, ALPHABET_SIZE ]
    Yflat_ = tf.reshape(Yo_, [-1, ALPHABET_SIZE])
  61. LOSS FUNCTION

    Defining the loss function:

    # [ BATCH_SIZE * SEQUENCE_LEN ]
    loss = tf.nn.softmax_cross_entropy_with_logits(
        logits=Ylogits, labels=Yflat_)
    # [ BATCH_SIZE, SEQUENCE_LEN ]
    loss = tf.reshape(loss, [BATCH_SIZE, -1])
  62. CHOOSE AN OPTIMISER

    Will adjust the network weights to minimise the loss. In the workshop we'll use a flavour called AdamOptimizer.

    train_step = tf.train.GradientDescentOptimizer(lr).minimize(loss)
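For reference, swapping in the Adam flavour mentioned above is a one-line change, as in the full listing on slide 4 (`lr` is the learning rate):

    train_step = tf.train.AdamOptimizer(lr).minimize(loss)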
  63. EPOCHS We run mini-batch training on the network. Train network

    on all batches multiple times. Each run across all batches is an epoch. More epochs = better weights = better accuracy.
  64. MINIBATCH SPLITTING ACROSS EPOCHS

    from typing import Generator, List, Tuple

    # Contains: [Training Data, Test Data, Epoch Number]
    Batch = Tuple[np.matrix, np.matrix, int]

    def rnn_minibatch_generator(
            data: List[int],
            batch_size: int,
            sequence_length: int,
            num_epochs: int) -> Generator[Batch, None, None]:
        for epoch in range(num_epochs):
            for batch in range(num_batches):
                # Split data into batches, where each batch contains
                # `batch_size` sequences of length `sequence_length`.
                training_data = ...
                test_data = ...
                yield training_data, test_data, epoch
  65. START TRAINING

    Load the dataset and construct the mini-batch generator:

    # Initialize the hidden cell states to 0 before running any steps.
    input_state = np.zeros(
        [BATCH_SIZE, HIDDEN_LAYER_SIZE * NUM_HIDDEN_LAYERS])

    # Create the session and initialize its variables to 0.
    init = tf.global_variables_initializer()
    session = tf.Session()
    session.run(init)

    char_integer_list = []   # the encoded character ids loaded from the dataset
    generator = rnn_minibatch_generator(
        char_integer_list, BATCH_SIZE, SEQUENCE_LEN, num_epochs=10)
  66. Run the training step on all mini-batches for multiple epochs:

    # Initialise input state
    step = 0
    input_state = np.zeros([
        BATCH_SIZE, HIDDEN_LAYER_SIZE * NUM_HIDDEN_LAYERS
    ])

    # Run training step loop
    for batch_input, expected_batch_output, epoch in generator:
        graph_inputs = {
            X: batch_input,
            Y_: expected_batch_output,
            H_in: input_state,
            batch_size: BATCH_SIZE
        }
        _, output, output_state = session.run(
            [train_step, Y, H_out], feed_dict=graph_inputs)

        # Loop state around for next recurrent run
        input_state = output_state
        step += BATCH_SIZE * SEQUENCE_LEN
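Once training finishes, text can be generated by repeatedly feeding the last predicted character (and the returned cell state) back into the network. A rough sketch, assuming a graph built with batch size 1 and the `char_to_int` / `int_to_char` lookups from preprocessing; this is not the workshop's exact generation code:

    # Seed with a single character and an all-zero cell state.
    generated = [char_to_int['T']]
    state = np.zeros([1, HIDDEN_LAYER_SIZE * NUM_HIDDEN_LAYERS])

    for _ in range(1000):
        feed = {X: [[generated[-1]]], H_in: state}
        next_ids, state = session.run([Y, H_out], feed_dict=feed)
        generated.append(int(next_ids[0][0]))

    print(''.join(int_to_char[i] for i in generated))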
  67. Epoch 0.0 Dy8v:SH3U 2d4 xZ Vaf%hO kS0i6 7y U5SUu6nSsR0 x

    MYiZ5ykLOtG3Q,cu St k V ctc_N CQFSbF%]q3ZsWWK8wP gyfYt3DpFo yhZ_ss,"IedX%lj,R%_4ux IX5 R%N3wQNG PnSl 1DJqLdpc[kLeSYMoE]kf xCe29 J[r_k 6BiUs GUguW Y [Kw8"P Sg" e[2OCL%G mad6,:J[A k 5 jz 46iyQLuuT 9qTn GjT6:dSjv6RXMyjxX8:3 h cr sYBgnc8 DP04A8laW
  68. Epoch 0.1 Uum awetuarteeuF toBdU iwObaaMlr o rM OufNJetu iida

    cZeDbRuZfU m igdaao QH NBJ diace e L cjoXeu ZDjiM AeN g iu O Aoc jdjrmIuaai ie t qmuozPwaEkoihca eXuzRCgZ iW AeqapiwaT VInBosPkqroi s yWbJoj yKq oUo jebaYigEouzxVb eyt Px hiamIf vPOiiPu ky Cut LviPoej iE w hpFVxes h zwsvoidmoWxzgTnL ujDt Pr a
  69. Epoch 1 Here is the goal of my further. I

    shouldn't be the shash of no. Sky is bright and blue as running goeg on. Paur decided to move downwards to the floor, where the treasure was stored. She then thought to call her friend from ahead.
  70. ...

  71. Epoch 50 Gradually drawing away from the rest, two combatants

    are striving; each devoting every nerve, every energy, to the overthrow of the other. But each attack is met by counter attack, each terrible swinging stroke by the crash of equally hard pain or the dull slap of tough hard shield opposed in parry. More men are down. Even numbers of men on each side, these two combatants strive on.
  72. /* * Increment the size file of the new incorrect
     * UI_FILTER group information * of the size generatively. */

    static int indicate_policy(void) {
        int error;
        if (fd == MARN_EPT) {
            /* The kernel blank will coeld it to userspace. */
            if (ss->segment < mem_total)
                unblock_graph_and_set_blocked();
            else
                ret = 1;
            goto bail;
        }
        segaddr = in_SB(in.addr);
        selector = seg / 16;
        setup_works = true;
        for (i = 0; i < blocks; i++) {
            seq = buf[i++];
            bpf = bd->bd.next + i * search;
            if (fd) {
                current = blocked;
            }
        }
        rw->name = "Getjbbregs";
        bprm_self_clearl(&iv->version);
        regs->new = blocks[(BPF_STATS << info->historidac)] | PFMR_CLOBATHINC_SECONDS << 12;
        return segtable;
    }
  73. We have created an AI author! Less than 100 lines of Tensorflow code!

    (The full listing repeated here is the same as on slide 4.)