Slide 1

Slide 1 text

PRACTICAL MACHINE LEARNING FOR EVERYDAY WEB APPS
Making TensorFlow the brain of your Django app
Dražen Lučanin, @metakermit

Slide 2

Slide 2 text

THE INTERNET IS FOR CAT PHOTOS!

Slide 3

Slide 3 text

IS THAT MIFFLES IN YOUR PHOTO?

Slide 4

Slide 4 text

AI: THE NEXT BIG THING™
● Communication was the Internet’s killer app
  – Web apps storing data to DBs ruled
  – Social networks, online stores, messaging apps, productivity apps, online courses, …
● My view… AI will be the next killer app
  – Faster & cheaper GPU hardware
  – Lots of R&D around machine learning
  – Cool applications
    ● Self-driving cars
    ● Good speech recognition
    ● Automating repetitive manual tasks
    ● “The secret sauce”

Slide 5

Slide 5 text

GETTING INTO AI
● …probably a good idea
● Applying AI != researching AI
● Modern AI frameworks
  – Torch (Facebook)
  – Theano (academia)
  – TensorFlow (Google)

Slide 6

Slide 6 text

TENSORFLOW (TF)
● Great AI framework built at Google
  – Easy for developers and researchers
  – Production-ready
● MapReduce
  – White paper only
  – Hadoop became the standard
● TF open sourced to become the standard
● Model marketplace

Slide 7

Slide 7 text

TF OVERVIEW
● Dataflow programming language
● Describe a graph of interacting operations that run entirely outside Python
  – Graph
  – Session
● Abstraction levels
  – Low-level API (for researchers)
  – High-level API (GTD)
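The graph/session split can be illustrated without TensorFlow. The sketch below is a toy, not TF's API: `Node`, `constant`, `placeholder` and `run` are hypothetical names made up for this example. Building the expression only records operations; nothing is computed until `run` is called with a feed of placeholder values.

```python
import operator


class Node:
    """One operation in a toy dataflow graph (hypothetical, not TF's API)."""

    def __init__(self, op=None, inputs=(), name=None, value=None):
        self.op = op          # function to apply, or None for a leaf
        self.inputs = inputs  # upstream nodes
        self.name = name      # set for placeholders only
        self.value = value    # set for constants only

    def __add__(self, other):
        return Node(op=operator.add, inputs=(self, other))

    def __mul__(self, other):
        return Node(op=operator.mul, inputs=(self, other))


def constant(value):
    return Node(value=value)


def placeholder(name):
    return Node(name=name)


def run(node, feed=None):
    """Evaluate a node, the role a session plays: walk the graph and compute."""
    feed = feed or {}
    if node.op is None:
        # leaf: a placeholder reads from the feed, a constant returns its value
        return feed[node.name] if node.name is not None else node.value
    return node.op(*(run(i, feed) for i in node.inputs))


# Graph construction: no arithmetic happens on these three lines.
W = constant(2.0)
b = constant(-1.0)
x = placeholder("x")
linear_model = W * x + b

# Execution: the same graph can be run with different inputs.
result = run(linear_model, {"x": 3.0})
print(result)
```

The design point is that `linear_model` is a description of a computation, so it can be optimised, shipped to another device, or run many times with different feeds.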

Slide 8

Slide 8 text

LOW-LEVEL API

import numpy as np
import tensorflow as tf

# model parameters (dtype must be passed by keyword;
# the second positional argument of tf.Variable is `trainable`)
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)

# model inputs and expected outputs
x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)
linear_model = W * x + b

# sum-of-squares loss, minimised by gradient descent
loss = tf.reduce_sum(tf.square(linear_model - y))
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)

# training data
x_train = [1, 2, 3, 4]
y_train = [0, -1, -2, -3]

# training loop
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
for i in range(1000):
    sess.run(train, {x: x_train, y: y_train})

# evaluate the trained parameters
curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x: x_train, y: y_train})
print("W: %s b: %s loss: %s" % (curr_W, curr_b, curr_loss))
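To see what the optimizer is doing, here is the same fit written in plain Python with hand-derived gradients. It is a sketch for intuition only, not how TensorFlow implements it; the function names `loss_and_grads` and `fit` are made up for this example. The starting values, learning rate, step count and data are the ones from the slide.

```python
x_train = [1, 2, 3, 4]
y_train = [0, -1, -2, -3]


def loss_and_grads(W, b):
    """Sum-of-squares loss of W*x + b, plus its gradients w.r.t. W and b."""
    errors = [W * x + b - y for x, y in zip(x_train, y_train)]
    loss = sum(e * e for e in errors)
    grad_W = sum(2 * e * x for e, x in zip(errors, x_train))
    grad_b = sum(2 * e for e in errors)
    return loss, grad_W, grad_b


def fit(W=0.3, b=-0.3, learning_rate=0.01, steps=1000):
    """Plain gradient descent: step against the gradient, repeat."""
    for _ in range(steps):
        _, grad_W, grad_b = loss_and_grads(W, b)
        W -= learning_rate * grad_W
        b -= learning_rate * grad_b
    loss, _, _ = loss_and_grads(W, b)
    return W, b, loss


W, b, loss = fit()
print("W: %s b: %s loss: %s" % (W, b, loss))
```

The training points lie exactly on the line y = -x + 1, so W should end up close to -1 and b close to 1, with a loss near zero.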

Slide 9

Slide 9 text

EVENTS EXAMPLE
● AI for event discovery
  – Django web app ( https://www.posterbat.com )
  – Side-project I’m working on
● Baby steps
  – Scraped some event cover images from the web
  – Does an image have text in it?
● AI can be used on everyday problems
● Not only for cutting edge research problems
  – e.g. speech recognition

Slide 10

Slide 10 text

GETTING DATA FROM DJANGO

$ ./manage.py shell -c 'from export_images import run; run()'

# export_images.py
from events.models import Event

def run():
    # prepare_event_image is defined on the next slide
    for event in Event.objects.all():
        prepare_event_image(event)

Slide 11

Slide 11 text

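NORMALISING IMAGES USING PIL

from PIL import Image

def prepare_event_image(event):
    size = (256, 256)
    try:
        image = Image.open(event.img)
    except ValueError:
        # e.g. the event has no image file attached; skip it
        return
    # shrink in place, keeping the aspect ratio, to fit within 256x256
    # (LANCZOS is the current name of the ANTIALIAS filter)
    image.thumbnail(size, Image.LANCZOS)
    # crop to exactly 256x256, measured from the top-left corner
    region = image.crop((0, 0, *size))
    # open the output only after the image loaded, so failures leave no empty file
    region.save(f'./images/{event.id}.png', 'png')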
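The thumbnail-then-crop step always produces a 256x256 file, but a non-square source does not fill it. The sketch below works through the size arithmetic in plain Python. It is an approximation of the idea behind `Image.thumbnail` (aspect-preserving shrink, never an upscale), not PIL's code, and PIL's exact rounding may differ by a pixel. `thumbnail_size` and `uncovered_area` are hypothetical helper names.

```python
def thumbnail_size(width, height, max_size=(256, 256)):
    """Approximate size after an aspect-preserving shrink to fit max_size."""
    # scale is capped at 1.0 because a thumbnail never enlarges the image
    scale = min(max_size[0] / width, max_size[1] / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def uncovered_area(width, height, box=(256, 256)):
    """Number of pixels in the crop box that the shrunken image does not cover."""
    w, h = thumbnail_size(width, height, box)
    return box[0] * box[1] - min(w, box[0]) * min(h, box[1])


landscape = thumbnail_size(1024, 512)  # wide poster
square = thumbnail_size(800, 800)      # square poster
small = thumbnail_size(100, 50)        # already smaller than the box

print(landscape, square, small)
print(uncovered_area(1024, 512))
```

For a 1024x512 poster the shrunken image is 256x128, so half of the 256x256 crop box is padding rather than picture content. The classifier sees that padding as input too.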

Slide 12

Slide 12 text

LABELING CLASSES

Slide 13

Slide 13 text

IMPORTING DATA INTO TF

import os
import numpy as np
import pandas as pd
import tensorflow as tf
from scipy import misc

labels = pd.read_csv('./labels.csv', index_col=0)

def read_data(folder):
    path = './data/images/' + folder + '/'
    x = []; y = []
    for filename in os.listdir(path):
        # files are named <image id>.png
        image_id = int(filename.split('.')[0])
        # convert input 256x256 image to grayscale
        # flatten to a 1-d array of floats (0-255)
        im = misc.imread(path + filename, flatten=True).flatten()
        x.append(im)
        # look the label up by image id (label-based indexing)
        y.append(int(labels.loc[image_id]))
    x = np.array(x)
    return tf.constant(x), tf.constant(y)

train, train_labels = read_data('train')
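The two ideas in this step, looking up a label by the id in the filename and flattening a 2-D image into one feature vector, can be sketched with the standard library alone. The CSV layout and the `has_text` column name below are assumptions for illustration; the slides do not show the real `labels.csv`. The tiny 2x2 "image" stands in for a 256x256 one.

```python
import csv
import io

# Hypothetical labels.csv contents: image id first, class label second.
LABELS_CSV = "id,has_text\n17,1\n42,0\n"


def read_labels(csv_text):
    """Map image id -> integer class label."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {int(row["id"]): int(row["has_text"]) for row in reader}


def image_id_from_filename(filename):
    """'42.png' -> 42, the same convention as the slide's read_data()."""
    return int(filename.split('.')[0])


def flatten(image_rows):
    """Row-major flatten of a 2-D grayscale grid into a 1-D list of floats."""
    return [float(pixel) for row in image_rows for pixel in row]


labels = read_labels(LABELS_CSV)
tiny_image = [[0, 255], [128, 64]]  # stand-in for a 256x256 grayscale image
features = flatten(tiny_image)
label = labels[image_id_from_filename("42.png")]

print(features, label)
print(256 * 256)  # length of the feature vector for a real 256x256 image
```

Flattening throws away the 2-D layout: the network receives 65536 independent numbers per image, which is why the feature column on the next slide has that dimension.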

Slide 14

Slide 14 text

TRAINING THE MODEL

# one real-valued input per pixel: 256 * 256 = 65536
feature_columns = [tf.contrib.layers.real_valued_column("", dimension=65536)]

classifier = tf.contrib.learn.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[1024, 512, 256],
    n_classes=2,
    model_dir="/tmp/model_dir",
)

classifier.fit(
    x=train,
    y=train_labels,
    steps=20000
)
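A rough parameter count shows how large this network is. The sketch assumes plain fully connected layers with one bias per unit and one output unit per class; the library's actual output layer for the two-class case may be shaped differently, so treat the total as an estimate. `dense_parameter_count` is a hypothetical helper, not a TensorFlow function.

```python
def dense_parameter_count(n_inputs, hidden_units, n_outputs):
    """Weights plus biases of a stack of fully connected layers."""
    sizes = [n_inputs] + list(hidden_units) + [n_outputs]
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in zip(sizes, sizes[1:]))


n_features = 256 * 256  # flattened grayscale image
total = dense_parameter_count(n_features, [1024, 512, 256], 2)
first_layer = n_features * 1024 + 1024

print(total)                # all layers
print(first_layer)          # first hidden layer only
print(first_layer / total)  # share of parameters in the first layer
```

Under these assumptions nearly all parameters sit in the first layer, because every one of the 65536 pixels connects to each of the 1024 units. Shrinking the input images is therefore the most direct way to shrink the model.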

Slide 15

Slide 15 text

LOAD THE MODEL IN DJANGO
● Save
● Add the model directory to your code repository
● Load in Django

# save (after training)
saver = tf.train.Saver()
saver.save(session, 'my-model')

# load (in the Django app)
new_saver = tf.train.import_meta_graph('events/my-model.meta')
new_saver.restore(session, 'events/my-model')
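Restoring a model is slow compared with serving a request, so a web app would usually do it once per process rather than in every view. The sketch below shows one way to do that with a cached accessor. It is an assumption about how the pieces could be wired together, not code from the talk: `load_classifier` is a hypothetical stand-in for the restore step above, and it returns a trivial fake model so the example runs without TensorFlow.

```python
import functools

LOAD_CALLS = []  # only here to show how often loading happens


def load_classifier(model_dir):
    """Hypothetical stand-in for the expensive restore step."""
    LOAD_CALLS.append(model_dir)
    # fake model: "has text" if any pixel is non-zero
    return lambda features: 1 if sum(features) > 0 else 0


@functools.lru_cache(maxsize=None)
def get_classifier(model_dir="events/my-model"):
    """Load on first use, then reuse the same object for later calls."""
    return load_classifier(model_dir)


def has_text(features):
    """What a Django view could call for each request."""
    return get_classifier()(features)


first = has_text([0.0, 3.0])
second = has_text([0.0, 0.0])
print(first, second, len(LOAD_CALLS))
```

Both calls go through `get_classifier()`, but the loader runs only once; the second request reuses the cached model.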

Slide 16

Slide 16 text

APPLY THE MODEL

def get_new_img():
    x = []
    img_path = 'image.png'
    # same preprocessing as for training: grayscale, flattened to 1-d
    im = misc.imread(img_path, flatten=True).flatten()
    x.append(im)
    x = np.array(x)
    return x

classifier.predict(input_fn=get_new_img, as_iterable=False)

Slide 17

Slide 17 text

TENSORBOARD – MONITORING
● Open http://localhost:6006
● Monitor training at runtime

$ tensorboard --logdir /tmp/model_dir

Slide 18

Slide 18 text

TENSORBOARD – GRAPHS

Slide 19

Slide 19 text

TENSORBOARD – CLASSES

Slide 20

Slide 20 text

TENSORBOARD – CLASSES

Slide 21

Slide 21 text

PERFORMANCE
● CPU (C++ implementation – pretty efficient)
● GPU – even faster!
● JIT compiler
  – Speed things up by adding a single line of code
  – Experimental
● XLA compiler
  – Ahead-of-time compilation
  – Run on embedded devices (phones, IoT)

Slide 22

Slide 22 text

DEPLOYMENT
● Google Cloud & AWS offer VMs with GPUs
● FloydHub – Heroku for AI
  – https://www.floydhub.com/

Provider            AWS     Google   Floyd
Cost per hour ($)   0.99    0.795    0.432
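The hourly prices translate into a training budget with simple arithmetic. The sketch below uses only the three figures from the slide; the 10-hour run length is an arbitrary example, and the helper names are made up.

```python
# Hourly GPU prices in USD, as listed on the slide.
HOURLY_COST_USD = {"AWS": 0.99, "Google": 0.795, "Floyd": 0.432}


def training_cost(hours):
    """Cost per provider for a run of the given length, rounded to cents."""
    return {provider: round(rate * hours, 2)
            for provider, rate in HOURLY_COST_USD.items()}


def cheapest(hours):
    """Provider with the lowest cost for the given run length."""
    costs = training_cost(hours)
    return min(costs, key=costs.get)


costs_10h = training_cost(10)
print(costs_10h)
print(cheapest(10))
```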

Slide 23

Slide 23 text

LEARNING
● Easy riding
  – https://changelog.com/podcast/219
  – TF Dev Summit ’17 videos – https://events.withgoogle.com/tensorflow-dev-summit/
● Docs & tutorial
  – https://www.tensorflow.org/get_started/get_started
  – https://medium.freecodecamp.com/big-picture-machine-learning-classifying-text-with-neural-networks-and-tensorflow-d94036ac2274
● Good free books
  – ESL – http://statweb.stanford.edu/~tibs/ElemStatLearn/
  – Michael Nielsen – http://neuralnetworksanddeeplearning.com/
● Research
  – http://distill.pub/

Slide 24

Slide 24 text

THANKS!
● Dražen Lučanin
● @metakermit
● Building apps with a kick! https://punkrockdev.com/