Practical machine learning for everyday web apps

PRACTICAL MACHINE LEARNING FOR EVERYDAY WEB APPS
Making TensorFlow the brain of your Django app
Dražen Lučanin
@metakermit

THE INTERNET IS FOR CAT PHOTOS!

IS THAT MIFFLES IN YOUR PHOTO?

AI: THE NEXT BIG THING™
• Communication was the Internet’s killer app
  – Web apps storing data to DBs ruled
  – Social networks, online stores, messaging apps, productivity apps, online courses, …
• My view… AI will be the next killer app
  – Faster & cheaper GPU hardware
  – Lots of R&D around machine learning
  – Cool applications
    • Self-driving cars
    • Good speech recognition
    • Automating repetitive manual tasks
• “The secret sauce”

GETTING INTO AI
• …probably a good idea
• Applying AI != researching AI
• Modern AI frameworks
  – Torch (Facebook)
  – Theano (academia)
  – TensorFlow (Google)

TENSORFLOW (TF)
• Great AI framework built at Google
  – Easy for developers and researchers
  – Production-ready
• MapReduce
  – White paper only
  – Hadoop became the standard
• TF open-sourced to become the standard
• Model marketplace

TF OVERVIEW
• Dataflow programming language
• Describe a graph of interacting operations that run entirely outside Python
  – Graph
  – Session
• Abstraction levels
  – Low-level API (for researchers)
  – High-level API (GTD: getting things done)
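The graph/session split can be illustrated with a toy sketch in plain Python. The `Node`, `Placeholder` and `Session` classes below are hypothetical stand-ins written for this note, not TensorFlow's API; they only show the idea of describing the computation first and running it later.

```python
# Toy illustration only: these classes are hypothetical, not TensorFlow's API.

class Node:
    """One operation in the graph; building it performs no computation."""
    def __init__(self, op, *inputs):
        self.op = op
        self.inputs = inputs

class Placeholder(Node):
    """A named input whose value is supplied when the graph is run."""
    def __init__(self, name):
        super().__init__(None)
        self.name = name

class Session:
    """Evaluates nodes on demand, given values for the placeholders."""
    def run(self, node, feed):
        if isinstance(node, Placeholder):
            return feed[node.name]
        args = [self.run(inp, feed) for inp in node.inputs]
        return node.op(*args)

# Phase 1: describe the graph y = (a + b) * a. Nothing is computed yet.
a = Placeholder("a")
b = Placeholder("b")
total = Node(lambda u, v: u + v, a, b)
y = Node(lambda u, v: u * v, total, a)

# Phase 2: run the graph in a session with concrete inputs.
result = Session().run(y, {"a": 2, "b": 3})
print(result)  # (2 + 3) * 2 = 10
```

In TensorFlow itself the evaluation happens in the C++ runtime rather than in Python, which is what "run entirely outside Python" refers to.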

LOW-LEVEL API
    import numpy as np
    import tensorflow as tf

    # Model parameters (TensorFlow 1.x API)
    W = tf.Variable([.3], dtype=tf.float32)
    b = tf.Variable([-.3], dtype=tf.float32)
    # Model input and expected output
    x = tf.placeholder(tf.float32)
    y = tf.placeholder(tf.float32)
    linear_model = W * x + b
    # Loss: sum of squared errors
    loss = tf.reduce_sum(tf.square(linear_model - y))
    optimizer = tf.train.GradientDescentOptimizer(0.01)
    train = optimizer.minimize(loss)
    # Training data
    x_train = [1, 2, 3, 4]
    y_train = [0, -1, -2, -3]
    # Training loop
    init = tf.global_variables_initializer()
    sess = tf.Session()
    sess.run(init)
    for i in range(1000):
        sess.run(train, {x: x_train, y: y_train})
    # Evaluate the trained parameters
    curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x: x_train, y: y_train})
    print("W: %s b: %s loss: %s" % (curr_W, curr_b, curr_loss))
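To see what the training loop on this slide computes, here is a minimal plain-Python sketch of the same gradient descent, with the gradients written out by hand. It assumes nothing beyond the slide's numbers (start values, learning rate, data) and does not use TensorFlow. The training points lie exactly on y = -x + 1, so W should approach -1 and b should approach 1.

```python
# Plain-Python sketch of the same training loop (no TensorFlow involved).
x_train = [1, 2, 3, 4]
y_train = [0, -1, -2, -3]

W, b = 0.3, -0.3        # same starting values as the slide
learning_rate = 0.01    # same step size as GradientDescentOptimizer(0.01)

def loss(W, b):
    """Sum of squared errors, matching tf.reduce_sum(tf.square(...))."""
    return sum((W * x + b - y) ** 2 for x, y in zip(x_train, y_train))

for _ in range(1000):
    # Gradients of the loss with respect to W and b
    grad_W = sum(2 * (W * x + b - y) * x for x, y in zip(x_train, y_train))
    grad_b = sum(2 * (W * x + b - y) for x, y in zip(x_train, y_train))
    # Both gradients are computed before either parameter is updated
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b

print("W: %.4f b: %.4f loss: %.2e" % (W, b, loss(W, b)))
```

TensorFlow derives these gradients automatically from the graph; the hand-written version is only meant to make the arithmetic visible.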

EVENTS EXAMPLE
• AI for event discovery
  – Django web app ( https://www.posterbat.com )
  – Side-project I’m working on
• Baby steps
  – Scraped some event cover images from the web
  – Does an image have text in it?
• AI can be used on everyday problems
• Not only for cutting-edge research problems
  – e.g. speech recognition

GETTING DATA FROM DJANGO
    $ ./manage.py shell -c 'from export_images import run; run()'

    from events.models import Event

    def run():
        for event in Event.objects.all():
            prepare_event_image(event)

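NORMALISING IMAGES USING PIL
    from PIL import Image

    def prepare_event_image(event):
        size = (256, 256)
        try:
            image = Image.open(event.img)
        except ValueError:
            # Skip events whose image cannot be opened
            return
        # Shrink in place to fit within 256x256, keeping the aspect ratio
        # (Image.LANCZOS is the current name of the Image.ANTIALIAS filter)
        image.thumbnail(size, Image.LANCZOS)
        # Crop to exactly 256x256
        region = image.crop((0, 0, *size))
        # Open the output file only once there is an image to write
        with open(f'./images/{event.id}.png', 'wb') as f:
            region.save(f, 'png')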

LABELING CLASSES

IMPORTING DATA INTO TF
    import os
    import numpy as np
    import pandas as pd
    import tensorflow as tf
    from scipy import misc

    labels = pd.read_csv('./labels.csv', index_col=0)

    def read_data(folder):
        path = './data/images/' + folder + '/'
        x = []
        y = []
        for filename in os.listdir(path):
            image_id = int(filename.split('.')[0])
            # convert input 256x256 image to grayscale
            # flatten to a 1-d array of floats (0-255)
            im = misc.imread(path + filename, flatten=True).flatten()
            x.append(im)
            # look up the label by image id (.loc replaces the deprecated .ix)
            y.append(int(labels.loc[image_id]))
        x = np.array(x)
        return tf.constant(x), tf.constant(y)

    train, train_labels = read_data('train')
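The flattening step is why the feature column on the training slide has dimension=65536: a 256x256 grayscale image becomes one row of 256 * 256 values. A small sketch of that arithmetic, using a made-up pixel grid in plain Python instead of a real image (the grid contents are an assumption for illustration only):

```python
# Hypothetical stand-in for a grayscale image: a 256x256 grid of pixel values.
HEIGHT, WIDTH = 256, 256
image = [[float((row + col) % 256) for col in range(WIDTH)] for row in range(HEIGHT)]

def flatten(grid):
    """Concatenate the rows into one flat list (row-major order)."""
    return [pixel for row in grid for pixel in row]

features = flatten(image)
print(len(features))  # 256 * 256 = 65536

# Pixel (row, col) ends up at index row * WIDTH + col
row, col = 3, 5
assert features[row * WIDTH + col] == image[row][col]
```

NumPy's `.flatten()` uses the same row-major order by default, so every image maps its pixels to the same feature positions.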

TRAINING THE MODEL
    # One real-valued feature per pixel: 256 * 256 = 65536
    feature_columns = [tf.contrib.layers.real_valued_column("", dimension=65536)]
    classifier = tf.contrib.learn.DNNClassifier(
        feature_columns=feature_columns,
        hidden_units=[1024, 512, 256],
        n_classes=2,
        model_dir="/tmp/model_dir",
    )
    classifier.fit(
        x=train,
        y=train_labels,
        steps=20000
    )

LOAD THE MODEL IN DJANGO
• Save
    saver = tf.train.Saver()
    saver.save(session, 'my-model')
• Add the model directory to your code repository
• Load in Django
    new_saver = tf.train.import_meta_graph('events/my-model.meta')
    new_saver.restore(session, 'events/my-model')
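Restoring a model is slow compared to serving a request, so one common pattern is to restore it once per process and reuse it. The sketch below shows that idea with `functools.lru_cache`. The names `restore_model`, `get_model` and `classify_view` are hypothetical, and the stub stands in for the `import_meta_graph` / `restore` calls above; this is one possible arrangement, not necessarily how the app in the talk does it.

```python
import functools

# Hypothetical stand-in for the expensive restore step; in the app this would
# wrap the tf.train.import_meta_graph / restore calls from the slide.
load_count = 0

def restore_model(path):
    global load_count
    load_count += 1
    return {"path": path}  # placeholder object standing in for a restored model

@functools.lru_cache(maxsize=None)
def get_model(path="events/my-model"):
    """Restore the model on first use and reuse it for later calls."""
    return restore_model(path)

def classify_view(request=None):
    """Hypothetical view: every request shares the cached model."""
    model = get_model()
    return model["path"]

classify_view()
classify_view()
print(load_count)  # 1: the model was restored only once
```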

APPLY THE MODEL
    def get_new_img():
        x = []
        img_path = 'image.png'
        im = misc.imread(img_path, flatten=True).flatten()
        x.append(im)
        x = np.array(x)
        return x

    classifier.predict(input_fn=get_new_img, as_iterable=False)

TENSORBOARD – MONITORING
• Open http://localhost:6006
• Monitor training at runtime
    $ tensorboard --logdir /tmp/model_dir

TENSORBOARD – GRAPHS

TENSORBOARD – CLASSES

PERFORMANCE
• CPU (C++ implementation – pretty efficient)
• GPU – even faster!
• JIT compiler
  – Speed things up by adding a single line of code
  – Experimental
• XLA compiler
  – Ahead-of-time compilation
  – Run on embedded devices (phones, IoT)

DEPLOYMENT
• Google Cloud & AWS offer VMs with GPUs
• FloydHub – Heroku for AI
  – https://www.floydhub.com/

Provider            AWS     Google   Floyd
Cost per hour ($)   0.99    0.795    0.432

LEARNING
• Easy riding
  – https://changelog.com/podcast/219
  – TF Dev Summit ’17 videos – https://events.withgoogle.com/tensorflow-dev-summit/
• Docs & tutorial
  – https://www.tensorflow.org/get_started/get_started
  – https://medium.freecodecamp.com/big-picture-machine-learning-classifying-text-with-neural-networks-and-tensorflow-d94036ac2274
• Good free books
  – ESL – http://statweb.stanford.edu/~tibs/ElemStatLearn/
  – Michael Nielsen – http://neuralnetworksanddeeplearning.com/
• Research
  – http://distill.pub/

THANKS!
• Dražen Lučanin
• @metakermit
• Building apps with a kick! https://punkrockdev.com/
