Practical machine learning for everyday web apps

Making TensorFlow the brain of your Django app.

A new wave of machine learning is in full swing. This talk gives an overview of the modern Python machine learning stack based on TensorFlow and my practical experiences from using it in a typical Django web app.

You've probably heard of the recent buzz surrounding deep learning, self-driving cars, Amazon Echo's speech interface, Google DeepMind's AlphaGo program beating a human Go master… The modern applications of machine learning are exploding! But can you benefit from the advances in AI in your everyday web apps? How much data do you need to be able to solve some concrete classification problems? What sort of machinery do you need to run it?

This talk will give an overview of Google's recently open-sourced TensorFlow library and how it can be used for machine learning. To keep things grounded in realistic problems, we will go through an example Django web app for finding events, where we want to classify images based on their content. We'll show how to train the desired image classifier using TensorFlow and use it to classify unknown images. We'll also cover the sort of infrastructure you need to use TensorFlow and give an overview of the available cloud services with specialised hardware support for high-performance use cases.

PyDays, 6.5.2017.
https://cfp.linuxwochen.at/de/LWW17/public/events/594

Dražen Lučanin

May 06, 2017

Transcript

  1. PRACTICAL MACHINE LEARNING FOR EVERYDAY WEB APPS

    Making TensorFlow the brain of your Django app
    Dražen Lučanin @metakermit
  2. AI: THE NEXT BIG THING™

    • Communication was the Internet’s killer app
      – web apps storing data to DBs ruled
      – Social networks, online stores, messaging apps, productivity apps, online courses, …
    • My view… AI will be the next killer app
      – Faster & cheaper GPU hardware
      – Lots of R&D around machine learning
      – Cool applications
        • Self-driving cars
        • Good speech recognition
        • Automating repetitive manual tasks
        • “The secret sauce”
  3. GETTING INTO AI

    • …probably a good idea
    • Applying AI != researching AI
    • Modern AI frameworks
      – Torch (Facebook)
      – Theano (academia)
      – TensorFlow (Google)
  4. TENSORFLOW (TF)

    • Great AI framework built at Google
      – Easy for developers and researchers
      – Production-ready
    • MapReduce
      – White paper only
      – Hadoop became the standard
    • TF open sourced to become the standard
    • Model marketplace
  5. TF OVERVIEW

    • DataFlow programming language
    • describe a graph of interacting operations that run entirely outside Python
      – Graph
      – Session
    • Abstraction levels
      – Low-level API (for researchers)
      – High-level API (GTD)
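To make the graph/session split on this slide concrete, here is a minimal pure-Python sketch of deferred evaluation. The `Node`, `constant`, `placeholder` and `run` names are hypothetical stand-ins invented for illustration. This is not how TensorFlow is implemented, only the same idea in miniature: building the graph computes nothing, and values appear only when the graph is run with inputs.

```python
# Toy dataflow graph: building nodes does no arithmetic; run() evaluates them.
# All names here are illustrative assumptions, not TensorFlow APIs.

class Node:
    def __init__(self, op, inputs=(), value=None, name=None):
        self.op = op          # "const", "placeholder", "add" or "mul"
        self.inputs = inputs  # upstream nodes this node depends on
        self.value = value    # only used by "const" nodes
        self.name = name      # only used by "placeholder" nodes

    def __add__(self, other):
        return Node("add", (self, other))

    def __mul__(self, other):
        return Node("mul", (self, other))


def constant(value):
    return Node("const", value=value)


def placeholder(name):
    return Node("placeholder", name=name)


def run(node, feed):
    """Evaluate a node recursively, reading placeholder values from feed."""
    if node.op == "const":
        return node.value
    if node.op == "placeholder":
        return feed[node.name]
    left, right = (run(n, feed) for n in node.inputs)
    return left + right if node.op == "add" else left * right


# Describe the graph once (nothing is computed yet) ...
W, b, x = constant(2.0), constant(1.0), placeholder("x")
linear_model = W * x + b

# ... then evaluate it as often as needed with different inputs.
result_a = run(linear_model, {"x": 3.0})
result_b = run(linear_model, {"x": -1.0})
```

The same `linear_model` object is reused for both evaluations, which mirrors the slide's separation between describing a graph and running it in a session.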
  6. LOW-LEVEL API

    import numpy as np
    import tensorflow as tf

    # model parameters and inputs (TensorFlow 1.x graph API)
    W = tf.Variable([.3], dtype=tf.float32)
    b = tf.Variable([-.3], dtype=tf.float32)
    x = tf.placeholder(tf.float32)
    y = tf.placeholder(tf.float32)
    linear_model = W * x + b

    # sum-of-squares loss, minimised by gradient descent
    loss = tf.reduce_sum(tf.square(linear_model - y))
    optimizer = tf.train.GradientDescentOptimizer(0.01)
    train = optimizer.minimize(loss)

    # training data
    x_train = [1,2,3,4]
    y_train = [0,-1,-2,-3]

    # training loop
    init = tf.global_variables_initializer()
    sess = tf.Session()
    sess.run(init)
    for i in range(1000):
        sess.run(train, {x:x_train, y:y_train})

    # report the fitted parameters and the final loss
    curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x:x_train, y:y_train})
    print("W: %s b: %s loss: %s"%(curr_W, curr_b, curr_loss))
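For readers who want to see what the training loop on this slide does numerically, the following is a minimal pure-Python sketch of the same fit: a line `W * x + b`, a sum-of-squares loss, and plain gradient descent with the same learning rate, data and starting values as the slide. The gradients are written out by hand here, whereas TensorFlow derives them from the graph. The function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys, w=0.3, b=-0.3, learning_rate=0.01, steps=1000):
    """Minimise sum((w * x + b - y) ** 2) with plain gradient descent."""
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs))  # d(loss)/dw
        grad_b = 2 * sum(errors)                             # d(loss)/db
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    loss = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys))
    return w, b, loss


w, b, loss = fit_line([1, 2, 3, 4], [0, -1, -2, -3])
print("W: %s b: %s loss: %s" % (w, b, loss))
```

The training points lie exactly on the line y = -x + 1, so the fit should approach W = -1 and b = 1 with a loss close to zero.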
  7. EVENTS EXAMPLE

    • AI for event discovery
      – Django web app ( https://www.posterbat.com )
      – Side-project I’m working on
    • Baby steps
      – Scraped some event cover images from the web
      – Does an image have text in it?
    • AI can be used on everyday problems
    • Not only for cutting edge research problems
      – e.g. speech recognition
  8. GETTING DATA FROM DJANGO

    $ ./manage.py shell -c 'from export_images import run; run()'

    from events.models import Event

    def run():
        for event in Event.objects.all():
            prepare_event_image(event)
  9. NORMALISING IMAGES USING PIL

    from PIL import Image

    def prepare_event_image(event):
        with open(f'./images/{event.id}.png', 'wb') as f:
            size = (256, 256)
            try:
                image = Image.open(event.img)
            except ValueError:
                return
            image.thumbnail(size, Image.ANTIALIAS)
            region = image.crop((0, 0, *size))
            region.save(f, 'png')
  10. IMPORTING DATA INTO TF

    labels = pd.read_csv('./labels.csv', index_col=0)

    def read_data(folder):
        path = './data/images/' + folder + '/'
        x = []; y = []
        for filename in os.listdir(path):
            image_id = int(filename.split('.')[0])
            # convert input 256x256 image to grayscale
            # flatten to a 1-d array of floats (0-255)
            im = misc.imread(
                path + filename, flatten=True
            ).flatten()
            x.append(im)
            y.append(int(labels.ix[image_id]))
        x = np.array(x)
        return tf.constant(x), tf.constant(y)

    train, train_labels = read_data('train')
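The pairing of image files with labels in `read_data` can be sketched without TensorFlow. The example below assumes a hypothetical `labels.csv` layout with an `id` column and a 0/1 `has_text` column (the talk does not show the file), and uses an in-memory string and a hard-coded filename list in place of the real file and directory listing.

```python
import csv
import io

# Hypothetical stand-in for labels.csv; the real column names are not shown in the talk.
LABELS_CSV = """id,has_text
12,1
15,0
27,1
"""


def load_labels(csv_text):
    """Map image id -> integer label."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {int(row["id"]): int(row["has_text"]) for row in reader}


def pair_files_with_labels(filenames, labels):
    """Return (filename, label) pairs, skipping files with no known label."""
    pairs = []
    for filename in sorted(filenames):
        image_id = int(filename.split(".")[0])  # "12.png" -> 12, as on the slide
        if image_id in labels:
            pairs.append((filename, labels[image_id]))
    return pairs


labels = load_labels(LABELS_CSV)
pairs = pair_files_with_labels(["27.png", "12.png", "99.png"], labels)
```

Skipping unlabelled files is a choice made for this sketch; the slide's code looks up every file's id directly and would fail on a missing one.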
  11. TRAINING THE MODEL

    feature_columns = [tf.contrib.layers.real_valued_column("", dimension=65536)]
    classifier = tf.contrib.learn.DNNClassifier(
        feature_columns=feature_columns,
        hidden_units=[1024, 512, 256],
        n_classes=2,
        model_dir="/tmp/model_dir",
    )
    classifier.fit(
        x=train,
        y=train_labels,
        steps=20000
    )
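The `dimension=65536` value matches the 256 × 256 grayscale images from the earlier slides, flattened to one value per pixel. As a rough sense of model size, the sketch below counts weights and biases for the hidden layers on this slide, assuming plain fully connected layers with one bias per unit. The output layer is left out because its exact shape inside `DNNClassifier` is not shown in the talk, and the helper name `dense_parameter_count` is hypothetical.

```python
def dense_parameter_count(input_size, hidden_units):
    """Count weights and biases for a stack of fully connected layers."""
    total = 0
    previous = input_size
    for units in hidden_units:
        total += previous * units + units  # weight matrix plus one bias per unit
        previous = units
    return total


input_size = 256 * 256  # flattened grayscale pixels
first_layer = dense_parameter_count(input_size, [1024])
hidden_total = dense_parameter_count(input_size, [1024, 512, 256])
print(input_size, first_layer, hidden_total)
```

Under these assumptions nearly all of the parameters sit in the first layer, so the input resolution dominates the size of the model.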
  12. LOAD THE MODEL IN DJANGO

    • Save
    • Add the model directory to your code repository
    • Load in Django

    saver = tf.train.Saver()
    saver.save(session, 'my-model')

    new_saver = tf.train.import_meta_graph('events/my-model.meta')
    new_saver.restore(session, 'events/my-model')
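In a web app the model is usually loaded once per process rather than on every request. The talk does not show how this is wired into Django, so the following is only a sketch of one common pattern, using `functools.lru_cache` and a hypothetical `load_model` function. A placeholder classifier stands in for the restored TensorFlow session, and its threshold rule is invented purely so the example runs.

```python
import functools

LOAD_COUNT = 0  # only here to show that loading happens once


@functools.lru_cache(maxsize=None)
def load_model(path):
    """Hypothetical loader; a real one would restore the saved model from path."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    # Stand-in "model": returns 1 when the mean pixel value is below 128.
    return lambda pixels: int(sum(pixels) / len(pixels) < 128)


def classify(pixels, path="events/my-model"):
    """What a view might call on each request; the cached model is reused."""
    model = load_model(path)
    return model(pixels)


first = classify([0, 10, 20])
second = classify([250, 255, 240])
```

Both calls share one loaded model, so the cost of restoring it is paid once per process instead of once per request.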
  13. APPLY THE MODEL

    def get_new_img():
        x = []
        img_path = 'image.png'
        im = misc.imread(img_path, flatten=True).flatten()
        x.append(im)
        x = np.array(x)
        return x

    classifier.predict(input_fn=get_new_img, as_iterable=False)
  14. PERFORMANCE

    • CPU (C++ implementation – pretty efficient)
    • GPU – even faster!
    • JIT compiler
      – Speed things up by adding a single line of code
      – Experimental
    • XLA compiler
      – Ahead-of-time compilation
      – Run on embedded devices (phones, IoT)
  15. DEPLOYMENT

    • Google Cloud & AWS offer VMs with GPUs
    • FloydHub – Heroku for AI
      – https://www.floydhub.com/

    Provider            AWS    Google   Floyd
    Cost per hour ($)   0.99   0.795    0.432
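The hourly prices on this slide translate directly into a budget estimate. The sketch below uses the slide's 2017 prices; the 10-hour training run is a made-up figure for illustration, and the `training_cost` helper is hypothetical.

```python
# Hourly GPU prices in USD as listed on the slide (2017).
PRICE_PER_HOUR = {"AWS": 0.99, "Google": 0.795, "Floyd": 0.432}


def training_cost(hours):
    """Return (provider, cost) pairs for a run of the given length, cheapest first."""
    costs = {name: round(price * hours, 2) for name, price in PRICE_PER_HOUR.items()}
    return sorted(costs.items(), key=lambda item: item[1])


estimate = training_cost(10)  # hypothetical 10-hour run
for provider, cost in estimate:
    print(f"{provider}: ${cost:.2f}")
```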
  16. LEARNING

    • Easy riding
      – https://changelog.com/podcast/219
      – TF Dev Summit ’17 videos – https://events.withgoogle.com/tensorflow-dev-summit/
    • Docs & tutorial
      – https://www.tensorflow.org/get_started/get_started
      – https://medium.freecodecamp.com/big-picture-machine-learning-classifying-text-with-neural-networks-and-tensorflow-d94036ac2274
    • Good free books
      – ESL – http://statweb.stanford.edu/~tibs/ElemStatLearn/
      – Michael Nielsen – http://neuralnetworksanddeeplearning.com/
    • Research
      – http://distill.pub/