
Practical machine learning for everyday web apps


Making TensorFlow the brain of your Django app.

A new wave of machine learning is in full swing. This talk gives an overview of the modern Python machine learning stack based on TensorFlow and my practical experiences from using it in a typical Django web app.

You've probably heard of the recent buzz surrounding deep learning, self-driving cars, Amazon Echo's speech interface, Google DeepMind's AlphaGo program beating a human Go master… The modern applications of machine learning are exploding! But can you benefit from the advances in AI in your everyday web apps? How much data do you need to be able to solve some concrete classification problems? What sort of machinery do you need to run it?

This talk will give an overview of Google's recently open-sourced TensorFlow library and how it can be used for machine learning. To keep things grounded in realistic problems, we will go through an example Django web app for finding events, where we want to classify images based on their content. We'll show how to train the desired image classifier using TensorFlow and use it to classify unknown images. We'll also cover the sort of infrastructure you need to use TensorFlow and give an overview of the available cloud services with specialised hardware support for high-performance use cases.

PyDays, 6.5.2017.
https://cfp.linuxwochen.at/de/LWW17/public/events/594

Dražen Lučanin

May 06, 2017


Transcript

  1. PRACTICAL MACHINE LEARNING
    FOR EVERYDAY WEB APPS
    Making TensorFlow the brain of your Django app
    Dražen Lučanin
    @metakermit


  2. THE INTERNET IS FOR CAT PHOTOS!


  3. IS THAT MIFFLES IN YOUR PHOTO?


  4. AI: THE NEXT BIG THING™

    Communication was the Internet’s killer app
    – web apps storing data to DBs ruled
    – Social networks, online stores, messaging apps, productivity apps, online courses, …

    My view… AI will be the next killer app
    – Faster & cheaper GPU hardware
    – Lots of R&D around machine learning
    – Cool applications

    Self-driving cars

    Good speech recognition

    Automating repetitive manual tasks

    “The secret sauce”


  5. GETTING INTO AI

    …probably a good idea

    Applying AI != researching AI

    Modern AI frameworks
    – Torch (Facebook)
    – Theano (academia)
    – TensorFlow (Google)


  6. TENSORFLOW (TF)

    Great AI framework built at Google
    – Easy for developers and researchers
    – Production-ready

    MapReduce
    – White paper only
    – Hadoop became the standard

    TF open-sourced to become the standard

    Model marketplace


  7. TF OVERVIEW

    Dataflow programming language

    Describe a graph of interacting operations that run entirely outside Python
    – Graph
    – Session

    Abstraction levels
    – Low-level API (for researchers)
    – High-level API (for getting things done)
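
To make the graph/session split concrete, here is a toy sketch in plain Python. It is not TensorFlow's API: `Node`, `placeholder`, `constant` and `ToySession` are made-up names that only mimic the idea of first describing a computation and then running it with concrete inputs.

```python
# Illustrative only: a toy deferred-execution graph, not TensorFlow's API.

class Node:
    """One operation in the graph; nothing is computed at construction time."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op          # callable applied to the evaluated inputs
        self.inputs = inputs  # upstream Node objects
        self.name = name      # set only for placeholders

    def __add__(self, other):
        return Node(lambda a, b: a + b, (self, other))

    def __mul__(self, other):
        return Node(lambda a, b: a * b, (self, other))


def placeholder(name):
    """A value supplied later, when the graph is run."""
    return Node(None, name=name)


def constant(value):
    """A fixed value baked into the graph."""
    return Node(lambda: value)


class ToySession:
    """Walks the graph and evaluates it for a given feed dictionary."""

    def run(self, node, feed=None):
        feed = feed or {}
        if node.name is not None:
            return feed[node.name]
        args = [self.run(i, feed) for i in node.inputs]
        return node.op(*args)


# Phase 1: describe the computation (no arithmetic happens here).
x = placeholder("x")
model = constant(3.0) * x + constant(1.0)

# Phase 2: run it with concrete inputs.
session = ToySession()
result = session.run(model, {"x": 2.0})
print(result)  # 7.0
```

The real library follows the same two phases, but hands the graph to its own runtime instead of evaluating it in Python.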


  8. LOW-LEVEL API
    import numpy as np
    import tensorflow as tf

    # Model parameters and inputs
    W = tf.Variable([.3], dtype=tf.float32)
    b = tf.Variable([-.3], dtype=tf.float32)
    x = tf.placeholder(tf.float32); y = tf.placeholder(tf.float32)
    linear_model = W * x + b
    # Loss: sum of squared errors
    loss = tf.reduce_sum(tf.square(linear_model - y))
    optimizer = tf.train.GradientDescentOptimizer(0.01)
    train = optimizer.minimize(loss)
    # Training data
    x_train = [1,2,3,4]
    y_train = [0,-1,-2,-3]
    # Training loop
    init = tf.global_variables_initializer()
    sess = tf.Session()
    sess.run(init)
    for i in range(1000):
        sess.run(train, {x:x_train, y:y_train})
    # Report the fitted parameters and the final loss
    curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x:x_train, y:y_train})
    print("W: %s b: %s loss: %s"%(curr_W, curr_b, curr_loss))
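
For intuition about what the optimizer does on each step, the same fit can be sketched in plain Python. This is only an illustration of gradient descent on the sum-of-squares loss, with the gradients written out by hand; it is not how TensorFlow computes them internally. The training data lies exactly on y = -x + 1, so W should approach -1 and b should approach 1.

```python
# Plain-Python version of the same fit, for illustration only.
x_train = [1, 2, 3, 4]
y_train = [0, -1, -2, -3]

W, b = 0.3, -0.3          # same starting values as the slide
learning_rate = 0.01

for _ in range(1000):
    # Residuals of the linear model W * x + b against the targets.
    errors = [W * x + b - y for x, y in zip(x_train, y_train)]
    # Gradients of loss = sum(error ** 2) with respect to W and b.
    grad_W = sum(2 * e * x for e, x in zip(errors, x_train))
    grad_b = sum(2 * e for e in errors)
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b

loss = sum((W * x + b - y) ** 2 for x, y in zip(x_train, y_train))
print("W: %.4f b: %.4f loss: %.2e" % (W, b, loss))
```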


  9. EVENTS EXAMPLE

    AI for event discovery
    – Django web app ( https://www.posterbat.com )
    – Side-project I’m working on

    Baby steps
    – Scraped some event cover images from the web
    – Does an image have text in it?

    AI can be used on everyday problems

    Not only for cutting-edge research problems
    – e.g. speech recognition


  10. GETTING DATA FROM DJANGO
    $ ./manage.py shell -c 'from export_images import run; run()'

    # export_images.py
    from events.models import Event

    def run():
        # Export a normalised copy of every event's cover image
        for event in Event.objects.all():
            prepare_event_image(event)


  11. NORMALISING IMAGES USING PIL
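    from PIL import Image

    def prepare_event_image(event):
        size = (256, 256)
        try:
            image = Image.open(event.img)
        except ValueError:
            # Skip events whose image cannot be opened
            return
        # Shrink in place to fit within 256x256, keeping the aspect ratio
        # (LANCZOS is the current name of the ANTIALIAS filter)
        image.thumbnail(size, Image.LANCZOS)
        # Crop to exactly 256x256
        region = image.crop((0, 0, *size))
        # Open the output file last so skipped events leave no empty file
        with open(f'./images/{event.id}.png', 'wb') as f:
            region.save(f, 'png')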


  12. LABELING CLASSES


  13. IMPORTING DATA INTO TF
    import os
    import numpy as np
    import pandas as pd
    import tensorflow as tf
    from scipy import misc

    labels = pd.read_csv('./labels.csv', index_col=0)

    def read_data(folder):
        path = './data/images/' + folder + '/'
        x = []; y = []
        for filename in os.listdir(path):
            image_id = int(filename.split('.')[0])
            # convert input 256x256 image to grayscale
            # flatten to a 1-d array of floats (0-255)
            im = misc.imread(
                path + filename, flatten=True
            ).flatten()
            x.append(im)
            # look up the label by image id (label-based indexing)
            y.append(int(labels.loc[image_id]))
        x = np.array(x)
        return tf.constant(x), tf.constant(y)

    train, train_labels = read_data('train')
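
The slides don't show what labels.csv looks like, so the sketch below assumes a layout: the image id in the first column and a 0/1 class in the second. The column names `image_id` and `has_text` and the sample rows are hypothetical. It also spells out where the feature count comes from: a 256x256 grayscale image flattened to one dimension has 65536 values, which is the `dimension` used on the next slide.

```python
import csv
import io

# Hypothetical contents of labels.csv: image id, then 1 if the image
# contains text and 0 otherwise. The real file is not shown in the slides.
LABELS_CSV = """image_id,has_text
101,1
102,0
103,1
"""


def load_labels(csv_text):
    """Map each image id to its integer class label."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {int(row["image_id"]): int(row["has_text"]) for row in reader}


labels = load_labels(LABELS_CSV)

# A 256x256 grayscale image flattened to one dimension has this many features.
IMAGE_SIZE = (256, 256)
feature_count = IMAGE_SIZE[0] * IMAGE_SIZE[1]

print(labels[102])     # 0
print(feature_count)   # 65536
```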


  14. TRAINING THE MODEL
    feature_columns = [
        tf.contrib.layers.real_valued_column("", dimension=65536)
    ]
    classifier = tf.contrib.learn.DNNClassifier(
        feature_columns=feature_columns,
        hidden_units=[1024, 512, 256],
        n_classes=2,
        model_dir="/tmp/model_dir",
    )
    classifier.fit(
        x=train, y=train_labels, steps=20000
    )


  15. LOAD THE MODEL IN DJANGO

    Save
        saver = tf.train.Saver()
        saver.save(session, 'my-model')

    Add the model directory to your code repository

    Load in Django
        new_saver = tf.train.import_meta_graph('events/my-model.meta')
        new_saver.restore(session, 'events/my-model')
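
One design choice worth considering, which the slides don't cover: restoring a model can be slow, so a web app would usually load it once per process and reuse it across requests. A minimal sketch of that pattern follows. `load_classifier`, `get_classifier` and `classify` are hypothetical names, and the "model" is a stand-in function rather than a restored TensorFlow session.

```python
import functools


def load_classifier():
    """Hypothetical loader; stands in for restoring the saved TF model."""
    load_classifier.calls += 1
    # Placeholder model: flags an image as "has text" if its mean pixel
    # value is above 127. A real loader would restore the trained model.
    return lambda pixels: int(sum(pixels) / len(pixels) > 127)


load_classifier.calls = 0  # counts how often the expensive load runs


@functools.lru_cache(maxsize=1)
def get_classifier():
    """Load the model on first use and reuse it for later requests."""
    return load_classifier()


def classify(pixels):
    """What a view function might call for each uploaded image."""
    return get_classifier()(pixels)


print(classify([200, 210, 190]))  # 1
print(classify([10, 20, 30]))     # 0
print(load_classifier.calls)      # 1
```

The second call reuses the cached model, which is why the load counter stays at 1.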


  16. APPLY THE MODEL
    def get_new_img():
        x = []
        img_path = 'image.png'
        im = misc.imread(img_path, flatten=True).flatten()
        x.append(im)
        x = np.array(x)
        return x

    classifier.predict(input_fn=get_new_img, as_iterable=False)


  17. TENSORBOARD – MONITORING

    Monitor training at runtime
    $ tensorboard --logdir /tmp/model_dir

    Open http://localhost:6006


  18. TENSORBOARD – GRAPHS


  19. TENSORBOARD – CLASSES


  20. TENSORBOARD – CLASSES


  21. PERFORMANCE

    CPU (C++ implementation – pretty efficient)

    GPU – even faster!

    JIT compiler
    – Speed things up by adding a single line of code
    – Experimental

    XLA compiler
    – Ahead-of-time compilation
    – Run on embedded devices (phones, IoT)


  22. DEPLOYMENT

    Google Cloud & AWS offer VMs with GPUs

    FloydHub
    – Heroku for AI
    – https://www.floydhub.com/

    Provider            AWS     Google   Floyd
    Cost per hour ($)   0.99    0.795    0.432
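
To get a feel for what these rates mean for a training job, here is a small worked example. The hourly prices are the ones quoted on the slide; the 10-hour job length is a made-up figure for illustration.

```python
# Hourly GPU prices in USD, as quoted on the slide (2017 figures).
HOURLY_PRICE = {"AWS": 0.99, "Google": 0.795, "Floyd": 0.432}


def training_cost(hours):
    """Return the cost per provider for a job of the given length."""
    return {name: round(price * hours, 2) for name, price in HOURLY_PRICE.items()}


costs = training_cost(10)   # hypothetical 10-hour training run
cheapest = min(costs, key=costs.get)

for name, cost in costs.items():
    print("%-7s $%.2f" % (name, cost))
print("cheapest:", cheapest)
```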


  23. LEARNING

    Easy riding
    – https://changelog.com/podcast/219
    – TF Dev Summit ’17 videos – https://events.withgoogle.com/tensorflow-dev-summit/

    Docs & tutorial
    – https://www.tensorflow.org/get_started/get_started
    – https://medium.freecodecamp.com/big-picture-machine-learning-classifying-text-with-neural-networks-and-tensorflow-d94036ac2274

    Good free books
    – ESL – http://statweb.stanford.edu/~tibs/ElemStatLearn/
    – Michael Nielsen – http://neuralnetworksanddeeplearning.com/

    Research
    – http://distill.pub/


  24. THANKS!

    Dražen Lučanin

    @metakermit

    Building apps with a kick!
    https://punkrockdev.com/
