
So What's the Story? by Kerryn Gammie

Pycon ZA
October 12, 2018

As the world's data grows, so does its aptitude for AI.
In the context of business, however, translating black-box magic into something more accessible for business users to engage with is tricky. While this speaks to a larger problem of upskilling and making education more accessible, one method of translation is storytelling.

I learn best when an idea is relatable, simple, and colourful. This talk is going to look at how to convey complex ideas simply. I'm going to be covering two sections:

1. That's So Random (Forests)!
2. You Gotta (Neural) Network to Get Work

I'll run through the high-level concepts and methodologies, and then show the work/code that was done to create a random forest and a neural network. Note: this will cover how I built the RF and NN using Python via Jupyter Notebook.

This session is for anyone who uses/wants to use ML to solve problems but struggles with translating the black-box-magic.
It's going to be an engaging, slightly animated talk with the intention of reinforcing concepts and showcasing different ways of explaining them.


Transcript

  1. Hello, world! I'm Kerryn. I am here because I love learning (& PowerPoint animations). As the world's data grows, so does its aptitude for AI. I believe in ideas that are powerful but not exclusionary, and in concepts instantiated through stories. You're in the right place if: 1. You like AI / want to like AI. 2. You want other people to like AI, too. 3. You like stories.
  2. What I'll Cover: 1. What's it, how's it, and why's it? Neural Nets & Random Forests. 2. Story time. 3. Tensorflow vs Keras? 4. Which model? 5. Know your audience.
  3. I like my Decision Tree. Why move to a Random Forest? Random Forests give us improved ACCURACY. Predicting discrete output variables: Fraud = 0.1, Not fraud = 0.9, so classified as NOT FRAUD. Predicting continuous output variables: R1,000 < Limit < R2,000 = 0.42.
  4. BOOTSTRAP SAMPLED: RANDOMISE AND ADD ITERATIONS TO CREATE ADDITIONAL SUBSETS OF D: D1 … DM.
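The bootstrap-sampling step above can be sketched in plain Python. This is a toy sketch, not the talk's notebook code; the function name and example values are illustrative:

```python
import random

def bootstrap_samples(data, m, seed=0):
    """Draw m bootstrap subsets D1 .. Dm, each the same size as D,
    by sampling rows from D with replacement."""
    rng = random.Random(seed)
    n = len(data)
    return [[data[rng.randrange(n)] for _ in range(n)] for _ in range(m)]

# Hypothetical dataset D (stands in for real training rows):
D = [13, 27, 21, 32, 10, 4]
subsets = bootstrap_samples(D, m=3)
# Each subset has len(D) rows; duplicates are expected because we sample
# with replacement, and rows that never get picked become "out-of-bag".
```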
  5. CREATE RANDOM SUBSETS OF D: THIS IS THE TRAINING DATA. [Diagram: each bootstrapped subset (e.g. 13 … 27, 21 … 32, 10 … 4) feeds a decision tree that splits on Variable B, Variable C, Variable B and outputs Predict N.] RANK THE CLASSIFIERS ACROSS THE DECISION TREES: THE CLASSIFIER WITH THE MOST "VOTES" = HIGHEST RANK. IN THIS CASE, IT'S VARIABLE B. THIS IS OUR BAGGED DATA (BOOTSTRAPPED AND AGGREGATED).
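The "most votes = highest rank" aggregation can be shown in a few lines. A minimal sketch (the per-tree predictions are hypothetical, not from the talk):

```python
from collections import Counter

def forest_predict(tree_votes):
    """Each decision tree in the forest casts one 'vote'; the class
    with the most votes across the trees wins (the bagging step)."""
    return Counter(tree_votes).most_common(1)[0][0]

# Hypothetical votes from four trees: Variable B is ranked highest.
votes = ["B", "B", "C", "B"]
forest_predict(votes)  # 'B'
```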
  6. RUN VALIDATION DATA THROUGH THE RANDOM FOREST ALGORITHM: OUT-OF-BAG DATA HAS NOT BEEN TRAINED ON LIKE OUR DEVELOPMENT DATA, SO WE WANT TO SEE HOW WELL THE MODEL PREDICTS IT. THIS IS HOW WE DISCERN THE ACCURACY OF OUR PREDICTIONS. COMPARE PREDICTIONS WITH ACTUALS: CONFUSION MATRIX.
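Comparing predictions with actuals on held-out (out-of-bag) rows is just a tally. A sketch with made-up labels, assuming a binary fraud / not-fraud task like the one on the earlier slide:

```python
def confusion_matrix(actual, predicted):
    """counts[(a, p)] = how often true class a was predicted as class p."""
    counts = {}
    for a, p in zip(actual, predicted):
        counts[(a, p)] = counts.get((a, p), 0) + 1
    return counts

# Hypothetical out-of-bag rows the forest never trained on:
actual    = ["fraud", "not_fraud", "not_fraud", "fraud"]
predicted = ["fraud", "not_fraud", "fraud",     "fraud"]
cm = confusion_matrix(actual, predicted)

# Accuracy = correctly classified rows / all rows.
accuracy = sum(v for (a, p), v in cm.items() if a == p) / len(actual)
```

Off-diagonal cells like `cm[("not_fraud", "fraud")]` are the false positives/negatives that a later slide asks you to weigh.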
  7. What is it? INPUT → OUTPUT. Universality Theorem: regardless of the function in question, there is a neural network that can compute/approximate it.
  8. What is it? Brain analogy. Composite function. Pattern interpreter. Computing system. An artificial intelligence technique and an application of deep learning. It has become popular in recent years due to the increase in data and technological improvement.
  9. What is it? Activation Functions: Sigmoid, Tanh, ReLU, Leaky ReLU, SoftMax, Gaussian. Activation functions make the neural net work: if you don't apply an activation function, the output function would be linear. LINEAR FUNCTION vs NON-LINEAR FUNCTION. Backprop: recalibrates the gradients of the activation function, and then re-weights and re-biases. Learning Rate: how much should we adjust our weights and biases? Momentum: how much should the previous outcome affect our weights and biases?
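The activation functions named on the slide can each be written in a line or two of standard-library Python (Gaussian omitted; this is an illustrative sketch, not the talk's code):

```python
import math

# Each activation maps a raw weighted sum to a non-linear output.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))       # squashes to (0, 1)

def tanh(x):
    return math.tanh(x)                     # squashes to (-1, 1)

def relu(x):
    return max(0.0, x)                      # zero for negative inputs

def leaky_relu(x, alpha=0.01):
    return x if x > 0 else alpha * x        # small slope instead of zero

def softmax(xs):
    # Shift by max(xs) for numerical stability; outputs sum to 1.
    exps = [math.exp(x - max(xs)) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]
```

Without one of these, stacking layers only composes linear maps, which stays linear; that is the slide's point.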
  10. Ingredients. Configuration: 1. Random w, b. 2. Activation function. 3. Adjust weights and biases. 4. Iterations and error level. Data Prep: 1. Standardise. 2. Training split. 3. Testing split. Architecture: 1. Input layer. 2. Hidden layer. 3. Output layer. Training & Testing: 1. Validate trained data (epochs). 2. Run test data (out-of-time) through the model.
  11. How it Works. [Diagram: inputs x1 = 1, x2 = 0 feed Node1, Node2, Node3 through weights w1 … w6 and biases b1, b2, b3; each node computes a weighted sum plus bias, passed through an ACTIVATION FUNCTION to give a non-linear output, then a PREDICTION.] BACKPROPAGATION: calculate the error and the delta, and adjust weights and biases. Use the learning rate and momentum to determine how big the adjustments should be. ITERATE.
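The whole loop on this slide, forward pass, error, delta, then adjustments scaled by the learning rate with momentum carrying part of the previous step, fits in a single-neuron sketch. This is a minimal toy, not the talk's notebook; the function, data, and hyperparameters are mine:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_neuron(samples, lr=0.5, momentum=0.8, epochs=2000):
    """One sigmoid neuron trained by backprop: random/zero start,
    forward pass -> error -> delta -> adjust weight and bias."""
    w, b = 0.0, 0.0
    vw = vb = 0.0                         # previous adjustments (momentum)
    for _ in range(epochs):
        for x, target in samples:
            out = sigmoid(w * x + b)      # forward pass
            # delta = error * derivative of sigmoid at the output
            delta = (out - target) * out * (1.0 - out)
            # learning rate scales the step; momentum reuses the last step
            vw = momentum * vw - lr * delta * x
            vb = momentum * vb - lr * delta
            w += vw
            b += vb
    return w, b

# Hypothetical toy task: output 1 when x = 1, output 0 when x = 0.
w, b = train_neuron([(1.0, 1.0), (0.0, 0.0)])
```

After iterating, `sigmoid(w * 1 + b)` should be near 1 and `sigmoid(b)` near 0, which is the "iterate until the error is small enough" step from slide 10.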
  12. Libraries: Tensorflow vs Keras. Whoa. Slow down. Can you build it? Build it, break it. Understand it, use it! Principally: only use it if you can build, break, and understand it.
  13. [Diagram: a network with inputs width and length in the Input Layer, a Hidden Layer, and an Output Layer producing 0 or 1.]
  14. Which Model for What? NO FREE LUNCH: no one best model; there are tradeoffs. Testing multiple models may lead to overfitting. Training many models is time-consuming and resource-intensive.
  15. Which Model for What? NO FREE LUNCH: WORK ON YOUR FEATURES & IMPROVE UPON YOUR DATA. HOW MUCH DO FALSE POSITIVES / FALSE NEGATIVES MATTER?
  16. Which Story for Which Person? Relaying your models to an audience: hello. print('hello, world!') hello, world. This is what it is. Here is how it affects us. Here's the tutorial, GitHub page, and journal article. This is the theory. Here's how it works for us.