Machine Learning for the rescue

Gathering data is not a problem today. The bigger challenge is understanding that information and drawing conclusions from it. Fortunately, we can use techniques such as machine learning to “teach” a computer how to learn from our data. Fast artificial neural networks, random forests, SVMs, classification, clustering: these are just a few of the concepts ready to use. We will apply these solutions to a PHP application to deliver automatic insights and predictions and create real business value for a client. By the end of this session you will be familiar with machine learning ideas and prepared to solve seemingly unsolvable problems in PHP.

Mariusz Gil

June 24, 2016

Transcript

  1. Machine Learning for a RESCUE
    Mariusz Gil

  2. CLIENT PROBLEM

  3. 1M BACKLINKS
    CLASSIFY THEM

  4. OK
    NOT OK
    I DON’T CARE

  7. T(URL) → [1, 2, 3, …]
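
Slide 7 describes a transformation T that turns a URL into a vector of numbers a classifier can consume. The talk does not list the actual features, so the sketch below uses hypothetical ones (URL length, dots in the host, path depth, query parameter count, HTTPS flag) purely to show the shape of such a function.

```php
<?php
// Hypothetical feature extractor: the chosen features are illustrative
// assumptions, not the ones used in the talk.
function urlToFeatures(string $url): array
{
    $parts = parse_url($url);
    if ($parts === false) {
        $parts = []; // malformed URL: fall back to empty components
    }
    $host  = $parts['host'] ?? '';
    $path  = $parts['path'] ?? '';
    $query = $parts['query'] ?? '';

    parse_str($query, $params);

    return [
        strlen($url),                                        // total length
        substr_count($host, '.'),                            // dots in the host name
        count(array_filter(explode('/', $path), 'strlen')),  // non-empty path segments
        count($params),                                      // query parameters
        (($parts['scheme'] ?? '') === 'https') ? 1 : 0,      // served over HTTPS?
    ];
}

print_r(urlToFeatures('https://blog.example.com/2016/06/post?id=7&ref=home'));
```

Every URL maps to a vector of the same length, which is what the classifiers shown later in the deck expect as input.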

  8. IF-OLOGY
    UGLY CODE
    FOR POC
    1ST APPROACH

  9. I DON’T KNOW

  10. NAIVE
    MACHINE LEARNING
    2ND APPROACH

  12. DATA → (SEND TO) → ML TASK → (CALCULATE) → RESULTS

  13. RECIPE FOR A FAILURE
    DOING WITHOUT KNOWING

  14. DATA ORIENTED
    MACHINE LEARNING
    WORKFLOW
    3RD APPROACH, FINAL

  15. A COMPUTER PROGRAM
    IS SAID TO LEARN FROM EXPERIENCE E
    WITH RESPECT TO SOME CLASS OF TASKS T
    AND PERFORMANCE MEASURE P
    IF ITS PERFORMANCE AT TASKS IN T,
    AS MEASURED BY P,
    IMPROVES WITH EXPERIENCE E

  16. DATA → (PREPARED, INPUT FOR) → ML TASK → (LEARNING, VALIDATING) → RESULTS (WITH PERFORMANCE)
    EXPERIENCE FEEDBACK LOOP

  17. ML TASK
    CLASSIFICATION
    REGRESSION
    CLUSTERING
    DIMENSIONALITY REDUCTION
    ASSOCIATION RULES

  18. EXAMPLE TIME :)

  19. FAST ARTIFICIAL
    NEURAL NETWORK
    CLASSIFICATION

  20. 80 2 4
    2 1
    1 0 0 0
    1 9
    1 0 0 0
    1 8
    1 0 0 0
    9 8
    1 0 0 0
    4 3
    1 0 0 0
    5 8
    1 0 0 0
    5 1
    1 0 0 0
    9 10
    1 0 0 0
    4 7
    1 0 0 0
    5 9
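
Slide 20 appears to be the start of a FANN training file: the header `80 2 4` gives the number of training pairs, inputs and outputs, and the remaining lines alternate between an input pair and its one-hot output. The file name used on the next slide suggests the task is telling coordinate-system quadrants apart. A minimal sketch of how such a file could be generated follows; the class order beyond the first quadrant and the function names are assumptions.

```php
<?php
// One-hot label for the quadrant of a point.
// Assumed class order: (+,+), (-,+), (-,-), (+,-). Points lying on an axis
// are not handled specially and fall into the last class.
function quadrantOneHot(int $x, int $y): array
{
    if ($x > 0 && $y > 0) {
        return [1, 0, 0, 0];
    }
    if ($x < 0 && $y > 0) {
        return [0, 1, 0, 0];
    }
    if ($x < 0 && $y < 0) {
        return [0, 0, 1, 0];
    }
    return [0, 0, 0, 1];
}

// Build the file body: a "pairs inputs outputs" header, then alternating
// input and output lines.
function buildTrainingData(array $points): string
{
    $lines = [count($points) . ' 2 4'];
    foreach ($points as [$x, $y]) {
        $lines[] = $x . ' ' . $y;
        $lines[] = implode(' ', quadrantOneHot($x, $y));
    }
    return implode("\n", $lines) . "\n";
}

echo buildTrainingData([[2, 1], [-3, 4], [-5, -6], [7, -8]]);
```

The returned string can be written to disk with `file_put_contents()` and passed to `fann_train_on_file()`.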

  21. // Network shape: 2 inputs (x, y), one hidden layer of 4 neurons, 4 outputs.
    $num_input = 2;
    $num_output = 4;
    $num_layers = 3;
    $num_neurons_hidden = 4;
    // Training stops at this error or after $max_epochs, whichever comes first.
    $desired_error = 0.001;
    $max_epochs = 500000;
    $epochs_between_reports = 1000;
    $ann = fann_create_standard($num_layers, $num_input, $num_neurons_hidden, $num_output);
    if ($ann) {
        fann_set_activation_function_hidden($ann, FANN_SIGMOID_SYMMETRIC);
        fann_set_activation_function_output($ann, FANN_SIGMOID_SYMMETRIC);
        $filename = __DIR__ . "/coordination_system.data";
        // Train from the data file and save the trained network on success.
        if (fann_train_on_file($ann, $filename, $max_epochs, $epochs_between_reports, $desired_error)) {
            fann_save($ann, __DIR__ . "/coordination_system.net");
        }
        fann_destroy($ann);
    }

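  22. // Load the network saved by the training script.
    $train_file = __DIR__ . "/coordination_system.net";
    $ann = fann_create_from_file($train_file);
    if ($ann) {
        // The two coordinates come from the command line.
        $input = array($argv[1], $argv[2]);
        // $calc_out holds one value per output neuron.
        $calc_out = fann_run($ann, $input);
        fann_destroy($ann);
    }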
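
The script on slide 22 stops at the raw network output, which is an array with one value per output neuron. A common way to turn that into a class is to pick the strongest output. The sketch below assumes this interpretation; the sample values and the labels are made up for illustration and stand in for a real `fann_run()` result.

```php
<?php
// Return the index of the strongest output neuron (argmax).
function strongestOutput(array $outputs): int
{
    return (int) array_search(max($outputs), $outputs, true);
}

// Made-up values standing in for a fann_run() result.
$calcOut = [0.91, -0.12, 0.03, -0.40];

// Hypothetical labels, assuming the outputs encode quadrants in this order.
$labels = ['quadrant I', 'quadrant II', 'quadrant III', 'quadrant IV'];

echo $labels[strongestOutput($calcOut)] . PHP_EOL;
```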

  23. SUPERVISED
    LEARNING

  24. SUPPORT
    VECTOR MACHINES
    CLASSIFICATION

  25. A SUPPORT VECTOR MACHINE
    PERFORMS CLASSIFICATION
    BY FINDING THE HYPERPLANE
    THAT MAXIMIZES THE MARGIN
    BETWEEN THE GIVEN CLASSES
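
To make "margin" concrete: for a hyperplane w·x + b = 0 and labels of +1 or -1, the margin is the smallest signed distance from any sample to the hyperplane. The sketch below only measures the margin of two hand-picked candidate lines on a toy dataset; it does not train an SVM, and the data and function name are illustrative assumptions.

```php
<?php
// Geometric margin of the hyperplane w·x + b = 0 for labelled samples.
// Labels are +1 or -1; a negative result means a sample is misclassified.
function margin(array $w, float $b, array $samples, array $labels): float
{
    $norm = sqrt(array_sum(array_map(fn ($v) => $v * $v, $w)));
    $smallest = INF;
    foreach ($samples as $i => $x) {
        $score = $b;
        foreach ($w as $j => $wj) {
            $score += $wj * $x[$j];
        }
        $smallest = min($smallest, $labels[$i] * $score / $norm);
    }
    return $smallest;
}

// Toy data: two points per class.
$samples = [[1, 1], [2, 3], [5, 4], [6, 6]];
$labels  = [-1, -1, 1, 1];

echo margin([1, 0], -3.5, $samples, $labels) . PHP_EOL; // vertical line x = 3.5
echo margin([1, 1], -7.0, $samples, $labels) . PHP_EOL; // line x + y = 7
```

Both lines separate the classes, but the first leaves more room around the nearest samples. An SVM searches for the separating hyperplane with the largest such value.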

  26. SUPERVISED
    LEARNING

  27. K-MEANS
    CLUSTERING

  28. IRIS DATASET
    1936, RONALD FISHER

  29. require 'vendor/autoload.php';
    use Phpml\Dataset\Demo\Iris;
    use Phpml\Clustering\KMeans;
    // Load the bundled Iris dataset and ask for three clusters.
    $dataset = new Iris();
    $kmeans = new KMeans(3);
    echo 'Dataset size: ' . count($dataset->getSamples()) . PHP_EOL;
    // Cluster the samples only; the species labels are not used.
    $clusters = $kmeans->cluster($dataset->getSamples());
    foreach ($clusters as $i => $cluster) {
        echo 'Cluster #' . $i . ' :' . count($cluster) . PHP_EOL;
    }

  30. $ php -f ./iris-clustering.php
    Dataset size: 150
    Cluster #0 :39
    Cluster #1 :50
    Cluster #2 :61
    $ php -f ./iris-clustering.php
    Dataset size: 150
    Cluster #0 :38
    Cluster #1 :50
    Cluster #2 :62
    $ php -f ./iris-clustering.php
    Dataset size: 150
    Cluster #0 :39
    Cluster #1 :50
    Cluster #2 :61
    $ php -f ./iris-clustering.php
    Dataset size: 150
    Cluster #0 :96
    Cluster #1 :24
    Cluster #2 :30
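
The four runs above give different cluster sizes, most likely because k-means starts from randomly chosen centroids and can settle in different local optima. One common mitigation is to run the algorithm several times and keep the run with the lowest within-cluster sum of squares. The sketch below is a plain-PHP k-means written for illustration; it is not the php-ml implementation, and the toy points and seeds are assumptions.

```php
<?php
// Minimal k-means for 2-D points. Initial centroids are random samples,
// so the outcome depends on the seed.
function kmeans(array $points, int $k, int $seed, int $maxIterations = 100): array
{
    mt_srand($seed);
    $centroids = [];
    foreach ((array) array_rand($points, $k) as $key) {
        $centroids[] = $points[$key];
    }

    $assignment = [];
    for ($iteration = 0; $iteration < $maxIterations; $iteration++) {
        // Assignment step: attach every point to its nearest centroid.
        $next = [];
        foreach ($points as $i => [$x, $y]) {
            $bestDistance = INF;
            foreach ($centroids as $c => [$cx, $cy]) {
                $distance = ($x - $cx) ** 2 + ($y - $cy) ** 2;
                if ($distance < $bestDistance) {
                    $bestDistance = $distance;
                    $next[$i] = $c;
                }
            }
        }
        if ($next === $assignment) {
            break; // converged: no point changed cluster
        }
        $assignment = $next;

        // Update step: move each centroid to the mean of its members.
        foreach (array_keys($centroids) as $c) {
            $members = array_keys($assignment, $c, true);
            if ($members === []) {
                continue; // an empty cluster keeps its previous centroid
            }
            $sumX = $sumY = 0.0;
            foreach ($members as $i) {
                $sumX += $points[$i][0];
                $sumY += $points[$i][1];
            }
            $centroids[$c] = [$sumX / count($members), $sumY / count($members)];
        }
    }

    // Within-cluster sum of squared distances: lower means tighter clusters.
    $inertia = 0.0;
    foreach ($assignment as $i => $c) {
        $inertia += ($points[$i][0] - $centroids[$c][0]) ** 2
                  + ($points[$i][1] - $centroids[$c][1]) ** 2;
    }
    $sizes = array_values(array_count_values($assignment));
    rsort($sizes);

    return ['sizes' => $sizes, 'inertia' => $inertia];
}

$points = [
    [1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1],
    [9.0, 1.0], [9.1, 1.2], [8.9, 0.9],
];

// Run with several seeds and keep the tightest clustering.
$best = null;
foreach ([1, 2, 3, 4, 5] as $seed) {
    $run = kmeans($points, 3, $seed);
    echo "seed $seed: sizes " . implode('/', $run['sizes'])
        . ', inertia ' . round($run['inertia'], 3) . PHP_EOL;
    if ($best === null || $run['inertia'] < $best['inertia']) {
        $best = $run;
    }
}
```

Fixing the seed makes a single run repeatable; comparing several seeds shows how much the result depends on the starting point.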

  31. RESULTS STABILITY

  32. UNSUPERVISED
    LEARNING

  33. RECIPE FOR A FAILURE
    DON’T YOU KNOW YOUR DATA?

  34. PREDICTING VALUES
    REGRESSION

  35. HOW MANY BRITISH POUNDS… EUROS
    SHOULD I EARN AS A DEVELOPER
    ACCORDING TO MY SKILL SET?

  36. | age | linkedin_php | salary |
    |-----|--------------|--------|
    | 20 | 0 | 2000 |
    | 26 | 8 | 3975 |
    | 30 | 10 | 4000 |

  37. [Chart with axes YEARS and LINKEDIN PHP]

  38. require 'vendor/autoload.php';
    use Phpml\Dataset\ArrayDataset;
    use Phpml\Regression\LeastSquares;
    // Samples are [age, linkedin_php]; targets are salaries.
    $dataset = new ArrayDataset(
        [
            [20, 0],
            [26, 8],
            [30, 10],
        ],
        [
            2000,
            3975,
            4000,
        ]
    );
    $regression = new LeastSquares();
    $regression->train($dataset->getSamples(), $dataset->getTargets());
    // Pass age and linkedin_php as command-line arguments.
    echo $regression->predict(array_slice($argv, 1)) . PHP_EOL;
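
To show what a least-squares fit computes on this data, here is a plain-PHP sketch that solves the normal equations (XᵀX)β = Xᵀy with Gaussian elimination. It is written for illustration and is not the php-ml implementation; the function name is an assumption, and the code assumes the system is not singular.

```php
<?php
// Ordinary least squares via the normal equations, with an intercept
// column of 1s prepended. Assumes X'X is invertible.
function leastSquares(array $samples, array $targets): array
{
    $rows = array_map(fn ($s) => array_merge([1.0], $s), $samples);
    $n = count($rows[0]);

    // Build the augmented matrix [X'X | X'y].
    $a = [];
    for ($i = 0; $i < $n; $i++) {
        for ($j = 0; $j < $n; $j++) {
            $a[$i][$j] = array_sum(array_map(fn ($r) => $r[$i] * $r[$j], $rows));
        }
        $a[$i][$n] = array_sum(array_map(fn ($r, $y) => $r[$i] * $y, $rows, $targets));
    }

    // Forward elimination with partial pivoting.
    for ($col = 0; $col < $n; $col++) {
        $pivot = $col;
        for ($r = $col + 1; $r < $n; $r++) {
            if (abs($a[$r][$col]) > abs($a[$pivot][$col])) {
                $pivot = $r;
            }
        }
        [$a[$col], $a[$pivot]] = [$a[$pivot], $a[$col]];
        for ($r = $col + 1; $r < $n; $r++) {
            $factor = $a[$r][$col] / $a[$col][$col];
            for ($c = $col; $c <= $n; $c++) {
                $a[$r][$c] -= $factor * $a[$col][$c];
            }
        }
    }

    // Back substitution.
    $beta = array_fill(0, $n, 0.0);
    for ($i = $n - 1; $i >= 0; $i--) {
        $sum = $a[$i][$n];
        for ($j = $i + 1; $j < $n; $j++) {
            $sum -= $a[$i][$j] * $beta[$j];
        }
        $beta[$i] = $sum / $a[$i][$i];
    }

    return $beta; // [intercept, one coefficient per feature]
}

// The three rows from the slide: [age, linkedin_php] => salary.
$beta = leastSquares([[20, 0], [26, 8], [30, 10]], [2000, 3975, 4000]);
printf("salary = %.1f + %.1f * age + %.1f * linkedin_php\n", ...$beta);
```

With three rows and three unknowns the fitted plane passes through every row exactly, and the age coefficient comes out negative. That says more about how little data there is than about salaries.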

  39. | age | city_size | linkedin_php | salary |
    |-----|-----------|--------------|--------|
    | 20 | 900000 | 0 | 2000 |
    | 20 | 400000 | 0 | 1800 |
    | 25 | 450000 | 8 | 3700 |
    | 26 | 900000 | 8 | 3975 |
    | 30 | 100000 | 10 | 4000 |
    | 30 | 500000 | 10 | 3500 |
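
In this table `city_size` is several orders of magnitude larger than `age` or `linkedin_php`, which can let it dominate distance-based algorithms. Min-max scaling is one common preparation step; the talk does not say which one was used, so treat the sketch below as an assumption.

```php
<?php
// Min-max scaling: map every column to the range [0, 1].
function minMaxScale(array $samples): array
{
    $columns = count($samples[0]);
    $scaled = $samples;
    for ($c = 0; $c < $columns; $c++) {
        $values = array_column($samples, $c);
        $min = min($values);
        $range = max($values) - $min;
        foreach ($samples as $r => $row) {
            // A constant column carries no information; map it to 0.
            $scaled[$r][$c] = $range == 0 ? 0.0 : ($row[$c] - $min) / $range;
        }
    }
    return $scaled;
}

// Feature columns from the table: age, city_size, linkedin_php.
$samples = [
    [20, 900000, 0],
    [20, 400000, 0],
    [25, 450000, 8],
    [26, 900000, 8],
    [30, 100000, 10],
    [30, 500000, 10],
];

print_r(minMaxScale($samples)[2]);
```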

  40. SUPERVISED
    LEARNING

  41. …JVM, PYTHON

  42. ML IS NOT
    A SINGLE RUN
    OF ALGORITHM

  43. IT’S A PROCESS

  44. ML PROCESS
    DEFINE A PROBLEM
    ANALYZE YOUR DATA
    UNDERSTAND YOUR DATA
    PREPARE DATA FOR ML
    SELECT & RUN ALGO(S)
    TUNE ALGO(S) PARAMETERS
    SELECT FINAL MODEL
    VALIDATE FINAL MODEL
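
The validation step can be sketched as holding back part of the data and measuring the error on it. The example below uses the salaries from slide 39 with a deliberately trivial baseline model (always predict the training mean); the split ratio, seed, and function names are assumptions for illustration.

```php
<?php
// Split rows into [training, test] parts using a seeded shuffle.
function trainTestSplit(array $rows, float $testRatio, int $seed): array
{
    mt_srand($seed);
    shuffle($rows);
    $testSize = (int) round(count($rows) * $testRatio);
    return [array_slice($rows, $testSize), array_slice($rows, 0, $testSize)];
}

// Mean absolute error between predictions and true values.
function meanAbsoluteError(array $predicted, array $actual): float
{
    $errors = array_map(fn ($p, $a) => abs($p - $a), $predicted, $actual);
    return array_sum($errors) / count($errors);
}

// Salaries from the table on slide 39.
$salaries = [2000, 1800, 3700, 3975, 4000, 3500];
[$train, $test] = trainTestSplit($salaries, 0.33, 42);

// Baseline "model": always predict the mean of the training salaries.
$mean = array_sum($train) / count($train);
$mae = meanAbsoluteError(array_fill(0, count($test), $mean), $test);

printf("train=%d test=%d baseline MAE=%.1f\n", count($train), count($test), $mae);
```

Any real model should beat this baseline on the held-out rows; if it does not, the earlier steps of the process need another pass.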

  46. | age | city_size | linkedin_php | salary |
    |-----|-----------|--------------|--------|
    | 20 | 900000 | 0 | 2000 |
    | 20 | 400000 | 0 | 1800 |
    | 25 | 450000 | 8 | 3700 |
    | 26 | 900000 | 8 | 3975 |
    | 30 | 100000 | 10 | 4000 |
    | 30 | 500000 | 10 | 3500 |

  47. | age | city_size | linkedin_php | salary | currency |
    |-----|-----------|--------------|--------|----------|
    | 20 | 900000 | 0 | 2000 | EUR |
    | 20 | 400000 | 0 | 1800 | USD |
    | 25 | 450000 | 8 | 3700 | USD |
    | 26 | 900000 | 8 | 3975 | USD |
    | 30 | 100000 | 10 | 4000 | USD |
    | 30 | 500000 | 10 | 3500 | USD |
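
The extra `currency` column shows that the salaries in the earlier table were not comparable as they stood. One way to prepare such data is to convert every salary to a single currency before training. The sketch below assumes this approach; the exchange rate is a placeholder, not a real rate, and the function name is hypothetical.

```php
<?php
// Convert every salary to EUR so the target column is comparable.
function normalizeSalaries(array $rows, array $ratesToEur): array
{
    return array_map(function (array $row) use ($ratesToEur) {
        if (!isset($ratesToEur[$row['currency']])) {
            throw new InvalidArgumentException('Unknown currency: ' . $row['currency']);
        }
        $row['salary'] = $row['salary'] * $ratesToEur[$row['currency']];
        $row['currency'] = 'EUR';
        return $row;
    }, $rows);
}

// First two rows of the table.
$rows = [
    ['age' => 20, 'city_size' => 900000, 'linkedin_php' => 0, 'salary' => 2000, 'currency' => 'EUR'],
    ['age' => 20, 'city_size' => 400000, 'linkedin_php' => 0, 'salary' => 1800, 'currency' => 'USD'],
];

// Placeholder rates for illustration only.
$rates = ['EUR' => 1.0, 'USD' => 0.9];

print_r(normalizeSalaries($rows, $rates));
```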

  48. ONE MORE THING…

  49. PHPCON CFP WILL BE
    CLOSED TOMORROW!
    http://phpcon.pl/2016/en/cfp

  50. THANKS
    mariuszgil
    HAPPY LEARNING YOUR MACHINES!
