Why shouldn't you use AI in your business

Agenda:
- An overview of AI applications in various areas.
- 5 fundamental questions an ML system may try to answer.
- Defining a problem to make it solvable with AI.

The slides were presented at CodeteCON LBN#2.

Kacper Łukawski

April 12, 2018
Transcript

  1. 3.

    The most typical reasons for applying ML
    ✓ just because it exists and everybody does it
    ✓ availability of data
    ✓ a well-described problem that we don’t know an algorithm for, or for which an algorithmic approach is not performant enough, and where general correctness is not necessary
    [Diagram: input, output, algorithm]
  2. 4.

    Current AI applications
    ✓ self-driving cars
    ✓ simultaneous translation
    ✓ fraud detection
    ✓ medical diagnosis
    ✓ ...and many more
  3. 5.

    Current AI applications
    It’s still narrow AI only. We are far away from General Artificial Intelligence.
    ✓ self-driving cars
    ✓ simultaneous translation
    ✓ fraud detection
    ✓ medical diagnosis
    ✓ ...and many more
  4. 7.

    Is this A or B? Typically, we have a dataset containing some observations (e.g. images) and a limited set of categories. Every observation belongs to exactly one of these categories.
  5. 8.

    Is this A or B? Images are a common subject of such questions. What’s in the picture? Is that a cat, a dog, or something else?
  6. 9.

    Is this A or B? This type of question is related to classification algorithms. Such an algorithm chooses the most probable category for a given observation. Most classification methods require a so-called training dataset containing many observations from all the categories. These examples are used to find a generalization of each category, and the generalized categories may in turn be used to label observations our model hasn’t seen before.
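The idea of generalizing each category from training examples can be sketched with a nearest-centroid classifier. This is only one simple stand-in for classification, not the method used in the talk; the features (weight in kg, height in cm), the data points and the function names `fit_centroids` and `classify` are all made up for illustration.

```python
from math import dist

# Hypothetical training dataset: (weight in kg, height in cm) -> label.
TRAINING = [
    ((4.0, 24.0), "cat"),
    ((5.0, 26.0), "cat"),
    ((3.5, 23.0), "cat"),
    ((20.0, 55.0), "dog"),
    ((30.0, 60.0), "dog"),
    ((25.0, 58.0), "dog"),
]


def fit_centroids(samples):
    """Generalize each category to the mean of its training observations."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(total / counts[label] for total in acc)
        for label, acc in sums.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is closest to the observation."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = fit_centroids(TRAINING)
# Two observations the model has not seen during training.
print(classify(centroids, (4.5, 25.0)))
print(classify(centroids, (28.0, 57.0)))
```

Real image classifiers work on far richer features than two numbers, but the train-then-label flow is the same.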
  7. 10.

    Is this weird? For this family of problems, we usually have a dataset of collected observations and want to detect anomalies in it. The underlying assumption is that there is some expected behaviour, and we want to detect any unusual pattern.
  8. 11.

    Is this weird? Imagine we have a history of card transactions for a particular person.

    - Amount: 30.00 PLN, Date: 2018-01-03 11:31 a.m., Location: Cracow, Poland
    - Amount: 30.00 PLN, Date: 2018-01-04 11:45 a.m., Location: Cracow, Poland
    - Amount: 15.00 EUR, Date: 2018-01-04 05:24 p.m., Location: Online
    - Amount: 36.00 PLN, Date: 2018-01-06 12:54 p.m., Location: Warsaw, Poland
    - Amount: 20.30 EUR, Date: 2018-01-09 10:00 a.m., Location: Helsinki, Finland
    - Amount: 15.00 EUR, Date: 2018-01-08 12:32 p.m., Location: Helsinki, Finland
    - Amount: 1000.00 EUR, Date: 2018-01-08 08:03 a.m., Location: Sao Paulo, Brazil
    - Amount: 7.50 EUR, Date: 2018-01-08 07:37 a.m., Location: Helsinki, Finland
  10. 13.

    Is this weird? The question is related to anomaly detection algorithms. Such algorithms try to detect novelty: a pattern that has never occurred before. In other words, the algorithm detects outliers.
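As a rough illustration of outlier detection, the sketch below applies a modified z-score (based on the median absolute deviation) to the EUR amounts from the transaction history above. It deliberately ignores date and location, which a real fraud detector would certainly use. The cutoff of 3.5 is a commonly used convention and is an assumption here, as is the function name `find_outliers`.

```python
from statistics import median


def find_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold."""
    center = median(values)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return []  # no spread to compare against, so the score is undefined
    return [v for v in values if 0.6745 * abs(v - center) / mad > threshold]


# EUR amounts copied from the transaction history above.
eur_amounts = [15.00, 20.30, 15.00, 1000.00, 7.50]
print(find_outliers(eur_amounts))
```

The median is used instead of the mean because a single extreme amount would otherwise shift the baseline it is being compared against.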
  11. 14.

    How much / how many? Asking about a numerical value is also quite common. We no longer have a limited number of categories to choose from, but a continuous space that the output may come from. Commonly, we are given a set of measurements and try to find a pattern that allows us to predict the value in previously unseen conditions.
  12. 15.

    How much / how many? Weather prediction is a good example of such a problem. If we describe it in terms of numerical values, like temperature, humidity, etc., we can easily ask about their values at a particular point in time.
  13. 16.

    How much / how many? This family of questions may be answered with regression algorithms. Their purpose is to predict a numerical value, usually based on historical values observed under different conditions.
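A minimal sketch of regression, assuming a straight-line relationship is good enough: ordinary least squares fitted by hand to a few invented (hour of day, temperature) measurements. The data and the function name `fit_line` are hypothetical; real weather models are far more involved.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept with ordinary least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented historical measurements: hour of day and temperature in °C.
hours = [6, 9, 12, 15]
temps = [2.0, 5.0, 8.5, 11.0]

slope, intercept = fit_line(hours, temps)
# Predict the temperature for an hour that is not in the measurements.
predicted = slope * 18 + intercept
print(round(predicted, 2))
```

Unlike the classification case, the output is a number from a continuous range rather than one of a fixed set of labels.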
  14. 17.

    How is this organized? Suppose we have a dataset of observations we don’t know much about. Since we would like to get an overview of what is inside and understand it a little, we can ask whether the dataset is organized in any way. The difference from the previous questions is that we usually don’t have any labels assigned to the entries of our dataset.
  15. 18.

    How is this organized? The best-known example of such a problem is probably the Iris dataset. It contains examples of three different kinds of irises; each observation is described by sepal and petal width and length.
  16. 19.

    How is this organized? Clustering is a method that can help answer this kind of question. Such algorithms try to divide the dataset into groups, so-called clusters, in which the similarity of observations is higher than between two examples coming from two different groups.
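One way to sketch clustering is k-means, shown here on a handful of invented, iris-like (petal length, petal width) pairs. These are not actual rows of the Iris dataset, and the starting centroids are picked by hand to keep the run deterministic; the function name `k_means` is likewise an assumption.

```python
from math import dist


def k_means(points, initial_centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    centroids = list(initial_centroids)
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if the cluster is empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Invented measurements in cm: (petal length, petal width). No labels given.
points = [(1.4, 0.2), (1.5, 0.2), (1.3, 0.3), (4.7, 1.4), (4.5, 1.5), (4.9, 1.5)]
labels, centroids = k_means(points, [points[0], points[-1]])
print(labels)
```

Note that the algorithm only returns group indices; deciding what each group means is still left to a human.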
  17. 20.

    What should I do next? Sometimes we might want to model an ongoing process in which several small decisions have to be made. It is quite similar to the way our brains work: when we have a goal to achieve, there are usually many different ways to get there, some of which are more effective than others.
  18. 21.

    What should I do next? The most common examples of such problems are video games. Typically, we have a limited set of possible actions at each point in time and need to decide what to do next in order to win the game. At that very moment, we cannot say which action is the best possible one.
  19. 22.

    What should I do next? The basic idea behind reinforcement learning algorithms is to learn from experience through a trial-and-error approach. An ML system based on such an algorithm is punished or rewarded for every performed action, with the goal of maximizing the overall reward.
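The reward-and-punishment loop can be sketched with tabular Q-learning, one of several reinforcement learning algorithms. The environment below is a made-up five-cell corridor, not a game from the talk: every step costs -1, reaching the last cell pays +10, and the learning rate, discount and exploration rate are arbitrary illustrative values.

```python
import random

N_STATES = 5  # corridor cells 0..4; cell 4 is the goal
ACTIONS = ("left", "right")
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # learning rate, discount, exploration


def step(state, action):
    """Move one cell; reward +10 at the goal, otherwise a -1 penalty."""
    nxt = max(0, state - 1) if action == "left" else state + 1
    if nxt == N_STATES - 1:
        return nxt, 10.0, True
    return nxt, -1.0, False


def train(episodes=200, seed=0):
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Trial and error: sometimes explore, otherwise act greedily.
            if rng.random() < EPSILON:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            nxt, reward, done = step(state, action)
            # Nudge the estimate toward reward plus discounted future value.
            target = reward if done else reward + GAMMA * max(q[nxt].values())
            q[state][action] += ALPHA * (target - q[state][action])
            state = nxt
    return q


q_table = train()
# Greedy action for every non-goal cell after training.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

No single step tells the agent which action was best; the preference for moving right emerges only from the accumulated rewards over many episodes.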
  20. 23.

    Summary: 5 fundamental questions of ML
    ✓ Is this A or B? Classification
    ✓ Is this weird? Anomaly detection
    ✓ How much / how many? Regression
    ✓ How is this organized? Clustering
    ✓ What should I do next? Reinforcement learning
  21. 25.

    The direction
    According to the survey conducted by “Business Over Broadway”, one of the top 10 challenges that data professionals faced in the past year is: “The lack of clear question to be answering or a clear direction to go with the available data”.
  22. 26.

    Defining a problem
    ✓ What is the problem you want to solve with Machine Learning?
    ✓ Did you try to use an algorithmic approach? Is there any way to solve your problem using it?
    ✓ Are you able to rephrase the issue to match one of the fundamental questions? Is there anyone who can do it?
    ✓ Do you have the data for training?
    ✓ Are you aware that there is a high risk of failure?
  23. 27.
  24. 28.

    References
    ✓ https://docs.microsoft.com/en-us/azure/machine-learning/studio/data-science-for-beginners-the-5-questions-data-science-answers
    ✓ https://cs.stanford.edu/~acoates/stl10/
    ✓ http://en.ilmatieteenlaitos.fi/past-30-day-weather
    ✓ https://archive.ics.uci.edu/ml/datasets/iris
    ✓ https://medium.com/machine-learning-for-humans/reinforcement-learning-6eacf258b265
    ✓ https://vimeo.com/229966263
    ✓ https://medium.com/@RebelScience/people-ask-me-what-do-you-have-against-deep-learning-43814df3175
    ✓ http://businessoverbroadway.com/top-10-challenges-to-practicing-data-science-at-work