The AI Revolution will not be Monopolized

Who's going to "win at AI"? There are now several large companies eager to claim that title. Others say that China will take over, leaving Europe and the US far behind. But short of true Artificial General Intelligence, there's no reason to believe that machine learning or data science will have a single winner. Instead, AI will follow the same trajectory as other technologies for building software: lots of developers, a rich ecosystem, many failed projects and a few shining success stories.

Ines Montani

November 22, 2018

Transcript

  1. The AI Revolution
    will not be Monopolized
    Ines Montani
    Explosion AI

  2. Open-source library for
    industrial-strength Natural
    Language Processing in Python

  3. Open-source library for
    industrial-strength Natural
    Language Processing in Python
    Company and digital
    studio, bootstrapped
    with consulting

  4. Open-source library for
    industrial-strength Natural
    Language Processing in Python
    Company and digital
    studio, bootstrapped
    with consulting
    First commercial product:
    radically efficient data collection
    and annotation tool, powered
    by active learning

  5. Open-source library for
    industrial-strength Natural
    Language Processing in Python
    Company and digital
    studio, bootstrapped
    with consulting
    First commercial product:
    radically efficient data collection
    and annotation tool, powered
    by active learning
    You are here!

  6. Open-source library for
    industrial-strength Natural
    Language Processing in Python
    Company and digital
    studio, bootstrapped
    with consulting
    First commercial product:
    radically efficient data collection
    and annotation tool, powered
    by active learning
    Extension platform with a SaaS
    layer to help users scale up
    annotation projects
    SCALE
    You are here!

  7. Open-source library for
    industrial-strength Natural
    Language Processing in Python
    Company and digital
    studio, bootstrapped
    with consulting
    First commercial product:
    radically efficient data collection
    and annotation tool, powered
    by active learning
    Coming soon: pre-trained,
    customisable models for a variety
    of languages and domains
    You are here!
    Extension platform with a SaaS
    layer to help users scale up
    annotation projects
    SCALE

  8. The concept of an “AI race”
    is hopelessly confused.

  9. Implications of an “AI race”
    races are competitive
    races have winners and losers
    races have a start and an end

  10. What do we mean by AI?

  11. What do we mean by AI?
    Specific consumer products?

  12. What do we mean by AI?
    Specific consumer products?
    “Artificial General Intelligence”?

  13. What do we mean by AI?
    Specific consumer products?
    “Artificial General Intelligence”?
    A technocratic dictatorship?

  14. What do we mean by AI?
    Specific consumer products?
    “Artificial General Intelligence”?
    A technocratic dictatorship?
    A robot army?

  15. What do we mean by AI?
    Specific consumer products?
    “Artificial General Intelligence”?
    A technocratic dictatorship?
    A robot army?
    Machine Learning research?

  16. Monopolizing a new
    product category?

  17. Monopolizing a new
    product category?
    Lots of new products will use machine learning
    Many will be monopolized. By who? Fair question!
    If different companies make all these products,
    who “won” at AI?

  18. Being the first to develop
    “Artificial General Intelligence”?

  19. Being the first to develop
    “Artificial General Intelligence”?
    Very hard to extrapolate from here to proper AGI
    If someone develops AGI, will it even matter who?

  20. Being the first to develop
    “Artificial General Intelligence”?
    Whatever you believe about AGI...
    “AGI is science fiction”
    Okay, so there’s nothing to win
    “AGI is an existential threat”
    Okay, so nobody will win
    “AGI will solve all our problems”
    Okay, so everybody wins?

  21. Using new technology
    to oppress people?

  22. Using new technology
    to oppress people?
    Oppression is a risk we should talk about in AI
    But hardly a race we want to win!

  23. Winning a literal arms race?

  24. Winning a literal arms race?
    Government research will follow, not lead
    If everyone publishes openly, nobody will pull far ahead

  25. Publishing the most
    machine learning research?

  26. Publishing the most
    machine learning research?
    Open research is collaborative, not competitive
    If research is published, everyone wins

  27. Why don’t companies
    like Google keep their
    research secret?

  28. Reasons companies publish
    Attract talent. You can’t get the best
    researchers if you don’t let them publish
    Inevitability. Trying to lock down the secrets
    wouldn’t work anyway
    Leverage. A higher “AI waterline” is good for
    their business

  29. Nobody “dominates” machine learning research
    #   Company / Institution             Total Papers   %
    1   Google                            60             8.8
    2   Carnegie Mellon University        48             7.1
    3   Mass. Institute of Technology     43             6.3
    4   Microsoft                         40             5.9
    5   Stanford University               39             5.7
    6   University of CA, Berkeley        35             5.2
    7   Deepmind                          31             4.6
    8   University of Oxford              22             3.2
    9   University of Illinois            20             2.9
    10  Georgia Institute of Technology   18             2.7
    Source: NIPS Accepted Papers Stats by Robbie Allen (Medium)

  30. Original source: gpo.gov

  31. But what about the data?
    Reusing data is like reusing code. If it doesn’t
    do what you want, it’s not very useful.
    Personal data matters when it’s about you
    personally.
    General knowledge is easy to acquire. It
    doesn’t need unique proprietary datasets.

  32. Data alone won’t grant
    anyone a monopoly
    Data has diminishing returns
    Making your dataset 10× bigger doesn’t
    make it 10× better
    It’s just not that expensive! What datasets cost
    more than a manufacturing plant?

  33. We won’t all be buying
    “AI” from the AI Store.


  34. Companies are in-housing
    Machine learning is software development,
    it needs to evolve with the project
    Nobody has a monopoly on AI expertise,
    people are learning quickly
    Owning and controlling the data is crucial for
    many applications

  35. Vendors can provide
    products, not magic
    The challenge is taking what’s theoretically
    possible and applying it to a problem
    What matters is making the right decisions for
    the larger application


  36. Buyers aren’t old and stupid
    The tech-illiterate management stereotype is outdated
    Most companies let developers choose their tools
    Developers strongly prefer open technologies:
    more flexible, better career growth

  37. Rich open-source ecosystem
    Library        GitHub URL                  GitHub Stars
    TensorFlow     tensorflow/tensorflow       115k
    scikit-learn   scikit-learn/scikit-learn   32k
    PyTorch        pytorch/pytorch             21k
    XGBoost        dmlc/xgboost                14k
    spaCy          explosion/spacy             11k
    Gensim         RaRe-Technologies/gensim    8k
    NLTK           nltk/nltk                   7k
    Source: GitHub (November 2018)

  38. So, who’s going to
    win at AI then?

  39. ✊ The AI revolution will not
    be monopolized
    There’s no single “AI race” – lots of people are
    building lots of things
    There’s no magic solution waiting to be
    discovered
    Nothing about machine learning suggests a
    monopoly or a winner-takes-all market

  40. Thanks!
    Explosion AI
    explosion.ai
    Follow us on Twitter
    @_inesmontani
    @explosion_ai
