Deep Learning Labs #7 Warsaw

Deep Learning Labs #7 in Warsaw. An initiative by Nextgrid.ai to advance the understanding of, and the possibilities with, deep learning and reinforcement learning technologies.

Mathias Åsberg

February 22, 2020

Transcript

  3. Video link: https://www.youtube.com/watch?v=JHX87iv8YJA

  4. Who are we and what do we do?

  7. Deep Learning Labs / Warsaw
    Season #01 Episode #07

  8. Reinforcement Learning (RL) Basics
    By Misha Zanka

  9. What is RL?

  10. Policy
    A policy is an agent's strategy.
    https://towardsdatascience.com/self-learning-ai-agents-iv-stochastic-policy-gradients-b53f088fce20
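
A minimal sketch of what "strategy" means in code: a policy maps an observation to an action, either deterministically or by sampling. The function names, the weight matrix, and the leaning-pole heuristic are illustrative assumptions, not part of the talk; the observation layout assumed is CartPole's (cart position, cart velocity, pole angle, pole angular velocity).

```python
import math
import random


def deterministic_policy(obs):
    """Hand-written heuristic: push toward the side the pole leans (0 = left, 1 = right)."""
    pole_angle = obs[2]
    return 1 if pole_angle > 0 else 0


def stochastic_policy(obs, weights, rng=random):
    """Sample an action from a softmax over linear action preferences.

    `weights` holds one row of coefficients per action (hypothetical parameters
    that a policy-gradient method would learn).
    """
    prefs = [sum(w * x for w, x in zip(row, obs)) for row in weights]
    top = max(prefs)  # subtract the maximum for numerical stability
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    probs = [e / total for e in exps]
    action = rng.choices(range(len(probs)), weights=probs)[0]
    return action, probs
```

A deterministic policy always returns the same action for the same observation; a stochastic one returns a probability distribution over actions and samples from it.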

  11. Stable-baselines
    ● Stable Baselines is a set of improved implementations of Reinforcement Learning (RL) algorithms based on OpenAI Baselines.
    ● Its main feature is a unified interface for all models.
    ● You are free to use other frameworks, but this one is the most user-friendly.
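
A rough sketch of what the unified interface buys you: the same helper can train and save any algorithm class, because each one is constructed, trained, and saved the same way. The helper name `train_and_save`, the environment id, the file names, and the timestep budget are assumptions for illustration; `demo()` assumes the `stable_baselines` and `gym` packages are installed.

```python
def train_and_save(algo_cls, env, path, timesteps=10_000):
    """Train any Stable Baselines algorithm class on `env` and save it to `path`."""
    model = algo_cls("MlpPolicy", env)      # same constructor shape for every algorithm
    model.learn(total_timesteps=timesteps)  # same training call
    model.save(path)                        # recent versions write a .zip archive
    return model


def demo():
    # Imports are deferred so the helper above can be read and tested
    # without TensorFlow installed.
    import gym
    from stable_baselines import A2C, PPO2

    for algo_cls in (A2C, PPO2):
        env = gym.make("CartPole-v1")
        model = train_and_save(algo_cls, env, "cartpole_" + algo_cls.__name__.lower())
        obs = env.reset()
        action, _state = model.predict(obs, deterministic=True)
        print(algo_cls.__name__, "first action:", action)
```

Swapping algorithms then means changing only the class passed in, which makes it cheap to try several on the same task.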

  12. Grading
    Today's ranking includes the following tasks:
    ● CartPole - 1 pt.
    ● LunarLander - 4 pt.
    ● Hopper - 6 pt.
    ● HalfCheetah - 12 pt.
    ● BipedalWalker - 24 pt.
    A team earns an extra 20% of a task's points if it does something special, such as a good presentation with insights or a non-standard solution to the problem.
    We will maintain the leaderboard on our page.
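
As a worked example of the arithmetic, here is a small sketch of how a team's total could be computed. It assumes the 20% bonus is applied per task, on top of that task's base points; the function name `team_score` and the exact bonus rule are assumptions, since the slide does not spell them out.

```python
from fractions import Fraction

# Points per task, as listed on the slide.
TASK_POINTS = {
    "CartPole": 1,
    "LunarLander": 4,
    "Hopper": 6,
    "HalfCheetah": 12,
    "BipedalWalker": 24,
}
BONUS = Fraction(20, 100)  # the extra 20%, kept exact to avoid float rounding


def team_score(solved, special=()):
    """Sum points for solved tasks, adding the bonus to tasks marked as special."""
    total = Fraction(0)
    for task in solved:
        points = TASK_POINTS[task]
        if task in special:
            points += points * BONUS
        total += points
    return total


# CartPole (1) plus LunarLander with bonus (4 + 0.8)
print(float(team_score(["CartPole", "LunarLander"], special=["LunarLander"])))  # 5.8
```

Under this reading, solving every task yields 47 points, or 56.4 if every task also earns the bonus.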

  13. Submission
    You will receive a link to a Google Form, where you need to submit:
    ● If you used stable-baselines: a .zip with the model and the name of the algorithm used.
    ● If you used something else: the trained model and instructions for extracting actions from your policy.
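
"Extracting actions" can be read as: give us a function from observation to action. The sketch below is one possible shape for such instructions, not the organizers' evaluation script. It assumes the classic Gym API (`reset()` returns an observation, `step()` returns a 4-tuple); `run_episode` and `as_act` are hypothetical helper names.

```python
def run_episode(env, act, max_steps=1000):
    """Run one episode and return the total reward.

    `act` is any callable that maps an observation to an action.
    """
    obs = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        obs, reward, done, _info = env.step(act(obs))
        total_reward += reward
        if done:
            break
    return total_reward


def as_act(model):
    """Adapt a Stable Baselines style model, whose predict() returns (action, state)."""
    def act(obs):
        action, _state = model.predict(obs, deterministic=True)
        return action
    return act
```

A model from another framework only needs its own small adapter in place of `as_act`, which is the kind of instruction the second bullet asks for.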

  14. A template for submitting results will be available.
