• CartPole - 1 pt.
• LunarLander - 4 pt.
• Hopper - 6 pt.
• HalfCheetah - 12 pt.
• BipedalWalker - 24 pt.
A team earns an extra 20% of a task's points if it does something special, such as a good presentation with insights or a non-standard solution to the problem.
We will maintain the leaderboard on our page, where you will need to send:
• If you used stable-baselines: the .zip file with the model and the name of the algorithm used.
• If you used something else: the trained model and instructions on how to extract actions from your policy.
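To make "how to extract actions from your policy" concrete, here is a minimal sketch of what an evaluation loop might look like. It assumes the submitted model exposes a `predict(obs, deterministic=True)` method returning `(action, state)`, as stable-baselines models do, and that the environment follows the classic Gym API (`reset()` returns an observation, `step()` returns a 4-tuple). `run_episode`, `ToyEnv`, and `ToyPolicy` are hypothetical names used only for illustration; the actual evaluation code may differ.

```python
def run_episode(env, model, max_steps=1000):
    """Roll out one episode and return the total reward.

    Assumes the classic Gym API and a stable-baselines style `predict`.
    """
    obs = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        # Ask the policy for an action; ignore the recurrent state.
        action, _state = model.predict(obs, deterministic=True)
        obs, reward, done, _info = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward


class ToyEnv:
    """Hypothetical stand-in environment: 5 steps, reward 1.0 per step."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0  # dummy observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        return float(self.t), 1.0, done, {}


class ToyPolicy:
    """Hypothetical stand-in policy that always returns action 0."""

    def predict(self, obs, deterministic=True):
        return 0, None


if __name__ == "__main__":
    print(run_episode(ToyEnv(), ToyPolicy()))
```

With stable-baselines, the toy objects would be replaced by a real environment and a loaded model (for example, `PPO2.load` on the submitted .zip file, where the file name and algorithm are whatever the team reports). If your policy does not follow this interface, the instructions you send should explain the equivalent of the `predict` call.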