Slide 1

Slide 1 text

INTRODUCTION TO REINFORCEMENT LEARNING WITH TF-AGENTS & TENSORFLOW 2.0
Oliver Zeigermann, embarc
Christian Hidber, bSquare

Slide 2

Slide 2 text

Orso's World

Slide 3

Slide 3 text

THE TASK
Meadow: -50
Forest: -100
Mountain: -200
Water: -300
Honey: +1000

Slide 4

Slide 4 text

CLEVER STRATEGY
Meadow: -50
Forest: -100
Mountain: -200
Water: -300
Honey: +1000
Step labels on the map: -100, -100, -100, -100, -100
Tally: +1000, -600

Slide 5

Slide 5 text

NOT SO CLEVER STRATEGY
Meadow: -50
Forest: -100
Mountain: -200
Water: -300
Honey: +1000
Step labels on the map: -100, -100, -200, -300
Tally: +1000, -400

Slide 6

Slide 6 text

GOAL: FIND A GOOD STRATEGY FOR ORSO
Meadow: -50
Forest: -100
Mountain: -200
Water: -300
Honey: +1000
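To make "good strategy" concrete, the sketch below scores a route by summing the terrain rewards listed on the slide. It assumes Orso collects the reward of each cell he enters; the two routes are hypothetical and are not the ones drawn on the slides.

```python
# Terrain rewards as listed on the slide.
REWARDS = {
    "meadow": -50,
    "forest": -100,
    "mountain": -200,
    "water": -300,
    "honey": 1000,
}


def total_reward(route):
    """Sum the rewards of every terrain Orso enters along a route."""
    return sum(REWARDS[terrain] for terrain in route)


# Hypothetical routes (assumptions for illustration only).
detour_through_meadows = ["meadow"] * 4 + ["honey"]
shortcut_over_water = ["water", "mountain", "honey"]

print(total_reward(detour_through_meadows))
print(total_reward(shortcut_over_water))
```

Under these assumptions the longer route over cheap terrain beats the shorter route over expensive terrain, which is the kind of trade-off a strategy has to weigh.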

Slide 7

Slide 7 text

CHOOSING AN ACTION
Observation (game state)
Policy (gaming strategy)
Action: ?

Slide 8

Slide 8 text

CHOOSING AN ACTION
Observation (game state)
Policy (gaming strategy)
Action

Slide 9

Slide 9 text

CHOOSING AN ACTION
Observation (game state)
Policy (gaming strategy)
Action

Slide 10

Slide 10 text

CHOOSING AN ACTION
Observation (game state)
Policy (gaming strategy)
Action
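A policy can be read as a function from an observation to an action. The sketch below is a hand-written greedy policy, not a learned one: it assumes the observation is a mapping from direction to the reward of the next step, and the direction names and values are hypothetical.

```python
from typing import Dict


def greedy_policy(next_step_rewards: Dict[str, int]) -> str:
    """Pick the direction whose immediate reward is highest."""
    return max(next_step_rewards, key=lambda d: next_step_rewards[d])


# Hypothetical observation: reward of the cell in each direction.
observation = {"north": -100, "east": -50, "south": -300, "west": -200}
action = greedy_policy(observation)
print(action)
```

A greedy rule like this only looks one step ahead, so it can walk away from the honey; a learned policy is meant to do better than that.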

Slide 11

Slide 11 text

REINFORCEMENT LEARNING APPROACH
Environment (game engine)
Agent (algorithm)
Policy (gaming strategy)

Slide 12

Slide 12 text

REINFORCEMENT LEARNING APPROACH
Environment (game engine)
Agent (algorithm)
Policy (gaming strategy)

Slide 13

Slide 13 text

REINFORCEMENT LEARNING APPROACH
Environment (game engine)
Agent (algorithm)
Policy (gaming strategy)
Play, Measure, Update
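The play, measure, update cycle can be sketched on a deliberately tiny problem. The example below uses a softmax policy with a running-mean baseline (the gradient bandit update from standard RL textbooks), not any TF-Agents algorithm. The one-step game with two actions is a hypothetical setup; only the water and meadow rewards come from the slides.

```python
import math
import random

# Hypothetical one-step game: action 0 wades through water, action 1 crosses a meadow.
REWARDS = [-300.0, -50.0]


def softmax(prefs):
    """Turn action preferences into action probabilities."""
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(steps=1000, learning_rate=0.01, seed=0):
    rng = random.Random(seed)
    prefs = [0.0, 0.0]   # the policy's parameters
    mean_reward = 0.0    # running average of rewards, used as baseline
    for step in range(1, steps + 1):
        probs = softmax(prefs)
        # Play: sample an action from the current policy.
        action = rng.choices([0, 1], weights=probs)[0]
        # Measure: observe the reward and compare it with the average so far.
        reward = REWARDS[action]
        mean_reward += (reward - mean_reward) / step
        advantage = reward - mean_reward
        # Update: shift preferences toward actions that beat the average.
        for a in range(len(prefs)):
            indicator = 1.0 if a == action else 0.0
            prefs[a] += learning_rate * advantage * (indicator - probs[a])
    return softmax(prefs)


final_probs = train()
print(final_probs)
```

After training, the policy should put most of its probability on the meadow action. Real tasks differ mainly in that an episode has many steps and the policy is a neural network.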

Slide 14

Slide 14 text

IMPLEMENTATION WITH TF–AGENTS

Slide 15

Slide 15 text

TF-AGENTS & TF 2.0
Environment: OpenAI Gym
• Step
• Reset
RL algorithm: TF-Agents
• Reinforce
• DQN
• PPO
• SAC
• …
Policy: TF 2.0 / Keras
Reward, Observation

Slide 16

Slide 16 text

DEMO: ORSO ON TF-AGENTS

Slide 17

Slide 17 text

ORSO'S ENVIRONMENT (GAME ENGINE)
Environment: graph world logic, honey pot placement
Rewards: steps, honey pot
Observation: Orso's position, next step rewards, position rewards
Actions: directions
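The environment described above can be sketched as a small class that follows the classic Gym convention: `reset()` returns the first observation and `step(action)` returns observation, reward, done flag and an info dict. This is a plain-Python illustration that does not import Gym or TF-Agents; the graph, node names and class name `OrsoEnv` are hypothetical.

```python
# Terrain rewards from the slides.
REWARDS = {"meadow": -50, "forest": -100, "mountain": -200, "water": -300, "honey": 1000}

# Hypothetical graph: node -> {direction: (next node, terrain of next node)}.
GRAPH = {
    "start": {"east": ("hill", "mountain"), "south": ("field", "meadow")},
    "hill": {"south": ("pot", "honey")},
    "field": {"east": ("pot", "honey")},
    "pot": {},
}


class OrsoEnv:
    """Gym-like environment: reset() starts an episode, step() applies one action."""

    def reset(self):
        self.position = "start"
        return self._observation()

    def step(self, direction):
        next_node, terrain = GRAPH[self.position][direction]
        self.position = next_node
        reward = REWARDS[terrain]
        done = terrain == "honey"  # the episode ends at the honey pot
        return self._observation(), reward, done, {}

    def _observation(self):
        # Orso's position plus the reward of each possible next step.
        next_rewards = {d: REWARDS[t] for d, (_, t) in GRAPH[self.position].items()}
        return self.position, next_rewards


env = OrsoEnv()
obs = env.reset()
total = 0
for direction in ["south", "east"]:
    obs, reward, done, _ = env.step(direction)
    total += reward
print(total, done)
```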

Slide 18

Slide 18 text

POLICY: FROM OBSERVATION TO ACTION
Input layer: x1, x2, ..., xn
Hidden layer
Output layer
Softmax: P1, P2, P3, P4
Policy, Collector, Buffer, Algorithm, Environment
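The network on this slide maps an observation vector x1..xn to four action probabilities P1..P4. Below is a minimal forward pass in plain Python rather than Keras. The layer sizes, the ReLU activation in the hidden layer and the random weights are assumptions made for illustration.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(logits):
    """Normalize logits into probabilities that sum to 1."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(rng, n_out, n_in):
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
n_inputs, n_hidden, n_actions = 6, 8, 4  # sizes are assumptions
w1, b1 = random_layer(rng, n_hidden, n_inputs)
w2, b2 = random_layer(rng, n_actions, n_hidden)


def policy(observation):
    """Input layer -> hidden layer -> output layer -> softmax."""
    hidden = relu(dense(observation, w1, b1))
    return softmax(dense(hidden, w2, b2))


probs = policy([0.5, -1.0, 0.0, 0.3, 1.0, -0.2])
# The action is sampled from the probabilities rather than always taking the maximum.
action = rng.choices(range(n_actions), weights=probs)[0]
print(probs, action)
```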

Slide 19

Slide 19 text

TF-AGENTS & TF 2.0
Policy: TF 2.0 / Keras
Environment, Collector, Buffer, Algorithm, Backprop
Play, Learn
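The play side of this diagram can be sketched as a collector that runs the policy in the environment and stores each transition in a buffer, from which the algorithm later draws data for learning. The code below is a plain-Python illustration and not the TF-Agents API; `Transition`, `ReplayBuffer`, `collect` and the corridor game are hypothetical names and assumptions.

```python
import random
from collections import deque, namedtuple

Transition = namedtuple(
    "Transition", "observation action reward next_observation done")


class ReplayBuffer:
    """Fixed-size store of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity):
        self._items = deque(maxlen=capacity)

    def add(self, transition):
        self._items.append(transition)

    def sample(self, batch_size, rng):
        return rng.sample(list(self._items), batch_size)

    def __len__(self):
        return len(self._items)


def collect(step_fn, policy_fn, start, buffer, max_steps):
    """Play one episode with the policy and store every transition."""
    observation = start
    for _ in range(max_steps):
        action = policy_fn(observation)
        next_observation, reward, done = step_fn(observation, action)
        buffer.add(Transition(observation, action, reward, next_observation, done))
        if done:
            break
        observation = next_observation


# Hypothetical corridor: positions 0..3, honey at position 3, other steps cost 50.
def corridor_step(position, action):
    next_position = max(0, min(3, position + action))
    done = next_position == 3
    return next_position, (1000 if done else -50), done


rng = random.Random(0)
buffer = ReplayBuffer(capacity=100)
collect(corridor_step, lambda obs: 1, 0, buffer, max_steps=10)  # always move right
batch = buffer.sample(2, rng)  # the learning step would consume batches like this
print(len(buffer), batch)
```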

Slide 20

Slide 20 text

PRACTITIONERS VIEW

Slide 21

Slide 21 text

PRACTITIONERS VIEW
PPO (outline)
DQN (outline)

Slide 22

Slide 22 text

PRACTITIONERS VIEW
PPO (outline)
DQN (outline)

Slide 23

Slide 23 text

PRACTITIONERS VIEW

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

IMPLEMENTING REINFORCEMENT LEARNING
Environment
RL algorithm
Policy
Reward, Observation

Slide 26

Slide 26 text

DISCLAIMER: EASYAGENTS IS DEVELOPED BY THE SPEAKERS.

Slide 27

Slide 27 text

DEMO: ORSO ON EASYAGENTS
DISCLAIMER: EASYAGENTS IS DEVELOPED BY THE SPEAKERS.

Slide 28

Slide 28 text

REINFORCEMENT LEARNING AS A SERVICE

Slide 29

Slide 29 text

YOUR SPEAKERS

CHRISTIAN HIDBER
[email protected]
+41 44 260 54 00
https://www.linkedin.com/in/christian-hidber/
https://github.com/christianhidber

OLIVER ZEIGERMANN
[email protected]
@DJCordhose
https://www.linkedin.com/in/oliver-zeigermann-34989773
https://github.com/DJCordhose

Slide 30

Slide 30 text

THANK YOU