Gradient Descent
Uses partial derivatives to obtain a gradient for the weights and biases
Takes small steps down the gradient (in the negative gradient direction) to minimize the error
Dino Ratcliffe Neural Networks in action
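As a rough illustration of the idea on this slide, the sketch below fits a single weight and bias to a few points by repeatedly stepping against the gradient of the mean squared error. The toy data, the function names, the learning rate and the step count are all assumptions chosen for the example; they do not come from the slides.

```python
# Minimal sketch: fit y = w*x + b by gradient descent on mean squared error.
# Data, learning rate, and step count are illustrative assumptions.

def gradients(w, b, xs, ys):
    """Partial derivatives of the mean squared error with respect to w and b."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db


def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    w, b = 0.0, 0.0
    for _ in range(steps):
        dw, db = gradients(w, b, xs, ys)
        # Step against the gradient so the error decreases.
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1
w, b = gradient_descent(xs, ys)
print(w, b)
```

With a learning rate this small the updates settle close to the weight and bias that generated the data; a much larger rate would make the steps overshoot and diverge.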
Institute of Standards and Technology)
Handwritten digit dataset
28x28 images
60,000 training images
10,000 test images
Figure 3: MNIST Examples
ckpt = tf.train.get_checkpoint_state("dir/to/save")
if ckpt and ckpt.model_checkpoint_path:
    # Restore the most recent checkpoint into the session
    saver.restore(sess, ckpt.model_checkpoint_path)
    print("loaded model at: dir/to/save")
else:
    print("No model found at: dir/to/save")
state
a: action taken from the initial state
R: reward received after taking action a
S_{t+1}: resulting state after taking action a
t: whether the initial state is terminal
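One way to hold the five fields listed above is a small named tuple. This is only a sketch: the class name Experience, the field names and the example values are hypothetical and not taken from the slides.

```python
from typing import NamedTuple, Tuple


class Experience(NamedTuple):
    """One transition: (state, action, reward, next state, terminal flag)."""
    state: Tuple[float, ...]       # initial state
    action: int                    # a: action taken from the initial state
    reward: float                  # R: reward received after taking the action
    next_state: Tuple[float, ...]  # resulting state after taking the action
    terminal: bool                 # t: terminal flag


exp = Experience(state=(0.0, 1.0), action=1, reward=0.5,
                 next_state=(0.1, 0.9), terminal=False)

# Fields can be read by name or unpacked in order.
state, action, reward, next_state, terminal = exp
```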
queue of experiences
Train on a sample of experiences from the buffer at each time step
Breaks up the smoothness of the environment (consecutive experiences are highly correlated), aiding convergence
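The buffer described above might be sketched as a fixed-size queue with uniform random sampling. The class name ReplayBuffer, its methods, the capacity and the placeholder experiences are assumptions made for illustration; the slides do not specify an implementation.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size queue of experiences with uniform random sampling."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest experience once it is full.
        self.buffer = deque(maxlen=capacity)

    def add(self, experience):
        self.buffer.append(experience)

    def sample(self, batch_size):
        # Sample without replacement, capped at the number stored so far.
        size = min(batch_size, len(self.buffer))
        return random.sample(list(self.buffer), size)

    def __len__(self):
        return len(self.buffer)


buffer = ReplayBuffer(capacity=3)
for step in range(5):
    # Placeholder experience: (state, action, reward, next_state, terminal)
    buffer.add((step, 0, 0.0, step + 1, False))

batch = buffer.sample(2)
```

Because each training batch is drawn at random from the whole buffer, neighbouring time steps rarely land in the same batch, which is the decorrelating effect the slide refers to.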