Slide 20
Training output
8600/33992 (epoch 2), train_loss = 1.209, time/batch = 2.264
8601/33992 (epoch 2), train_loss = 1.189, time/batch = 2.205
8602/33992 (epoch 2), train_loss = 1.198, time/batch = 2.482
8603/33992 (epoch 2), train_loss = 1.276, time/batch = 2.410
8604/33992 (epoch 2), train_loss = 1.213, time/batch = 2.367
8605/33992 (epoch 2), train_loss = 1.193, time/batch = 2.264
8606/33992 (epoch 2), train_loss = 1.218, time/batch = 2.291
8607/33992 (epoch 2), train_loss = 1.208, time/batch = 2.323
8608/33992 (epoch 2), train_loss = 1.195, time/batch = 2.336
8609/33992 (epoch 2), train_loss = 1.156, time/batch = 2.378
8610/33992 (epoch 2), train_loss = 1.236, time/batch = 2.468
8611/33992 (epoch 2), train_loss = 1.193, time/batch = 2.214
8612/33992 (epoch 2), train_loss = 1.222, time/batch = 2.368
8613/33992 (epoch 2), train_loss = 1.241, time/batch = 2.595
8614/33992 (epoch 2), train_loss = 1.208, time/batch = 2.730
8615/33992 (epoch 2), train_loss = 1.188, time/batch = 2.571
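
Log lines in this shape are easy to summarize with a short script. The following is a minimal sketch, not part of the original training code: the names `parse_line` and `summarize` are hypothetical, and it assumes that `time/batch` is reported in seconds and that the first two numbers are the current batch and the total number of batches.

```python
import re

# Matches lines such as:
# "8600/33992 (epoch 2), train_loss = 1.209, time/batch = 2.264"
LINE_RE = re.compile(
    r"(\d+)/(\d+) \(epoch (\d+)\), "
    r"train_loss = ([\d.]+), time/batch = ([\d.]+)"
)


def parse_line(line):
    """Return one log line as a dict, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "step": int(m.group(1)),
        "total": int(m.group(2)),
        "epoch": int(m.group(3)),
        "loss": float(m.group(4)),
        "time": float(m.group(5)),
    }


def summarize(lines):
    """Average the loss and batch time, and estimate the time remaining."""
    records = [r for r in map(parse_line, lines) if r is not None]
    if not records:
        return None
    mean_loss = sum(r["loss"] for r in records) / len(records)
    mean_time = sum(r["time"] for r in records) / len(records)
    last = records[-1]
    remaining = last["total"] - last["step"]
    return {
        "mean_loss": mean_loss,
        "mean_time": mean_time,
        "remaining_batches": remaining,
        # Assumption: time/batch is in seconds.
        "remaining_hours": remaining * mean_time / 3600.0,
    }


if __name__ == "__main__":
    sample = [
        "8600/33992 (epoch 2), train_loss = 1.209, time/batch = 2.264",
        "8601/33992 (epoch 2), train_loss = 1.189, time/batch = 2.205",
    ]
    print(summarize(sample))
```

Averaging over a window of lines is useful here because the per-batch loss is noisy: in the output above it moves between 1.156 and 1.276 within sixteen consecutive batches.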