Deep neural networks have revolutionized image processing, but these successes could not simply be transferred to text processing until 2017, when the Transformer was introduced. This new architecture has brought great advances in many areas of language processing, and some tasks, such as generating compelling text, have only now become possible. Together we look at how transfer learning and attention work in the Transformer architecture.