
Building a Neural Machine Translation System from Scratch

Human languages are complex, diverse and riddled with exceptions – translating between different languages is therefore a highly challenging technical problem. Deep learning approaches have proved powerful in modelling the intricacies of language, and have surpassed all statistics-based methods for automated translation. This session begins with an introduction to the problem of machine translation and discusses the two dominant neural architectures for solving it – recurrent neural networks and transformers. A practical overview of the workflow involved in training, optimising and adapting a competitive neural machine translation system is provided. Attendees will gain an understanding of the internal workings and capabilities of state-of-the-art systems for automatic translation, as well as an appreciation of the key challenges and open problems in the field.


nslatysheva

May 07, 2019

Transcript

1. Building a Neural Machine Translation System from Scratch. Deep Learning World 2019, Munich. Natasha Latysheva, Welocalize
2. This talk • 1. Introduction to machine translation • 2. Data for machine translation • 3. Representing words with embeddings • 4. Deep learning architectures • Recurrent neural networks • Transformers • 5. Some fun things about MT • 6. Tech stack for machine translation
3. Intro • Welocalize • Language services • 1500+ employees • 8th largest globally, 4th largest US • NLP engineering team • 13 people • Remote across US, Ireland, UK, Germany, China
4. Intro • Lots of localisation (translation) • Often for tech companies • Also do: life sciences, banking, patent/legal • International marketing, site optimisation • Various NLP things: text-to-speech, sentiment, topics, classification, NER, etc.
5. What is machine translation? • Automated translation between languages • MT challenges: • Language is very complex, flexible with lots of exceptions • Language pairs might be very different • Lots of “non-standard” usage • Not always a lot of data • But if people can do it, a model should be able to learn to do it
6. What is machine translation? • Automated translation between languages • MT challenges: • Language is very complex, flexible with lots of exceptions • Language pairs might be very different • Lots of “non-standard” usage • Not always a lot of data • But if people can do it, a model should be able to learn to do it • Why bother? • Huge industry and market demand because communication is important • Humans are expensive and slow • Research side: understanding language is probably key to intelligence
7. Rule-based MT • Very manual, laborious. Hand-crafted rules by expert linguists. • Early focus on Russian. E.g. translate English “much” or “many” into Russian (Jurafsky and Martin, Speech and Language Processing, chapter 25)
8. Data for machine translation • Parallel texts, bitexts, corpora • You need a lot of data (millions of decent-length sentence pairs) to build decent neural systems • Increasing amount of freely-available parallel data (curated or scraped or both)
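As a minimal illustration of the bitext format described on slide 8 (not from the talk; the file names are hypothetical), parallel data is typically distributed as two line-aligned files, one sentence per line, where line N of the source file is the translation of line N of the target file:

```python
# Minimal sketch (hypothetical file names): a bitext is two line-aligned files.
from itertools import islice

def load_bitext(src_path, tgt_path, limit=None):
    """Yield (source, target) sentence pairs from two line-aligned files."""
    with open(src_path, encoding="utf-8") as src, open(tgt_path, encoding="utf-8") as tgt:
        for src_line, tgt_line in islice(zip(src, tgt), limit):
            yield src_line.strip(), tgt_line.strip()

pairs = list(load_bitext("train.en", "train.fr", limit=3))
# e.g. [("Hello world", "Bonjour le monde"), ...]
```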
9. Neural machine translation • Dominant architecture is an encoder-decoder • Based on recurrent neural networks (RNNs) • Or the Transformer
10. Word embeddings • ML models can’t process text strings directly in any meaningful way • Need to find a way to represent words as numbers • And hopefully the numbers are linguistically meaningful in some way
11. Simplest way to encode words • Your vocabulary is all the possible words • Each word is assigned an integer index
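A rough sketch of this integer-index encoding, assuming a toy corpus and the common (but not talk-specified) convention of reserving indices for padding and unknown words:

```python
# Build a vocabulary that maps each word to an integer index (toy corpus).
corpus = ["the cat sat on the mat", "the dog sat"]

vocab = {"<pad>": 0, "<unk>": 1}          # reserved indices, an illustrative convention
for sentence in corpus:
    for word in sentence.split():
        vocab.setdefault(word, len(vocab))  # assign the next free index to unseen words

def encode(sentence):
    return [vocab.get(word, vocab["<unk>"]) for word in sentence.split()]

print(encode("the cat sat"))   # [2, 3, 4]
```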
12. Properties of word embeddings • Similar words cluster together • The values are coordinates in a high-dimensional semantic space • Not easily interpretable
13. Where do the embeddings come from? • Which embedding? • FastText > GloVe > word2vec • This is (shallow) transfer learning in NLP (Google ML blog post)
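One way to poke at pretrained embeddings and the clustering property mentioned on slide 12 is via the gensim library and its downloadable GloVe vectors; this is an illustrative choice, not the talk's tooling:

```python
# Inspect pretrained embeddings: similar words sit near each other in the vector space.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-100")        # 100-dimensional GloVe vectors

print(vectors.most_similar("translation", topn=3))   # nearby words in the embedding space
print(vectors["translation"].shape)                  # (100,) coordinate vector for one word
```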
14. Calculating embeddings • word2vec skip-gram example • Train a shallow net to predict a surrounding word, given a word • Take the hidden layer weight matrix, treat it as coordinates • So the goal is actually just to learn this hidden layer weight matrix; we don’t care about the output layer (Chris McCormick blog post)
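A hedged sketch of learning skip-gram embeddings with gensim (assuming gensim 4.x, where the parameter is vector_size rather than the older size); it stands in for the shallow network described above rather than reproducing it by hand:

```python
# Train skip-gram word vectors on a toy corpus; sg=1 selects the skip-gram objective.
from gensim.models import Word2Vec

sentences = [["the", "cat", "sat", "on", "the", "mat"],
             ["the", "dog", "sat", "on", "the", "rug"]]

model = Word2Vec(sentences, vector_size=50, window=2, sg=1, min_count=1, epochs=50)

# model.wv holds the learned "hidden layer" weight matrix: one vector per word.
print(model.wv["cat"].shape)         # (50,)
print(model.wv.most_similar("cat"))
```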
16. Recurrent Neural Networks (RNNs) • Why not feed-forward networks for translation? • Words aren’t independent • Third word really depends on first and second • Similar to how conv nets capture interdependence of neighbouring pixels
17. Pictures of RNNs • Main idea behind RNNs is that you’re allowed to reuse information from previous time steps to inform predictions of the current time step
18. Standard FF network to RNN • At each time step, RNN passes on its activations from previous time step for next time step to use • Parameters governing the connection are shared between time steps
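A minimal numpy sketch of the recurrence described on slides 17 and 18: the same weight matrices are reused at every time step, and the hidden state h carries information forward (the dimensions here are arbitrary):

```python
# Vanilla RNN step: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b), with weights shared across steps.
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim, seq_len = 8, 16, 5

W_xh = rng.normal(scale=0.1, size=(hidden_dim, input_dim))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))  # hidden -> hidden (shared)
b = np.zeros(hidden_dim)

xs = rng.normal(size=(seq_len, input_dim))   # a toy input sequence (e.g. word embeddings)
h = np.zeros(hidden_dim)                     # initial hidden state

for x_t in xs:
    h = np.tanh(W_xh @ x_t + W_hh @ h + b)   # activations passed on to the next step

print(h.shape)   # (16,) final hidden state summarising the sequence
```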
19. More layers! • Increase number of layers to increase capacity for abstraction, hierarchical processing of input
20. Almost there… • Bidirectionality and attention • For the encoder, bidirectional RNNs (BRNNs) often used • BRNNs read the input text forwards and backwards
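A hedged tf.keras sketch of a bidirectional recurrent encoder of the kind slide 20 mentions; the vocabulary size and layer widths are arbitrary, and this is not the talk's own model code:

```python
# Bidirectional LSTM encoder: one hidden state per source position, read in both directions.
import tensorflow as tf

vocab_size, emb_dim, hidden_dim = 10_000, 256, 512

encoder = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, emb_dim),                # token ids -> embeddings
    tf.keras.layers.Bidirectional(                                 # forwards and backwards
        tf.keras.layers.LSTM(hidden_dim, return_sequences=True)),  # keep a state per time step
])

token_ids = tf.constant([[12, 7, 451, 3, 0, 0]])   # one padded toy sentence
states = encoder(token_ids)
print(states.shape)   # (1, 6, 1024): forward and backward states concatenated
```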
21. Trouble with memorising long passages • For long sentences, we’re asking the encoder-decoder to read the entire English sentence, memorise it, then write it back in French • Condense everything down to a small vector?! • The issue is that the decoder needs different information at different timesteps but all it gets is this vector • Not really how human translators work
22. The problem with RNN encoder-decoders • Serious information bottleneck • Condense all source input down to a small vector?! • Long computation paths
23. Some ways of handling long sequences • Long-range dependencies • LSTMs, GRUs • Meant for long-range memory… but it’s still very difficult (Colah’s blog, “Understanding LSTMs”)
24. Some ways of handling long sequences • Reverse source sentence (feed it in backwards) • Kind of a hack… works for English→French, what about Japanese? • Feed sentence in twice
25. Attention Idea • Has been very influential in deep learning • Originally developed for MT (Bahdanau, 2014) • As you’re producing your output sequence, maybe not every part of your input is equally relevant • Image captioning example (Lu et al. 2017, Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning)
26. Attention intuition • Attention allows the network to refer back to the input sequence, instead of forcing it to encode all information into one fixed-length vector
27. Attention intuition • Encoder: use a BRNN to compute a rich set of features about source words and their surrounding words • Decoder: use another RNN to generate output as before • Decoder is asked to choose which hidden states to use and ignore • Weighted sum of hidden states used to predict the next word
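A minimal numpy sketch of the weighted sum described on slide 27; dot-product scoring is used here for simplicity, whereas Bahdanau-style attention scores each position with a small learned network:

```python
# Attention as a weighted sum of encoder hidden states, weighted by relevance to the decoder.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(6, 512))   # one hidden state per source time step
decoder_state = rng.normal(size=512)         # current decoder hidden state

scores = encoder_states @ decoder_state      # how relevant is each source position?
weights = softmax(scores)                    # attention weights, sum to 1
context = weights @ encoder_states           # weighted sum of encoder hidden states

print(weights.round(2), context.shape)       # e.g. [0.1 0.4 ...] (512,)
```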
28. Attention intuition • Decoder RNN uses attention parameters to decide how much to pay attention to input features • Allows the model to amplify the signal from relevant parts of the source sequence • This improves translation
29. Main differences and benefits • Encoder passes a lot more data to the decoder • Not just the last hidden state • Passes all hidden states at every time step • Computation path from the relevant information is a lot shorter
30. Transformers • Paradigm shift in sequence processing • People were convinced you needed recurrence or convolutions to learn interdependence • RNNs were the best way to capture time-dependent patterns, like in language • Transformers use only attention to do the same job
31. Transformer intuition • Also have an encoder-decoder structure • In RNNs, hidden state incorporates context • In transformers, self-attention incorporates context
32. Transformer intuition • Self-attention • Instead of processing input tokens one by one, attention takes in the set of input tokens • Learns the dependencies between all of them using three learned weight matrices (key, value, query) • Makes better use of GPU resources
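A hedged numpy sketch of single-head scaled dot-product self-attention using the query/key/value matrices slide 32 names; real Transformers add multiple heads, masking and output projections on top of this:

```python
# Single-head self-attention: every token attends to every token in one matrix multiply.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 5, 64, 64

X = rng.normal(size=(seq_len, d_model))        # embeddings for all tokens at once
W_q = rng.normal(scale=0.1, size=(d_model, d_k))
W_k = rng.normal(scale=0.1, size=(d_model, d_k))
W_v = rng.normal(scale=0.1, size=(d_model, d_k))

Q, K, V = X @ W_q, X @ W_k, X @ W_v            # queries, keys, values
attn = softmax(Q @ K.T / np.sqrt(d_k))         # pairwise attention weights
out = attn @ V                                 # context-aware token representations

print(attn.shape, out.shape)                   # (5, 5) (5, 64)
```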
33. Transformers now often SOTA • You can often get a couple of points of improvement by switching to Transformers for your machine translation system
34. Tokenisation and sub-word embeddings • Maybe our embeddings should reflect that different forms of the same word are related • “walking” and “walked” • Translation works better if you can incorporate some morphological knowledge • Can be learned, or linguistic knowledge can be baked in (Cotterell and Schütze, 2018, Joint Semantic Synthesis and Morphological Analysis of the Derived Word)
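A hedged example of learned sub-word tokenisation using the sentencepiece library, one common choice that the talk does not prescribe; the corpus and model file names are hypothetical, and the exact sub-word splits depend on the training data:

```python
# Learn a BPE sub-word vocabulary, then split related word forms into shared pieces.
import sentencepiece as spm

# Train a small BPE model on a plain-text corpus, one sentence per line (hypothetical file).
spm.SentencePieceTrainer.train(
    input="corpus.en", model_prefix="bpe_en", vocab_size=8000, model_type="bpe")

sp = spm.SentencePieceProcessor(model_file="bpe_en.model")
print(sp.encode("walking", out_type=str))   # e.g. ['▁walk', 'ing'] (shares a piece with "walked")
print(sp.encode("walked", out_type=str))
```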
35. Frameworks • How low-level do you go? • Implementing backprop and gradient checking yourself in numpy • … • Clicking ‘Train’ and ‘Deploy’ in a GUI • You probably want to be somewhere in between • Around a dozen open-source NMT implementations exist • Nematus, OpenNMT, tensorflow-seq2seq, Marian, fairseq, Tensor2Tensor
36. Recommendations • Python scientific stack • OpenNMT-tf (TensorFlow version) • TensorBoard monitoring is great • Good checkpointing, automatic evaluation during training • Get some GPUs if you can • 3x GeForce GTX Titan X take 2-3 days to train a decent Transformer model • AWS or GCP good but can be expensive • Docker containers
37. Statistical machine translation • Gather lots of counts, frequencies • How often n-grams in the source language map to n-grams in the target language • Bayes’ Rule to calculate probabilities • p(f) is the language model over the target language • The p(e) denominator can be ignored
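Written out, the noisy-channel decomposition the slide refers to, with e the source sentence and f the candidate translation (matching the slide's notation), is:

```latex
% Noisy-channel formulation of statistical MT: e = source sentence, f = target translation.
\hat{f} = \arg\max_{f} \; p(f \mid e)
        = \arg\max_{f} \; \frac{p(e \mid f)\, p(f)}{p(e)}
        = \arg\max_{f} \; \underbrace{p(e \mid f)}_{\text{translation model}} \,
                          \underbrace{p(f)}_{\text{language model}}
% The denominator p(e) does not depend on f, so it can be ignored in the arg max.
```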
38. Main differences and benefits • Attention weights can be used to draw alignments • In languages that are aligned, like Romance languages, attention will likely choose to align things sequentially (Bahdanau, 2014, Neural Machine Translation by Jointly Learning to Align and Translate)
39. Actually generating translations • What you want in machine translation is to generate the best (most probable) translation • Greedy: pick the most likely first word, then pick the most likely second word, etc.
40. Actually generating translations • Given that you want to maximise the probability of the whole output sequence… • Not always optimal to pick one word at a time • Can’t exhaustively search every combination of words either • Approximate search algorithm: beam search
41. Beam search: actually generating translations • Beam width of 3 • At each time step, keep the 3 best candidate translations so far
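A hedged Python sketch of beam search as described on slides 40 and 41; the next-word distribution here is a toy stand-in, whereas a real decoder would score continuations with the trained translation model:

```python
# Beam search: keep only the beam_width highest-scoring partial translations at each step.
import math

def beam_search(next_word_probs, start, end, beam_width=3, max_len=10):
    beams = [([start], 0.0)]                  # (tokens so far, log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for word, prob in next_word_probs(tokens).items():
                candidates.append((tokens + [word], score + math.log(prob)))
        # keep only the best beam_width candidates
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        finished.extend(b for b in beams if b[0][-1] == end)
        beams = [b for b in beams if b[0][-1] != end]
        if not beams:
            break
    return max(finished or beams, key=lambda c: c[1])

# Toy distribution: always proposes the same three continuations.
toy = lambda tokens: {"le": 0.5, "chat": 0.3, "</s>": 0.2}
print(beam_search(toy, "<s>", "</s>", beam_width=3))
```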