
Deep Learning

Lucy
September 13, 2016


Transcript

  1. Agenda What is deep learning? Why is this important? How

    to apply deep learning to natural language processing?
  2. About Me: Sihem Romdhani. Software engineer at Veeva. MEng in
     Computer Engineering; MSc in ECE at the University of Waterloo. Interests: deep learning, speech/image recognition.
  3. What is machine learning? Training: text (documents, images,
     sounds, ...) plus labels → feature vectors → machine learning algorithm.
  4. What is machine learning? Training: text (documents, images,
     sounds, ...) plus labels → feature vectors → machine learning algorithm → predictive model. Prediction: new text (documents, images, sounds, ...) → feature vector → predictive model → expected label.
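The train/predict loop on these two slides can be sketched end to end in a few lines. This is a toy illustration, not the talk's implementation: the documents, labels, and the nearest-centroid "learning algorithm" are all my own placeholders standing in for the generic pipeline (documents + labels → feature vectors → algorithm → predictive model → expected label).

```python
# Toy sketch of the slides' pipeline: documents + labels -> feature
# vectors -> learning algorithm -> predictive model -> expected label.
# All data and the nearest-centroid classifier are illustrative.
from collections import Counter

def featurize(text, vocab):
    """Map a document to a fixed-size count vector over `vocab`."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

# Training data: "documents, images, sounds, ..." reduced to toy text.
train_docs = ["cheap pills buy now", "meeting agenda for monday",
              "buy cheap now", "monday project meeting"]
train_labels = ["spam", "ham", "spam", "ham"]

vocab = sorted({w for d in train_docs for w in d.lower().split()})
X = [featurize(d, vocab) for d in train_docs]

# "Learning algorithm": average the vectors of each class (nearest centroid).
centroids = {}
for label in set(train_labels):
    rows = [x for x, y in zip(X, train_labels) if y == label]
    centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

def predict(text):
    """Predictive model: label of the closest class centroid."""
    v = featurize(text, vocab)
    dist = lambda c: sum((a - b) ** 2 for a, b in zip(v, c))
    return min(centroids, key=lambda lbl: dist(centroids[lbl]))

print(predict("buy pills now"))   # -> spam
```

The point of the sketch is the shape of the pipeline, not the classifier: swap in any featurizer and any learning algorithm and the train/predict split stays the same.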
  5. What is deep learning? A deep neural network (DNN) learns
     layers of abstraction from the input layer to the output layer: image (pixels) → edges → object parts → object models.
  6. Speech technology. DeepMind beats the best at Go. Nvidia
     drives a car with the Drive PX 2 supercomputer. Apple’s Siri. Amazon’s Alexa.
  7. Which is human? Which is machine? SONG A: “I’m in a love
     affair / I can’t share / it ain’t fair / Haha I’m just playin’ ladies you know I love you / I know my love is true and I know you love me too / Girl I’m down for whatever cause my love is true.” SONG B: “For a chance at romance I would love to enhance / But everything I love has turned to a tedious task / One day we gonna have to leave our love in the past / I love my fans but no one ever puts a grasp.”
  8. Human language is difficult! Variability: kitty vs. cat. Ambiguity: blue
     jays vs. Blue Jays. Vague and latent meaning: “Eating babies can be messy.”
  9. Word embeddings: ‘CAT’ → [0, 0.5, 0, 0.2, 0.1, 0];
     ‘KITTY’ → [0.1, 0, 0.3, 0, 0.4, 0.1]. Similarity?
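The natural way to score the "Similarity?" question on this slide is cosine similarity between the two embeddings. A minimal sketch, assuming the twelve numbers on the slide split into two 6-dimensional vectors (that split is my reading of the transcript, not stated explicitly):

```python
# Cosine similarity between the slide's two toy embeddings.
# The 6-dim split of the twelve numbers is an assumption.
import math

cat   = [0.0, 0.5, 0.0, 0.2, 0.1, 0.0]
kitty = [0.1, 0.0, 0.3, 0.0, 0.4, 0.1]

def cosine(u, v):
    """Cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

print(round(cosine(cat, kitty), 3))   # -> 0.141
```

A score near 1 would mean near-identical directions; these random-looking toy vectors score low, which is exactly the problem the next slide's context-based training is meant to fix.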
  10. “The cat/kitty purrs.” “The cat/kitty hunts mice.” Context defines
      similarity: infer the meaning of words from the company they keep.
  11. Word2Vec (Google) for learning embeddings. 1. Collect a large corpus of
      text. 2. Randomly initialize the embeddings: each word is mapped to a fixed-size embedding with random values. 3. Set up the prediction task: for each word, pick a random word from its context and treat it as the target (label). INPUT WORD → TARGET WORD, e.g. ‘Cat’ → ‘Hunt’. 4. Learn the embeddings: train a model (logistic classifier) to predict the target (the neighbours of each word) and update the embeddings.
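Step 3 above, generating (input word, target word) training pairs from context windows, is easy to make concrete. A minimal sketch; the corpus and window size are illustrative, not from the talk:

```python
# Skip-gram-style pair generation (step 3 of the Word2Vec recipe):
# for each word, pick one random word from its context window as the target.
import random

corpus = "the cat hunts mice the kitty hunts mice".split()
window = 2          # illustrative context radius
random.seed(0)      # for reproducibility of this sketch

pairs = []
for i, word in enumerate(corpus):
    # context positions within `window` of i, excluding i itself
    lo, hi = max(0, i - window), min(len(corpus), i + window + 1)
    context = [corpus[j] for j in range(lo, hi) if j != i]
    target = random.choice(context)   # one random neighbour as the label
    pairs.append((word, target))

print(pairs[:3])
```

Each pair then feeds step 4: the classifier is trained to predict the target from the input word's embedding, and the gradient updates push words with similar contexts toward similar embeddings.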
  12. Word2Vec process: the one-hot vector for ‘cat’ over the
      vocabulary, [0 0 0 0 0 1 0 0 0 0], selects its embedding; a linear model wVcat + b produces a predicted target, which is compared with the real target ‘hunt’.
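A useful detail behind this slide: multiplying a one-hot vector by the embedding matrix just selects one row, which is why embeddings are implemented as table lookups rather than matrix products. A sketch with illustrative numbers (the vocabulary and matrix are mine, not the talk's):

```python
# One-hot times embedding matrix == row lookup. Numbers are illustrative.
vocab = ["the", "cat", "hunts", "mice"]

# Embedding matrix V: one row per vocabulary word (4 words x 3 dims here).
V = [
    [0.1, 0.2, 0.3],
    [0.5, 0.1, 0.4],
    [0.2, 0.2, 0.1],
    [0.3, 0.4, 0.2],
]

one_hot = [1 if w == "cat" else 0 for w in vocab]   # [0, 1, 0, 0]

# Full matrix product one_hot @ V ...
matmul = [sum(h * V[i][d] for i, h in enumerate(one_hot)) for d in range(3)]
# ... equals a direct row lookup:
lookup = V[vocab.index("cat")]

print(matmul == lookup)   # -> True
```

The linear layer wVcat + b on the slide then operates on that looked-up row, not on the sparse one-hot vector.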
  13. Extracting addresses from webpages. Abbreviations: Avenue - Ave - Ave.
      Ambiguity in names: main vs. Main Street. Missing elements, misspellings, misordered elements.
  14. Solution architecture: WEB PAGE (HTML) → TOKENIZATION → WORD REPRESENTATION /
      WORD EMBEDDING → DEEP NEURAL NETWORK CLASSIFIER → POST-PROCESSING → OUTPUT: EXTRACTED ADDRESS.
  19. Training a DNN classifier. Example: “123 Main Street, Florida 32209”. The
      context words “123 Street Florida 32209” surround the central word “Main”; the network (Hidden Layers 1-3) labels the central word A (Address) or O (Other, non-address).
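The DNN input described on this slide, the concatenated embeddings of a window around the central word, can be sketched directly. The embeddings, window size, and padding scheme below are illustrative assumptions, not details from the talk:

```python
# Build the classifier input as on the slide: concatenate the embeddings
# of the words around a central word. Embeddings, window size, and the
# zero-padding at sentence edges are illustrative choices.
tokens = ["123", "Main", "Street", ",", "Florida", "32209"]

# Toy 3-dim embedding table keyed by token (values are arbitrary).
emb = {t: [round(0.1 * i + 0.01 * d, 2) for d in range(3)]
       for i, t in enumerate(tokens)}

window = 2                  # two words on either side of the centre
PAD = [0.0, 0.0, 0.0]       # padding for positions past the edges

def dnn_input(tokens, centre):
    """Concatenated embeddings of the 2*window+1 words centred on `centre`."""
    vec = []
    for j in range(centre - window, centre + window + 1):
        vec += emb[tokens[j]] if 0 <= j < len(tokens) else PAD
    return vec

x = dnn_input(tokens, tokens.index("Main"))
print(len(x))   # (2*window + 1) * embedding_dim = 15
```

The fixed-length vector `x` is what feeds the hidden layers; the network's output for it is the A/O label of the central word.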
  20. Address extraction algorithm. Input: trained DNN model and unlabeled web
      page. Output: extracted address.
      1. Load the trained DNN model
      2. for each new web page do
      3.   Pre-process the web page to remove HTML tags, then tokenize
      4.   Replace each token (word) by its embedding vector
      5.   Create the DNN input by concatenating the embeddings of the context
      6.   The DNN labels the central word of the context as A (ADDRESS) or O (OTHER)
      7.   Extract the sequences of tokens with the A label
      8.   if the total number of tokens is within range then
      9.     Output the token block as the extracted address
      10.  end if
      11. end for
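The extraction loop above can be sketched in Python. Everything here is a placeholder for illustration: `stub_model` stands in for the trained DNN, the regex tokenizer is naive, and the length range is invented; only the control flow (strip HTML, tokenize, label, collect A-runs, filter by length) follows the slide:

```python
# Sketch of the address extraction algorithm. The DNN is replaced by a
# keyword/digit stub; tokenizer and length range are illustrative.
import re

def strip_html(page):
    """Crude HTML tag removal (placeholder for real pre-processing)."""
    return re.sub(r"<[^>]+>", " ", page)

def stub_model(tokens, i):
    """Placeholder for the DNN: label token i as 'A' or 'O'."""
    t = tokens[i].strip(",.")
    addressy = {"Main", "Street", "Florida"}          # toy heuristic
    return "A" if t in addressy or t.isdigit() else "O"

def extract_address(page, min_len=3, max_len=10):
    tokens = strip_html(page).split()
    labels = [stub_model(tokens, i) for i in range(len(tokens))]
    # Collect maximal runs of A-labelled tokens.
    runs, run = [], []
    for tok, lab in zip(tokens, labels):
        if lab == "A":
            run.append(tok)
        elif run:
            runs.append(run)
            run = []
    if run:
        runs.append(run)
    # Keep only runs whose token count is within range.
    return [" ".join(r) for r in runs if min_len <= len(r) <= max_len]

page = "<p>Visit us at 123 Main Street, Florida 32209 today</p>"
print(extract_address(page))   # -> ['123 Main Street, Florida 32209']
```

Swapping `stub_model` for the real embedding-plus-DNN labeler from slides 19-20 leaves the rest of the loop unchanged, which is the point of the pipeline design.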
  21. Applications of NLP: information extraction, machine translation, social media analytics,
      question answering, natural language generation, speech recognition.