How to apply the NLP model to advertising

Goal • How to apply the NLP model to advertising
• Why BERT

Agenda • NLP models • Transformer • BERT • Sequence
analysis vs. human analysis • Advertising based on the analyzed human

Who am I? • Used Collaborative-Filtering for Recommendation • Predicted
correct/incorrect answers for TOEIC • Learned “Transformer” from NLP experts • seq-to-seq • LSTM • encoder/decoder • RNN • query/key/value

Recurrent Neural Network • Because of their internal memory, RNNs can remember important
things about the input they received, which allows them to be very precise in predicting what’s coming next. This is why they’re the preferred algorithm for sequential data like time series, speech, text, financial data, audio, video, weather and much more.
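The "internal memory" described on this slide can be sketched as a hidden state that is fed back into the network at every step. The snippet below is a minimal illustration only, assuming a scalar Elman-style cell with made-up weights (`w_x`, `w_h`, `b` are hypothetical values, not taken from the talk).

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent step: the new state mixes the current input with the old state."""
    return math.tanh(w_x * x + w_h * h_prev + b)


def run_rnn(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Run the cell over a sequence and return the hidden state after each step."""
    h = 0.0  # initial memory is empty
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


# Only the first input is non-zero, yet later states are still non-zero:
# the hidden state carries information about earlier inputs forward.
states_with_signal = run_rnn([1.0, 0.0, 0.0])
states_without_signal = run_rnn([0.0, 0.0, 0.0])
print(states_with_signal)
print(states_without_signal)
```

The remembered signal shrinks at each step in this sketch, which hints at why long sequences are hard for a plain RNN.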

Recurrent Neural Network

sequence-to-sequence 나는 학교에 간다 I go to school seq2seq

sequence-to-sequence 나는 학교에 간다 I go to school seq2seq Encoder
Decoder Context

sequence-to-sequence 나는 학교에 간다 I go to school seq2seq Encoder
Decoder Context RNN RNN RNN RNN I h1 go h2 to h3 school h4 Context Vector Context Vector RNN RNN RNN 나는 학교에 간다

Vanishing Gradients Problem

Long short-term memory

sequence-to-sequence (LSTM) 나는 학교에 간다 I go to school seq2seq Encoder
Decoder Context LSTM LSTM LSTM LSTM I h1 go h2 to h3 school h4 Context Vector Context Vector LSTM LSTM LSTM 나는 학교에 간다

Long sentence problem

StateOfTheArt

Transformer

Transformer (Attention Is All You Need) • Abstract  “The dominant sequence
transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. …”
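The attention mechanism the abstract refers to is scaled dot-product attention over queries, keys and values: softmax(QK^T / sqrt(d_k)) V. The snippet below is a minimal single-query sketch in plain Python; the toy vectors are invented for illustration and nothing here is taken from the talk's own code.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    d_k = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k)
        for key in keys
    ]
    weights = softmax(scores)
    dim = len(values[0])
    # The output is the weighted average of the value vectors.
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return output, weights


# Toy example: the query matches the first and third keys best.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
output, weights = attention([1.0, 0.0], keys, values)
print(weights)
print(output)
```

Every position is compared with every other position in one step, so no recurrence is needed to relate distant words.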

Transformer Architecture

Attention

Top-down / Bottom-up

Top-down attention

Attention map

Transformer Architecture I go to school 나는 학교에 간다

Transformer Architecture 비가 온다 it is raining

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding •
Abstract  “… Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. …”
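"Conditioning on both left and right context" can be pictured as an attention mask in which every token may look at every other token, in contrast to a left-to-right model that only sees earlier tokens. The sketch below is a simplification with hypothetical helper names; BERT itself is trained with masked language modeling, which this snippet does not implement.

```python
def causal_mask(n):
    """Left-to-right mask: position i may attend only to positions 0..i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def bidirectional_mask(n):
    """Bidirectional mask: every position may attend to every position."""
    return [[1] * n for _ in range(n)]


def visible_context(tokens, mask, position):
    """Return the tokens that the given position is allowed to attend to."""
    return [tok for tok, allowed in zip(tokens, mask[position]) if allowed]


tokens = ["I", "go", "to", "school"]
left_only = visible_context(tokens, causal_mask(len(tokens)), 1)
both_sides = visible_context(tokens, bidirectional_mask(len(tokens)), 1)
print(left_only)   # context available to "go" in a left-to-right model
print(both_sides)  # context available to "go" in a bidirectional model
```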

BERT input representation

Fine-tuning tasks

Q&A about NLP models

Sequence vs Human

Human by meta data

Mutable vs Immutable

Handle Immutable object

Behavior

Intention

Use case

References • ViT • CLIP • BERT • GPT-3 •
Transformer • RNN • Sequence to Sequence • https://arxiv.org/pdf/1409.0473.pdf • https://arxiv.org/pdf/1905.10949.pdf • https://www.youtube.com/watch?v=I1wJ_-kEvNQ
