Ben Zamanzadeh - Data Science @ Datapop - LA Data Science meetup - January 2015

Marketing Knowledge Graph: Data Science & Machine Learning
Dr. Ben Zamanzadeh, CTO, DataPop

2 Evolution of eCommerce
• Distributed Store Front (Omni-Channel Retail)
  – Official Site, Search & PLA, Mobile, Social, Affiliates, Ad Networks, Shopping Sites, Digital Catalogs, Streaming Networks, Gaming & TV Consoles, Brick2Click, etc.
• Digital Merchandising
  – Manage online Perception of Products & Brands
• Smart Catalog
  – Integrate Knowledge of Consumer with Catalog
• Rapidly Changing Landscape
  – Dynamic Market Conditions
  – New Advertising methods, New Channels
  – More Sophisticated competition
• Large & Complex Operational & Data Scale

3 Dark Data
Lack of visibility caused by:
• Big Dark Data
• Massive amount of noisy & complex data
• Waves in the Ocean Dilemma
• Disjointed operations
• Machine Learning models lack context
• Humans needed to generate Actionable Insight

4 Improvement Cycle
• Convert Big Data into Knowledge Graph
• Generate Actionable Insights
• Semantic Advertising
[Figure label: Beats Studio Over-Ear Headphones – Red – Beats by Dre]

5 Marketing Knowledge Graph & Semantic Advertising
• Marketing Knowledge Graph
  – The Marketing Knowledge Graph (MKG) consists of a semantic network of brands, products, retailers, consumer intent & sentiments, in addition to the advertising performance statistics and history for all publishers and channels.
  – The Marketing Knowledge Graph is a knowledge base used by DataPop’s Semantic Advertising to generate and enhance Advertising Campaigns.
• Semantic Advertising
  – Semantic Advertising achieves meaningful and optimized advertising through the use of semantic networks linking brands, products, and retailers to consumer intent and sentiments using performance metrics (AKA Marketing Knowledge Graph).

6 Difference Between MKG and Search Knowledge Graph
• Search Knowledge Graph
  – Used for Semantic Search: search ranking algorithms use it to return the most relevant results for search queries. It is also intended to engage the searcher to explore the Knowledge Graph for better answers. The KG is a loosely connected network of a vast number of entities spread across a very wide range of topics.
• Marketing Knowledge Graph
  – Used for online Marketing & Advertising of Products and Services. It is used by data mining and analysis systems as well as advertising campaign management systems.
  – MKG is a tightly coupled semantic network, limited to products, brands, consumer intent, retailers and various advertising channels & publishers. Advertising performance is a major component of MKG, and it changes rapidly due to changes in consumer behavior and the eCommerce competitive landscape.

7 Marketing Knowledge Graph
[Figure labels: Most Pinned Product, Halloween Mindset, Price Mindset ($)]

8 Going Strong with Marketing Knowledge Graph

9 Our R&D Focus: Natural Language Processing & Machine Learning
• Named Entity Recognition
  – Conditional Random Fields
  – Support Vector Machine
• Supervised Category Recognition
  – Support Vector Machine
  – Max Entropy
  – Naïve Bayes
  – Ensemble Methods
• Unsupervised Linguistic Category Derivation
• Topic Modeling, Explicit Semantic Analysis
  – Semantic Relatedness
  – Semantic “Area Density”, Semantic Distance

10 Comparison of Statistical Methods
• Max Entropy
  – Bag of words and features, minimize unobserved assumptions
  – Maximize Information Entropy: closest to uniformity
• Advantages
  – Good for a large set of categories
  – Fairly fast predictor
  – Works better with mildly noisy data
  – Does not assume independence of words in the bag of words
  – Worked well with a training set of 3 M samples
  – Confidence score has an almost linear relationship with F1
• Disadvantages
  – Slower training
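As a rough illustration of the Max Entropy bullets above, here is a minimal sketch of a maximum entropy classifier in its common form, multinomial logistic regression over bag-of-words counts. It is not DataPop's implementation: the function names, the toy product data, and the training settings are all assumptions made for this example, and the "confidence score" is assumed to mean the probability of the predicted label.

```python
import math
from collections import defaultdict

def predict_proba(weights, words):
    """Softmax over per-class sums of word weights (bag of words)."""
    scores = {c: sum(wts.get(w, 0.0) for w in words) for c, wts in weights.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exps.values())
    return {c: e / z for c, e in exps.items()}

def train_maxent(docs, labels, epochs=200, lr=0.5):
    """Fit weights by batch gradient ascent on the log-likelihood.

    With all weights at zero the model is uniform over classes, which is
    the maximum entropy distribution when nothing has been observed.
    """
    classes = sorted(set(labels))
    weights = {c: defaultdict(float) for c in classes}
    for _ in range(epochs):
        grads = {c: defaultdict(float) for c in classes}
        for words, gold in zip(docs, labels):
            probs = predict_proba(weights, words)
            for c in classes:
                # observed feature count minus expected feature count
                err = (1.0 if c == gold else 0.0) - probs[c]
                for w in words:
                    grads[c][w] += err
        for c in classes:
            for w, g in grads[c].items():
                weights[c][w] += lr * g / len(docs)
    return weights

def classify(weights, words):
    """Return (label, confidence), confidence = probability of the label."""
    probs = predict_proba(weights, words)
    label = max(probs, key=probs.get)
    return label, probs[label]

# Hypothetical toy catalog data, for illustration only.
docs = [["red", "headphones"], ["wireless", "headphones"],
        ["running", "shoes"], ["leather", "shoes"]]
labels = ["audio", "audio", "footwear", "footwear"]
model = train_maxent(docs, labels)
print(classify(model, ["red", "shoes"]))
```

The iterative fitting loop is also where the "slower training" disadvantage comes from: unlike simple counting, every epoch re-scores the whole training set.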

11 Max Entropy Confidence Score

12 Comparison of Statistical Methods
• Support Vector Machines
  – Vector-based solutions, using Cosine Similarity
  – Establish support vectors that isolate each category hyper-plane
• Advantages
  – Strong learner (usually the best of the vector-based methods)
  – Fast training and prediction
  – F1 similar to Max Ent (slightly below)
  – Good for text, image, bioinformatics
• Disadvantages
  – Complex training, Optimization
  – Scores not as useful
  – Sensitive to noise
  – Better with longer documents, more signal
  – Better with a smaller number of classes
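The slide above mentions cosine similarity over vector representations. The sketch below shows only that similarity measure on bag-of-words count vectors; it is not an SVM trainer, and the helper names and whitespace tokenization are assumptions for illustration.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Lowercase, whitespace-tokenized term counts (a simplifying assumption)."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)

print(cosine_similarity(bag_of_words("red headphones"),
                        bag_of_words("red shoes")))
```

With only a couple of words per document, a single shared or missing term moves the score a lot, which is consistent with the slide's point that vector-based methods do better with longer documents and more signal.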

13 Comparison of Statistical Methods
• Naïve Bayes
  – Bag of words, Statistical counting
• Disadvantages
  – Weaker learner at 3 M samples
  – Assumes words are independent in the bag of words (not true)
• Advantages
  – Adaptive
• Recent Research on Training Data Size
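To make "bag of words, statistical counting" concrete, here is a minimal sketch of a multinomial Naïve Bayes classifier with add-one smoothing. The function names and toy data are assumptions for illustration, not the setup used in the talk.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(docs, labels):
    """Training is pure counting: documents per class and words per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in zip(docs, labels):
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab

def classify_nb(model, words):
    """Pick the class with the highest log prior plus summed log likelihoods."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for w in words:
            # Each word contributes independently (the "naive" assumption);
            # add-one smoothing avoids log(0) for unseen words.
            score += math.log((word_counts[label][w] + 1)
                              / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy catalog data, for illustration only.
docs = [["red", "headphones"], ["wireless", "headphones"],
        ["running", "shoes"], ["leather", "shoes"]]
labels = ["audio", "audio", "footwear", "footwear"]
nb_model = train_naive_bayes(docs, labels)
print(classify_nb(nb_model, ["red", "shoes"]))
```

Because training only increments counters, new labeled examples can be folded in without refitting, which is one plausible reading of the "Adaptive" advantage.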

14 ü  “Good Data” is the King ü  Unsupervised ü 
Non-Human Supervised ü  Max Entropy ü  SVM ü  Ensemble ü  CRF ü  Mix of Rules & ML What Doesn’t Work Ⓧ Naïve Bayes Ⓧ Human Gen Supervised Ⓧ Standalone methods Ⓧ Just use ML methods Ⓧ Biased Data Ⓧ Noisy Data Named Entity Recognition & Categorization What Works
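One way to read "Ensemble" and "Mix of Rules & ML" is sketched below: several classifiers vote with their confidence scores, and a hand-written keyword rule, when it fires, takes precedence. The combination scheme, the function names, and the example rule are all assumptions for illustration; the slides do not say how the methods were actually combined.

```python
from collections import defaultdict

def weighted_vote(predictions):
    """Combine (label, confidence) pairs, one per classifier.

    Each classifier adds its confidence to its label's total; the label
    with the largest total wins. Labels are sorted so ties break
    deterministically.
    """
    totals = defaultdict(float)
    for label, confidence in predictions:
        totals[label] += confidence
    return max(sorted(totals), key=lambda label: totals[label])

def classify_with_rules(text, rules, predictions):
    """Apply keyword rules first, then fall back to the ensemble vote."""
    lowered = text.lower()
    for keyword, label in rules.items():
        if keyword in lowered:
            return label
    return weighted_vote(predictions)

# Hypothetical classifier outputs and rule, for illustration only.
votes = [("audio", 0.9), ("footwear", 0.6), ("footwear", 0.5)]
print(weighted_vote(votes))
print(classify_with_rules("Beats Studio Over-Ear Headphones",
                          {"beats": "audio"}, votes))
```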

15 Sample Problem for Topic Modelling

16 Statement of problem: Lobbyist on the Senate Floor
• Lobbyist Objective
  – Lobbyist gets five minutes on the Senate floor to address senators
  – Lobbyist is given 5 different subjects that he can pitch to the senators
  – He can only pitch one of the subjects
  – He does not know who is in the room ahead of time; senators are scattered around the room and are busy talking to each other in groups
• He is given a few seconds and a fast laptop
  – Either select a group of senators and join them to present his case (any of the 5). He has to choose the most influential group that is also most interested in one of the subjects.
  – Or go to the podium and address all senators, taking the risk that he may lose the whole audience quickly if the majority of the room is not interested.
• What he knows about each senator
  – Level of influence, subcommittee membership, past voting history
  – Which subjects the senator is interested in (from 1000’s of possible subjects)
  – What each group is talking about right now (conversation subject)
  – He can also see who is in each group (Floor map)
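The decision described above can be sketched as a small scoring problem. The scoring rule used here (sum of the influence of interested senators, with the podium allowed only when a strict majority of the room is interested) is an assumption made for illustration; the slide states the problem but not a solution, and the sketch ignores voting history and current conversation subjects.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Senator:
    name: str
    influence: float
    interests: frozenset  # subjects this senator cares about

def audience_score(senators, subject):
    """Total influence of the senators interested in the subject."""
    return sum(s.influence for s in senators if subject in s.interests)

def best_pitch(groups, subjects):
    """Return (score, target, subject); target is a group name or 'podium'.

    groups maps a group name to the senators standing in it (the floor map).
    """
    options = []
    for name, members in groups.items():
        for subject in subjects:
            options.append((audience_score(members, subject), name, subject))
    room = [s for members in groups.values() for s in members]
    for subject in subjects:
        interested = [s for s in room if subject in s.interests]
        # Assumed rule: the podium is only worth the risk when a strict
        # majority of the room is interested in the subject.
        if 2 * len(interested) > len(room):
            options.append((audience_score(room, subject), "podium", subject))
    return max(options)

# Hypothetical floor map, for illustration only.
a = Senator("A", 5.0, frozenset({"energy", "tax"}))
b = Senator("B", 4.0, frozenset({"energy"}))
c = Senator("C", 1.0, frozenset({"health"}))
d = Senator("D", 1.0, frozenset({"tax"}))
e = Senator("E", 1.0, frozenset({"health"}))
floor = {"g1": [a, b], "g2": [c, d, e]}
print(best_pitch(floor, ["energy", "tax", "health"]))
```

The search is exhaustive over groups and subjects, which is cheap for 5 subjects and a handful of groups and fits the "few seconds and a fast laptop" constraint.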

17 Visualization of Semantic “Area Density”

18 Technology Stacks & Languages

We’re Looking
Research Engineer Position
Must Have Experience: Java, UIMA, Mallet, ClearTK NER, Ensemble Methods

Thank You Dr. Ben Zamanzadeh
