Ben Zamanzadeh - Data Science @ Datapop - LA Data Science meetup - January 2015

Data Science LA
January 20, 2015

Transcript

  1. Marketing Knowledge Graph
    Data Science & Machine Learning
    Dr. Ben Zamanzadeh
    CTO DataPop

  2. Evolution of eCommerce
    • Distributed Store Front (Omni-Channel Retail)
    – Official Site, Search & PLA, Mobile, Social, Affiliates, Ad Networks,
    Shopping Sites, Digital Catalogs, Streaming Networks, Gaming & TV
    Consoles, Brick2Click, etc.
    • Digital Merchandising
    – Manage online Perception of Products & Brands
    • Smart Catalog
    – Integrate Knowledge of Consumer with Catalog
    • Rapidly Changing Landscape
    – Dynamic Market Conditions
    – New Advertising Methods, New Channels
    – More Sophisticated Competition
    • Large & Complex Operational & Data Scale

  3. Dark Data
    Lack of visibility caused by:
    • Big Dark Data
    • Massive amounts of noisy & complex data
    • "Waves in the Ocean" dilemma
    • Disjointed operations
    • Machine learning models lack context
    • Humans are needed to generate actionable insight

  4. Improvement Cycle
    Convert Big Data into Knowledge Graph
    Generate Actionable Insights Semantic Advertising
    Beats Studio Over-Ear Headphones – Red – Beats by Dre

  5. Marketing Knowledge Graph & Semantic Advertising
    • Marketing Knowledge Graph
    – The Marketing Knowledge Graph (MKG) consists of a semantic network
    of brands, products, retailers, consumer intent & sentiments, in
    addition to the advertising performance statistics and history for all
    publishers and channels.
    – The MKG is the knowledge base used by DataPop’s Semantic Advertising
    to generate and enhance advertising campaigns.
    • Semantic Advertising
    – Semantic Advertising achieves meaningful and optimized
    advertising through the use of semantic networks linking
    brands, products, and retailers to consumer intent and
    sentiments using performance metrics (the Marketing
    Knowledge Graph).
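
One way to picture the semantic network described on this slide is as a small typed graph whose edges can carry advertising statistics. The following is a minimal sketch under stated assumptions, not DataPop's implementation: the class names, relation labels, entity types and the impression/click numbers are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Edge:
    """A typed, directed link between two entities, with optional ad statistics."""
    source: str
    relation: str
    target: str
    stats: dict = field(default_factory=dict)  # hypothetical keys: impressions, clicks


class MarketingGraph:
    """Toy semantic network: entities are nodes, relations are labelled edges."""

    def __init__(self):
        self.entity_types = {}             # entity name -> type label
        self.outgoing = defaultdict(list)  # entity name -> list of Edge

    def add_entity(self, name, entity_type):
        self.entity_types[name] = entity_type

    def link(self, source, relation, target, **stats):
        self.outgoing[source].append(Edge(source, relation, target, dict(stats)))

    def neighbors(self, source, relation=None):
        """Targets reachable from `source`, optionally filtered by relation."""
        return [e.target for e in self.outgoing[source]
                if relation is None or e.relation == relation]

    def click_through_rate(self, source, target):
        """Clicks divided by impressions, summed over all edges source -> target."""
        edges = [e for e in self.outgoing[source] if e.target == target]
        impressions = sum(e.stats.get("impressions", 0) for e in edges)
        clicks = sum(e.stats.get("clicks", 0) for e in edges)
        return clicks / impressions if impressions else 0.0


graph = MarketingGraph()
graph.add_entity("Beats by Dre", "brand")
graph.add_entity("Beats Studio Over-Ear Headphones", "product")
graph.add_entity("gift shopping", "consumer_intent")
graph.add_entity("search ads", "channel")

graph.link("Beats Studio Over-Ear Headphones", "made_by", "Beats by Dre")
graph.link("gift shopping", "interested_in", "Beats Studio Over-Ear Headphones")
# Made-up numbers, only to show performance statistics attached to an edge.
graph.link("Beats Studio Over-Ear Headphones", "advertised_on", "search ads",
           impressions=1000, clicks=25)

print(graph.neighbors("Beats Studio Over-Ear Headphones", "made_by"))
print(graph.click_through_rate("Beats Studio Over-Ear Headphones", "search ads"))
```

Keeping the statistics on the edges rather than on the nodes reflects the slide's point that performance is tracked per publisher and channel.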

  6. Difference Between MKG and Search Knowledge Graph
    • Search Knowledge Graph
    – Used for semantic search: search ranking algorithms use it
    to return the most relevant results for search queries. It is also
    intended to engage the searcher in exploring the Knowledge Graph for
    better answers. The KG is a loosely connected network of a vast number
    of entities spread across a very wide range of topics.
    • Marketing Knowledge Graph
    – Used for online marketing & advertising of products and services. It
    is used by data mining and analysis systems as well as advertising
    campaign management systems.
    – The MKG is a tightly coupled semantic network, limited to
    products, brands, consumer intent, retailers and various advertising
    channels & publishers. Advertising performance is a major
    component of the MKG, and it changes rapidly due to changes in
    consumer behavior and the eCommerce competitive landscape.

  7. Marketing Knowledge Graph
    [Figure labels: Most Pinned Product; Halloween Mindset; Price Mindset ($)]

  8. Going Strong with Marketing Knowledge Graph

  9. Our R&D Focus: Natural Language Processing & Machine Learning
    • Named Entity Recognition
    – Conditional Random Fields
    – Support Vector Machine
    • Supervised Category Recognition
    – Support Vector Machine
    – Max Entropy
    – Naïve Bayes
    – Ensemble Methods
    • Unsupervised Linguistic Category Derivation
    • Topic Modeling, Explicit Semantic Analysis
    – Semantic Relatedness
    – Semantic “Area Density”, Semantic Distance

  10. Comparison of Statistical Methods
    • Max Entropy
    – Bag of words and features; minimizes unobserved
    assumptions
    – Maximizes information entropy: stays closest to uniformity
    • Advantages
    – Good for a large set of categories
    – Fairly fast predictor
    – Works better with mildly noisy data
    – Does not assume independence of words in the bag of words
    – Worked well with a 3M-sample training set
    – Confidence score has an almost linear relationship with F1
    • Disadvantages
    – Slower training
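
A maximum-entropy text classifier is equivalent to multinomial logistic regression over bag-of-words features. The sketch below is a toy pure-Python version, assuming made-up training phrases and category names; it is not the model or the toolkit used in the talk. It shows the two properties the slide highlights: before training the model stays at the uniform (maximum-entropy) distribution, and after training it returns a confidence score alongside the label.

```python
import math
from collections import defaultdict


def softmax(scores):
    """Turn per-category scores into probabilities that sum to 1."""
    top = max(scores.values())
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: v / total for c, v in exps.items()}


def entropy(probs):
    """Shannon entropy in bits; it is largest for a uniform distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


class MaxEntClassifier:
    def __init__(self, categories):
        self.categories = list(categories)
        # One weight per (category, word); all zeros means "assume nothing".
        self.weights = {c: defaultdict(float) for c in self.categories}

    def predict_proba(self, words):
        scores = {c: sum(self.weights[c][w] for w in words)
                  for c in self.categories}
        return softmax(scores)

    def train(self, examples, epochs=50, learning_rate=0.5):
        """Stochastic gradient ascent on the log-likelihood."""
        for _ in range(epochs):
            for words, label in examples:
                probs = self.predict_proba(words)
                for c in self.categories:
                    error = (1.0 if c == label else 0.0) - probs[c]
                    for w in words:
                        self.weights[c][w] += learning_rate * error

    def classify(self, words):
        """Return (best category, confidence score)."""
        probs = self.predict_proba(words)
        best = max(probs, key=probs.get)
        return best, probs[best]


# Hypothetical training data, for illustration only.
examples = [
    ("red over ear headphones".split(), "audio"),
    ("wireless bluetooth headphones".split(), "audio"),
    ("running shoes size ten".split(), "footwear"),
    ("leather boots brown".split(), "footwear"),
]

model = MaxEntClassifier(["audio", "footwear"])
untrained = model.predict_proba(["headphones"])  # uniform before any evidence
model.train(examples)
label, confidence = model.classify("red headphones".split())
print(entropy(untrained.values()), label, round(confidence, 3))
```

The iterative weight updates are also why training is slower than simple counting methods, which matches the disadvantage listed on the slide.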

  11. Max Entropy Confidence Score

  12. Comparison of Statistical Methods
    • Support Vector Machines
    – Vector-based solutions, using cosine similarity
    – Establish support vectors that isolate each category's hyperplane
    • Advantages
    – Strong learner (usually best among vector-based methods)
    – Fast training and prediction
    – F1 similar to Max Ent (slightly below)
    – Good for text, image, bioinformatics
    • Disadvantages
    – Complex training, optimization
    – Scores not as useful
    – Sensitive to noise
    – Better with longer documents, more signal
    – Better with a smaller number of classes
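
The vector-based view mentioned on this slide can be illustrated with cosine similarity over bag-of-words vectors. This is a minimal sketch of the similarity measure only; it does not train an SVM, and the function names and example phrases are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Represent a text as a sparse vector of lowercase word counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


query = bag_of_words("red beats headphones")
same_topic = bag_of_words("beats studio headphones")
other_topic = bag_of_words("brown leather boots")

sim_same = cosine_similarity(query, same_topic)
sim_other = cosine_similarity(query, other_topic)
print(sim_same, sim_other)
```

Short texts share few words, so their vectors are often nearly orthogonal; this is one way to read the slide's note that the approach works better with longer documents and more signal.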

  13. Comparison of Statistical Methods
    • Naïve Bayes
    – Bag of words, statistical counting
    • Disadvantages
    – Weaker learner at 3M samples
    – Assumes words in the bag of words are independent
    (not true)
    • Advantages
    – Adaptive
    • Recent Research on Training Data Size
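
The "statistical counting" and the independence assumption can be made concrete with a small multinomial Naïve Bayes classifier using add-one smoothing. This is a toy sketch with hypothetical training phrases, not the system from the talk. Reading "Adaptive" as "can absorb new examples by updating counts" is an assumption on our part.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    def __init__(self):
        self.doc_counts = Counter()              # category -> number of documents
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.vocabulary = set()

    def update(self, words, label):
        """Training is pure counting, so examples can be added at any time."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(words)
        self.vocabulary.update(words)

    def log_scores(self, words):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # log prior
            for w in words:
                # Independence assumption: per-word log-likelihoods simply add.
                score += math.log((counts[w] + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def classify(self, words):
        scores = self.log_scores(words)
        return max(scores, key=scores.get)


# Hypothetical training data, for illustration only.
nb = NaiveBayes()
nb.update("red over ear headphones".split(), "audio")
nb.update("wireless bluetooth headphones".split(), "audio")
nb.update("running shoes size ten".split(), "footwear")
nb.update("leather boots brown".split(), "footwear")

prediction = nb.classify("bluetooth headphones".split())
print(prediction)
```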

  14. Named Entity Recognition & Categorization
    What Works
    ✓ “Good Data” is the King
    ✓ Unsupervised
    ✓ Non-Human Supervised
    ✓ Max Entropy
    ✓ SVM
    ✓ Ensemble
    ✓ CRF
    ✓ Mix of Rules & ML
    What Doesn’t Work
    Ⓧ Naïve Bayes
    Ⓧ Human Gen Supervised
    Ⓧ Standalone methods
    Ⓧ Just use ML methods
    Ⓧ Biased Data
    Ⓧ Noisy Data

  15. Sample Problem for Topic Modelling

  16. Statement of Problem: Lobbyist on the Senate Floor
    • Lobbyist Objective
    – Lobbyist gets five minutes on the Senate floor to address senators
    – Lobbyist is given 5 different subjects that he can pitch to the senators
    – He can only pitch one of the subjects
    – He does not know who is in the room ahead of time; senators are scattered
    around the room and are busy talking to each other in groups.
    • He is given a few seconds and a fast laptop
    – Either select a group of senators and join them to present his case (any of
    the 5). He has to choose the most influential group that is also most
    interested in one of the subjects.
    – Or go to the podium and address all senators, taking the risk that he may
    lose the whole audience quickly if the majority of the room is not interested.
    • What he knows about each senator
    – Level of influence, subcommittee membership, past voting history
    – Which subjects the senator is interested in (from 1000’s of possible subjects)
    – What each group is talking about right now (conversation subject)
    – He can also see who is in each group (floor map)
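
The decision in this problem can be sketched as a brute-force search over venues and subjects. The scoring rule below is an assumption made for illustration, not the method from the talk: a group scores the summed influence of its members interested in the pitched subject, and the podium is considered only when a strict majority of the room is interested. It ignores the current conversation subject and voting history, and all names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Senator:
    name: str
    influence: float      # hypothetical scale: higher means more influential
    interests: frozenset  # subjects this senator cares about


def interested_influence(senators, subject):
    """Total influence of the senators who care about `subject`."""
    return sum(s.influence for s in senators if subject in s.interests)


def best_option(groups, pitch_subjects):
    """Return (score, venue, subject) for the best single pitch.

    `groups` maps a group name to the list of Senator objects standing in it.
    """
    room = [s for members in groups.values() for s in members]
    options = []
    for subject in pitch_subjects:
        for name, members in groups.items():
            options.append((interested_influence(members, subject), name, subject))
        # Assumed rule: the podium pays off only with a majority of the room.
        interested = sum(1 for s in room if subject in s.interests)
        if interested * 2 > len(room):
            options.append((interested_influence(room, subject), "podium", subject))
    return max(options)


groups = {
    "A": [Senator("s1", 5.0, frozenset({"energy", "trade"})),
          Senator("s2", 3.0, frozenset({"energy"}))],
    "B": [Senator("s3", 2.0, frozenset({"health"})),
          Senator("s4", 1.0, frozenset({"trade"})),
          Senator("s5", 1.0, frozenset({"health"}))],
}

choice = best_option(groups, ["energy", "health", "trade"])
print(choice)
```

With thousands of possible interest subjects per senator, exact subject matching like this becomes too sparse to be useful, which is presumably where topic modeling and semantic relatedness come in.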

  17. Visualization of Semantic “Area Density”

  18. Technology Stacks & Languages

  19. We’re Looking
    Research Engineer Position
    Must Have Experience:
    Java, UIMA, Mallet, ClearTK
    NER, Ensemble Methods

  20. Thank You
    Dr. Ben Zamanzadeh
