Building A Bot on Elasticsearch

In this talk I describe how we built the Digit bot using Elasticsearch and Mongo.

Emile Baizel

May 18, 2017

Transcript

  1. How users engage with Digit

    User: "I’d like to withdraw $20 from Digit."
    Digit: "Got it. I will withdraw $20 for you."
    User: "How does Digit know what to save?"
    Digit: "Digit studies your average spending…"
  2. In the beginning… only one-word commands

    withdraw -> "I will withdraw $20 for you."
    save -> "How much do you want to save?"
    balance -> "Your Digit balance is $100."
    checking -> "Your checking is $1000."
  3. What people actually message in…

    "i wanna take out money"
    "save me more, digit"
    "whats my balance yo"
    "what is checking thx"
  4. What people actually message in… irrelevant messages

    "tell me a joke!"
    "happy mothers day!"
    "are you a real person?"
    "sigh"
  5. What people actually message in… completely irrelevant texts

    "i was going to pick Sharon up from the nail salon at 3 but after last night she can find her own way home. i’m over it!"
  6. How do we go from here to there?

    From: withdraw, save, balance, checking
    To: "i wanna take out money", "save me more, digit", "whats my balance yo", "what is checking thx"
  7. Two Primary Constraints of Building a Bot

    Zero false positives! ‘i don’t understand’ is better than a wrong answer.
    You only get one chance to respond.
  8. Five Point Plan to Success

    1. Gather and map actual user messages
    2. Index messages into Elasticsearch (ES)
    3. Query ES
    4. Test it
    5. Ship it
  9. 1: Gather and Map in Mongo

    Map messages to the correct response.
    Everything is in Mongo.
    Create an admin UI for this.
    Example: user message "i wanna take out money" -> response "withdraw"
  10. 2: Index Messages in Elasticsearch

    ES is designed for searching.
    Deconstructs documents into tokens.
    Very rich query language.
    Scores search results.
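One way to picture the hand-off from step 1 to step 2 is to turn each mapped message into a pair of lines for the Elasticsearch Bulk API, which expects newline-delimited JSON. This is only a sketch under assumptions: the record shape, the `message` and `response` field names, and the `messages` index name are hypothetical and do not come from the talk.

```python
import json

# Hypothetical records, shaped like the message-to-response mapping from step 1.
mapped_messages = [
    {"message": "i wanna take out money", "response": "withdraw"},
    {"message": "whats my balance yo", "response": "balance"},
]


def to_bulk_body(records, index_name="messages"):
    """Build a newline-delimited JSON body for the Bulk API.

    Each record becomes two lines: an action line, then the document itself.
    The index name is an assumption made for this sketch.
    """
    lines = []
    for record in records:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(record))
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


bulk_body = to_bulk_body(mapped_messages)
print(bulk_body)
```

The resulting string could be sent to the `_bulk` endpoint with any HTTP client. Older Elasticsearch releases also expected a document type in the action line, so the exact shape depends on the version in use.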
  11. 2: Index Messages in Elasticsearch

    Tokenizing options (many more…)
    Stemming: accounting -> account
    Lowercase: BaLaNcE -> balance
    Stop words (ignored): please, thanks
    Contractions: will not, would not -> won’t
    Reply shown: "Your Digit balance is $150."
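The tokenizing options above map fairly directly onto a custom analyzer in the index settings. The following is a minimal sketch, not the configuration used for Digit: the names `bot_message`, `polite_stop`, and `english_stemmer` are made up for illustration, while `standard`, `lowercase`, `stop`, and `stemmer` are built-in Elasticsearch components. Contraction handling is left out.

```python
import json

# Hypothetical index settings: lowercase, drop polite stop words, then stem.
index_settings = {
    "settings": {
        "analysis": {
            "filter": {
                # Stop words from the slide; a real list would be longer.
                "polite_stop": {"type": "stop", "stopwords": ["please", "thanks"]},
                # Built-in stemmer token filter configured for English.
                "english_stemmer": {"type": "stemmer", "language": "english"},
            },
            "analyzer": {
                "bot_message": {
                    "type": "custom",
                    "tokenizer": "standard",
                    # Filters run in order, so lowercasing happens first.
                    "filter": ["lowercase", "polite_stop", "english_stemmer"],
                }
            },
        }
    }
}

print(json.dumps(index_settings, indent=2))
```

A body like this would be supplied when creating the index, and the analyzer would then be referenced from the mapping of the field that holds the user message.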
  12. 3: Query Elasticsearch

    Term query, Range query, Match query, Fuzzy query, Query string, Prefix query, Common terms query …and many more (visit elastic.co)
  13. 3: Query Elasticsearch

    Term query, Range query, Match query, Fuzzy query, Query string, Prefix query, Common terms query …and many more (visit elastic.co)
  14. Term Query

    Simple. Existence of terms in a document.
    Example:
    Indexed: Tell me about your security?
    Tokens: tell, me, about, your, secure
    Response: Your money is FDIC insured.
  15. Term Query: Example

    "Tell me about your security" -> "Your money is FDIC insured."
    "Tell me are you secure?" -> "Your money is FDIC insured."
  16. Term Query

    "Tell me a joke!" -> "Your money is FDIC insured."
    Unfortunately, it’s not enough. Matches too easily on `tell, me`.
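The false positive above can be reproduced with a toy model of term matching in plain Python. This is an illustration of the failure mode only, not how Elasticsearch is implemented: the `tokenize` and `term_match` helpers are hypothetical, and the tokenizer skips stemming and stop words.

```python
import re


def tokenize(text):
    """Lowercase and split on non-letters; a crude stand-in for an analyzer."""
    return set(re.findall(r"[a-z]+", text.lower()))


# One indexed message and its mapped response, taken from the slides.
indexed = {
    "Tell me about your security?": "Your money is FDIC insured.",
}


def term_match(message):
    """Return the response of the first indexed message sharing any token."""
    query_tokens = tokenize(message)
    for indexed_message, response in indexed.items():
        if query_tokens & tokenize(indexed_message):
            return response
    return None


# "tell" and "me" overlap, so the joke request wrongly gets the security reply.
print(term_match("Tell me a joke!"))
```

Because any shared token counts as a match, common words such as "tell" and "me" are enough to trigger a response, which is exactly the behaviour the zero-false-positives constraint rules out.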
  17. Fuzzy Search

    Matches misspellings.
    Example:
    withdrae -> withdraw
    balancd -> balance
    decure -> secure
  18. Fuzzy Search: Example

    "I want to withdrae $20 from Digit." -> "Got it. I will withdraw $20 for you."
    "balancd" -> "Your Digit balance is $150."
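Fuzzy matching in Elasticsearch is based on edit distance, so each misspelling in these examples is one edit away from the intended word. The sketch below computes plain Levenshtein distance to make that concrete, then shows what a fuzzy query body might look like. The `message` field name is an assumption, and the helper is illustrative rather than the engine's own implementation.

```python
def edit_distance(a, b):
    """Levenshtein distance using a rolling one-row dynamic programme."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


for typo, word in [("withdrae", "withdraw"), ("balancd", "balance"), ("decure", "secure")]:
    print(typo, "->", word, "distance", edit_distance(typo, word))

# Hypothetical query body: match terms within one edit of "balancd".
fuzzy_query = {
    "query": {"fuzzy": {"message": {"value": "balancd", "fuzziness": 1}}}
}
```

A `fuzziness` of 1 tolerates a single typo per term. Larger values catch more misspellings but also raise the risk of matching unrelated words.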
  19. Fuzzy Search

    Matches misspellings.
    Has a cute name.
    But not super helpful.
  20. Fuzzy Search

    Matches misspellings.
    Has a cute name.
    But not super helpful.
  21. Common Terms Query

    Weight words based on their frequency.
    High frequency words: i, a, the, can, want
    Low frequency words: balance, account, checking
  22. Common Terms Query

    Accuracy improves with document count.
    Your query specifies weight and scoring.
    Example:
    - Use 3% as the cutoff for high vs low
    - Require 60% of low frequency words to be present
  23. Common Terms Query: Example

    Query: what is my balance
    High freq: what, is, my
    Low freq: balance
    Document matches w/ scores:
    what is my balance (5.0)
    what is the balance in digit (3.0)
    what is your name (0.5)
    Response: "Your Digit balance is $150."
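A request body for this kind of search might look like the sketch below, using the 3% cutoff and the 60% low-frequency requirement from the slides. The `message` field name is hypothetical. The `common` query with `cutoff_frequency` and a per-group `minimum_should_match` existed in the Elasticsearch versions current at the time of the talk; it was later deprecated and removed, so this should be checked against the documentation for the version in use, including whether a percentage is accepted for `low_freq`.

```python
# Hypothetical common terms query for the example message.
common_terms_query = {
    "query": {
        "common": {
            "message": {
                "query": "what is my balance",
                # Terms found in more than 3% of documents count as high frequency.
                "cutoff_frequency": 0.03,
                # At least 60% of the low frequency terms must be present.
                "minimum_should_match": {"low_freq": "60%"},
            }
        }
    }
}
```

With these settings, words like "what", "is", and "my" only adjust the score, while "balance" decides whether a document matches at all.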
  24. 4: Test It

    Test for correct matches.
    Test for false positives.
    Test for unknowns.
    Adjust scores, weights to find right balance.
    We decided to do filter, fuzzy, and common queries with every search.
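Testing for false positives usually comes down to choosing a score below which the bot declines to answer. The sketch below shows that idea under stated assumptions: the `choose_response` helper, the hit structure, and the threshold of 2.0 are hypothetical, and the scores reuse the common terms example.

```python
FALLBACK = "I don't understand."


def choose_response(hits, min_score):
    """Return the best hit's response only if its score clears min_score.

    Anything weaker falls back to "I don't understand", since a wrong
    answer is worse than no answer.
    """
    if not hits:
        return FALLBACK
    best = max(hits, key=lambda hit: hit["score"])
    return best["response"] if best["score"] >= min_score else FALLBACK


# Hypothetical hits, each carrying the response mapped to the matched message.
strong = [
    {"score": 5.0, "response": "Your Digit balance is $150."},
    {"score": 0.5, "response": "I'm Digit."},
]
weak = [{"score": 0.5, "response": "I'm Digit."}]

print(choose_response(strong, min_score=2.0))
print(choose_response(weak, min_score=2.0))
```

Raising `min_score` trades missed matches for fewer wrong answers, which is the balance the test step is meant to tune.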
  25. 5: Ship It

    Continue adding to the cortex of responses.
    Slow roll it out to your users, ~10%.
    Qualitatively measure accuracy.