Slide 1

Slide 1 text

x.ai
A personal assistant who schedules meetings for you
Machine Learning NYC, February 2015, New York City

Slide 2

Slide 2 text

Outline of talk
● Product Intro
● Conceptual overview
● Current level of human involvement
● Time expressions in email text
● Performance
● Email classification
● Future Work

Slide 3

Slide 3 text

Product Intro
● What is Amy (or twin brother Andrew)? Amy is an automated assistant who schedules meetings for you
● What is meeting scheduling?
● What does scheduling a meeting involve?
● Meeting negotiations happen over email
● What is not Amy
● Using Amy
[Diagram labels: AI trainer, AI]

Slide 4

Slide 4 text

Amy is a conversation model
[Diagram labels: Host, CC Amy, Amy, Guests]
● New Meeting request
● Propose time to participant
● Request location from host
● Reject time / propose time
● Accept Time
● Location sent
Meeting state needs to be determined so that it can be resolved

Slide 5

Slide 5 text

Conceptual Overview
[Diagram labels: Initial State; New Meeting preferences; Intermediate States; Accept Time; Accept Location; Meeting Invite]
Changes in meeting state trigger actions
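The idea that a change of meeting state triggers an action can be sketched as a small state machine. This is only an illustrative sketch, not x.ai's implementation: the intent names ("accept_time", "reject_time", "location_sent") and the transition table are assumptions loosely based on the labels on the slides, and the accept-location step is folded into the final transition for brevity.

```python
from enum import Enum


class MeetingState(Enum):
    NEW_MEETING = "new meeting"        # initial state
    TIME_ACCEPTED = "accept time"      # intermediate state
    MEETING_INVITE = "meeting invite"  # final state


# Hypothetical table: (current state, classified intent) -> (next state, action)
TRANSITIONS = {
    (MeetingState.NEW_MEETING, "reject_time"):
        (MeetingState.NEW_MEETING, "propose time to participant"),
    (MeetingState.NEW_MEETING, "accept_time"):
        (MeetingState.TIME_ACCEPTED, "request location from host"),
    (MeetingState.TIME_ACCEPTED, "location_sent"):
        (MeetingState.MEETING_INVITE, "send meeting invite"),
}


def step(state, intent):
    """Return (next_state, action); unknown pairs keep the state and trigger nothing."""
    return TRANSITIONS.get((state, intent), (state, None))


def run(intents, state=MeetingState.NEW_MEETING):
    """Feed a sequence of classified email intents through the state machine."""
    actions = []
    for intent in intents:
        state, action = step(state, intent)
        if action is not None:
            actions.append(action)
    return state, actions
```

Calling run(["reject_time", "accept_time", "location_sent"]) walks a meeting from the initial state to the invite, collecting one action per state change.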

Slide 6

Slide 6 text

Current human involvement
→ The role of the AI Trainer
[Diagram labels: Architectural design; Natural Language Processing; Information Extraction; Email Classification; Preference analysis]

Slide 7

Slide 7 text

Actual example of the simplest kind
Host: "Sure thing. I’ve Cc’d Amy who can help us find a time with Matt on Monday"
→ Classification (request meeting). Information extraction
Amy: "Hi Matt, Happy to get something on Dennis’ calendar. Does Monday, Oct 13 at 11:00 AM EDT work? Alternatively, Dennis is available on Monday, Oct 13 at 1:00 PM or 2:00 PM. Dennis’ office is at 48 Wall Street, New York, NY 10005 (5th Floor). Amy"
→ Calendar preferences. Availability
Guest: "1-2 works"
→ Classification (accept time). Information extraction

Slide 8

Slide 8 text

Time expressions in Email text
Temporal expression challenges:
1. Detection: "lets do Tuesday at around 4"
2. Type: Hour of Day, Day of Week, etc ...
3. Coreference: Merge Day of Week and Time
4. Resolution: February 25th, 2015 at 13:00 EST
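The resolution step, turning a merged day-of-week and hour into a concrete date, can be sketched as follows. This is a minimal sketch under stated assumptions: the function name resolve_weekday_time is hypothetical, weekdays follow Python's convention (0 = Monday), the expression is assumed to refer to the next occurrence after a reference time such as the email's send time, and time zones are ignored.

```python
from datetime import datetime, timedelta


def resolve_weekday_time(weekday, hour, reference):
    """Resolve (weekday, hour) to the next matching datetime after `reference`.

    weekday uses datetime.weekday() numbering: 0 = Monday ... 6 = Sunday.
    """
    days_ahead = (weekday - reference.weekday()) % 7
    candidate = (reference + timedelta(days=days_ahead)).replace(
        hour=hour, minute=0, second=0, microsecond=0
    )
    # If the slot on the reference day has already passed, roll to next week.
    if candidate <= reference:
        candidate += timedelta(days=7)
    return candidate
```

For an email sent on Friday, 20 February 2015 at 09:30, resolve_weekday_time(1, 16, datetime(2015, 2, 20, 9, 30)) yields Tuesday, 24 February 2015 at 16:00. Reading "around 4" as 16:00 rather than 04:00 is itself an assumption that a real system would have to infer from context.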

Slide 9

Slide 9 text

Our Dataset
Large dataset of fully annotated emails
⇒ We have undertaken a massive human annotation campaign in which we fully annotate all emails going through our system to enable machine learning / training
- Times, People, Location, Intents, etc …
← various frequencies for different cases (showing arbitrary slices)

Slide 10

Slide 10 text

Our Dataset

Slide 11

Slide 11 text

Temporal expression challenges: Data-driven solutions
1. Detection: Tokens. Regex-based model combined with POS taggers (Conditional Random Fields). SUTime library
2. Type: We built our own set of cases on top of Timex3
3. Coreference: Defined a set of closed operators on time ‘cases’ to check whether they should be “merged” or not
4. Resolution: Next slide … :-) Use context !
Example of type merge operation: TimeConstraint(13:00) + DayOfWeek(2) = WeekDayTime(2,13:00)
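The merge example on this slide can be sketched with a few small types. The class names TimeConstraint, DayOfWeek and WeekDayTime come from the slide; everything else (the fields, the merge function, returning None when two cases should not be merged) is an assumption made for illustration. The operator is "closed" in the sense that merging two time cases yields another time case.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class TimeConstraint:
    time: str  # e.g. "13:00"


@dataclass(frozen=True)
class DayOfWeek:
    day: int  # numbering convention is not given on the slide; kept opaque


@dataclass(frozen=True)
class WeekDayTime:
    day: int
    time: str


TimeCase = Union[TimeConstraint, DayOfWeek, WeekDayTime]


def merge(a: TimeCase, b: TimeCase) -> Optional[TimeCase]:
    """Merge two time cases, or return None when they should stay separate."""
    if isinstance(a, TimeConstraint) and isinstance(b, DayOfWeek):
        return WeekDayTime(b.day, a.time)
    if isinstance(a, DayOfWeek) and isinstance(b, TimeConstraint):
        return WeekDayTime(a.day, b.time)
    return None  # no merge rule defined for this pair in the sketch
```

With these definitions, merge(TimeConstraint("13:00"), DayOfWeek(2)) returns WeekDayTime(2, "13:00"), mirroring the slide's example, while two DayOfWeek cases are left unmerged.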

Slide 12

Slide 12 text

Our current approach ...
→ break complex logic into a set of simple binary questions
[Decision-tree diagram with Yes / No branches; labels: Detected time entities; Resolved times with email intent]

Slide 13

Slide 13 text

This is where singular focus helps ...
[Decision-tree diagram with Yes / No branches; questions: "Accept or decline time ?", "Matches to any ?", "Previous proposed times ?"]
● Context
● Fuzzy matching
● Machine Learning
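A chain of binary questions combined with fuzzy matching against previously proposed times might look like the sketch below. It is an illustration only: the function names, the 30-minute tolerance and the returned labels ("accept_time", "counter_proposal", "new_time") are assumptions, and the time mentions are assumed to have been resolved to datetimes already.

```python
from datetime import datetime, timedelta


def match_proposed(mentioned, proposed, tolerance=timedelta(minutes=30)):
    """Return the proposed times lying within `tolerance` of any mentioned time."""
    return [p for p in proposed if any(abs(p - m) <= tolerance for m in mentioned)]


def classify_reply(mentioned, proposed):
    """Answer two binary questions in turn and return (label, times)."""
    if not proposed:   # Previous proposed times? No
        return "new_time", list(mentioned)
    matches = match_proposed(mentioned, proposed)
    if not matches:    # Matches to any? No
        return "counter_proposal", list(mentioned)
    return "accept_time", matches
```

If Amy proposed 11:00, 13:00 and 14:00 on a given day and the reply resolves to 13:10, classify_reply returns ("accept_time", [the 13:00 slot]); a reply resolving to 16:00 is treated as a counter-proposal.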

Slide 14

Slide 14 text

Performance
● Recall 85%
● Precision 97%
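For reference, precision and recall over extracted items can be computed as in the sketch below. It assumes that predictions and human annotations are compared as sets of hashable items (for example, resolved time strings); how x.ai actually matched predictions to annotations is not stated on the slides.

```python
def precision_recall(predicted, gold):
    """Precision = TP / |predicted|, recall = TP / |gold|, using exact set matches."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

High precision with lower recall, as reported here, means that the times the system does extract are almost always right, while some annotated times are missed.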

Slide 15

Slide 15 text

How does this compare to state-of-the-art?
● Recall 85%, Precision 97%: x.ai on x.ai dataset using context
→ Ref: Context-dependent Semantic Parsing for Time Expressions, Kenton Lee, Yoav Artzi, Jesse Dodge, and Luke Zettlemoyer

Slide 16

Slide 16 text

Email Classification
● “One vs all” support vector machine for each intent
● Feature reduction through “mutual information” filter
● Optimise kernel params through automatic param survey
Features used:
● N-grams (origin of Amy Ingram’s name…)
● POS tagging
● Syntax rules
● Time-entities, people, locations
● Context !
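The mutual information filter can be illustrated in plain Python. This sketch is not the talk's pipeline: it scores only unigram presence against a single binary intent label, the function names are hypothetical, and the SVM training and kernel parameter survey are not shown.

```python
import math
from collections import Counter


def mutual_information(feature_present, labels):
    """Mutual information (in bits) between a binary feature and a binary label."""
    n = len(labels)
    joint = Counter(zip(feature_present, labels))
    px = Counter(feature_present)
    py = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p(x,y) / (p(x) p(y)) simplifies to count * n / (count_x * count_y)
        mi += p_xy * math.log2(count * n / (px[x] * py[y]))
    return mi


def top_ngrams(docs, labels, k):
    """Keep the k unigrams whose presence is most informative about the label."""
    tokenized = [set(doc.split()) for doc in docs]
    vocab = sorted(set().union(*tokenized)) if tokenized else []
    scored = [
        (mutual_information([tok in toks for toks in tokenized], labels), tok)
        for tok in vocab
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, ties by name
    return [tok for _, tok in scored[:k]]
```

In a one-vs-all setup, the filter would be run once per intent, with labels marking whether each email carries that intent, and only the surviving features would be passed to that intent's classifier.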

Slide 17

Slide 17 text

Email Classification
We are actively working on improving the performance of our classifiers … more data !

Slide 18

Slide 18 text

Future work
Enhance Amy’s calendar analysis:
● Multiple people preferences
● Automatically detect patterns
Meeting location model
● Suggest locations based on preferences
● Enhance travel time understanding
Meeting social network
● Relative negotiator importance of meeting participants
● Relative importance of meetings

Slide 19

Slide 19 text

marcos @ x.ai
chief data scientist and co-founder
48 Wall Street, 5th Floor
New York, NY 10005
E: [email protected]
T: @xdotai