Slide 1

Slide 1 text

funneljoin: Defining a Tidy Grammar of Funnels in R
Emily Robinson, @robinson_es

Slide 2

Slide 2 text

What is a funnel? A funnel is a set of events by users over time

Slide 3

Slide 3 text

E-commerce company redesigning the homepage
➔ Potential questions:
  ➔ What’s the last page people visit before coming to the homepage?
  ➔ What are all the product pages people see after the homepage?
  ➔ How many people who visit the homepage go on to buy something?
  ➔ What if we limit that to purchases within two days?

Slide 4

Slide 4 text

Some other “first this, then that” questions
➔ Which salmon migrated to station 1 then station 3 before station 2?
➔ What drugs did people take in the last month before starting drug X?
➔ What was the last ad clicked before registering?
➔ What companies had their stock hit $100 per share then drop to $40?

Slide 5

Slide 5 text

Example question ➔ When was each user’s first landing and their first registration afterward?

Slide 6

Slide 6 text

Old workflow
➔ When was each user’s first landing and their first registration afterward?
1. Filter landed for the first row per user
2. Left join with registered on user_id
3. Filter for timestamp.y > timestamp.x or NA
4. Filter for first row of timestamp.y
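The four steps above can be sketched in plain Python to show why the manual workflow is fiddly. This is only an illustration of the logic, not the talk's R code: the sample rows are hypothetical, and `timestamp.x` / `timestamp.y` become the second and third fields of each joined row.

```python
from datetime import datetime

# Hypothetical sample data: (user_id, timestamp) rows for each event table.
landed = [
    (1, datetime(2018, 7, 1)),
    (1, datetime(2018, 7, 5)),
    (2, datetime(2018, 7, 2)),
    (3, datetime(2018, 7, 3)),
]
registered = [
    (1, datetime(2018, 6, 30)),
    (1, datetime(2018, 7, 4)),
    (1, datetime(2018, 7, 6)),
    (3, datetime(2018, 7, 1)),
]

# Step 1: first landing per user.
first_landing = {}
for user, ts in sorted(landed, key=lambda r: r[1]):
    first_landing.setdefault(user, ts)

# Step 2: left join on user_id (users with no registration get None).
joined = []
for user, landed_at in first_landing.items():
    regs = [ts for u, ts in registered if u == user]
    for reg_at in regs or [None]:
        joined.append((user, landed_at, reg_at))

# Step 3: keep rows where the registration is later than the landing, or missing.
kept = [row for row in joined if row[2] is None or row[2] > row[1]]

# Step 4: earliest remaining registration per user.
result = {}
for user, landed_at, reg_at in sorted(
    kept, key=lambda r: r[2] if r[2] is not None else datetime.max
):
    result.setdefault(user, (landed_at, reg_at))
```

With this data, user 2 (never registered) is kept with a missing registration, while user 3 (whose only registration precedes the landing) drops out at step 3. Taking the steps literally produces that kind of easy-to-miss asymmetry.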

Slide 7

Slide 7 text

Old workflow
➔ Who registered the first time ever after their last landing?
1. Filter landed for the last row per user
2. Filter registered for first row per user
3. Left join with registered on user_id
4. Filter for timestamp.y > timestamp.x or NA

Slide 8

Slide 8 text

Funneljoin package: github.com/robinsones/funneljoin

Slide 9

Slide 9 text

Goals of this talk: you and funnels
➔ 5 minutes ago
➔ After this talk
➔ Tomorrow

Slide 10

Slide 10 text

Funneljoin Overview

Slide 11

Slide 11 text

after_join() ➔ When was each user’s first landing and their first registration afterward?

Slide 12

Slide 12 text

after_join()
➔ When was each user’s first landing and their first registration afterward?
➔ Table 1
➔ Table 2
➔ User column name(s)
➔ Time column name(s)
➔ Type of after join

Slide 13

Slide 13 text

after_join() structure
➔ When was each user’s first landing and their first registration afterward?
➔ Table 1
➔ Table 2
➔ User column name(s)
➔ Time column name(s)
➔ Type of after join
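To make the five inputs concrete, here is a minimal Python sketch of a function with the same shape: two tables, a user column, a time column, and a type. It is not the package's implementation. The name `after_join_sketch`, the list-of-dicts tables, the inner-join behavior, the strict "later than" comparison, and the subset of supported types are all assumptions made for illustration.

```python
from datetime import date


def after_join_sketch(x, y, by_user, by_time, funnel_type):
    """Pair rows of x with later rows of y for the same user (inner-join style)."""
    x_type, y_type = funnel_type.split("-")
    if x_type not in ("first", "last", "any"):
        raise ValueError(f"unsupported table 1 type in this sketch: {x_type}")
    if y_type not in ("first", "last", "any", "firstafter"):
        raise ValueError(f"unsupported table 2 type in this sketch: {y_type}")

    def pick(rows, how):
        # Reduce a table to each user's first or last row; "any" keeps all rows.
        if how == "any":
            return list(rows)
        best = {}
        for row in sorted(rows, key=lambda r: r[by_time], reverse=(how == "last")):
            best.setdefault(row[by_user], row)
        return list(best.values())

    x_rows = pick(x, x_type)
    # "firstafter" depends on the matched x row, so it is resolved per pair below.
    y_rows = list(y) if y_type == "firstafter" else pick(y, y_type)

    pairs = []
    for xr in x_rows:
        later = sorted(
            (
                yr
                for yr in y_rows
                if yr[by_user] == xr[by_user] and yr[by_time] > xr[by_time]
            ),
            key=lambda r: r[by_time],
        )
        if y_type == "firstafter":
            later = later[:1]
        pairs.extend((xr, yr) for yr in later)
    return pairs


# Hypothetical sample tables.
landed = [
    {"user_id": 1, "timestamp": date(2018, 7, 1)},
    {"user_id": 1, "timestamp": date(2018, 7, 5)},
    {"user_id": 2, "timestamp": date(2018, 7, 2)},
]
registered = [
    {"user_id": 1, "timestamp": date(2018, 7, 4)},
    {"user_id": 1, "timestamp": date(2018, 7, 6)},
]

pairs = after_join_sketch(landed, registered, "user_id", "timestamp", "first-firstafter")
```

Changing only the last argument (for example to `"last-any"`) answers a different funnel question without rewriting the filtering and joining steps.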

Slide 14

Slide 14 text

Different funnels
[Timeline figure: ad clicks and conversions over time]

Slide 15

Slide 15 text

First ad click and first conversion afterwards?
[Timeline figure: ad clicks and conversions over time]

Slide 16

Slide 16 text

First ad click and first conversion afterwards?
[Timeline figure: ad clicks and conversions over time]

Slide 17

Slide 17 text

First ad click and first conversion afterwards?
[Timeline figure: ad clicks and conversions over time]

Slide 18

Slide 18 text

First ad click and first conversion afterwards?
[Timeline figure: ad clicks and conversions over time; funnel type: first-firstafter]

Slide 19

Slide 19 text

Most recent ad click and all conversions afterward?
[Timeline figure: ad clicks and conversions over time]

Slide 20

Slide 20 text

Most recent ad click and all conversions afterward?
[Timeline figure: ad clicks and conversions over time]

Slide 21

Slide 21 text

Most recent ad click and all conversions afterward?
[Timeline figure: ad clicks and conversions over time]

Slide 22

Slide 22 text

Most recent ad click and all conversions afterward?
[Timeline figure: ad clicks and conversions over time; funnel type: last-any]

Slide 23

Slide 23 text

All ad clicks and all the conversions afterward
[Timeline figure: ad clicks and conversions over time]

Slide 24

Slide 24 text

All ad clicks and all the conversions afterward
[Timeline figure: ad clicks and conversions over time]

Slide 25

Slide 25 text

All ad clicks and all the conversions afterward
[Timeline figure: ad clicks and conversions over time; funnel type: any-any]

Slide 26

Slide 26 text

Different funnels
[Timeline figure: ad clicks and conversions over time, with the first-firstafter, any-any, and last-any funnels marked]
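The three funnel types can be compared on one toy timeline for a single user. The sketch below is plain Python with hypothetical integer time stamps, and it assumes a conversion must come strictly after the click to count.

```python
# Hypothetical event times for one user (any increasing time unit).
clicks = [1, 4, 7]
conversions = [3, 5, 8, 9]

# first-firstafter: the first click, paired with the first conversion after it.
first_click = min(clicks)
after_first = sorted(c for c in conversions if c > first_click)
first_firstafter = [(first_click, after_first[0])] if after_first else []

# last-any: the most recent click, paired with every conversion after it.
last_click = max(clicks)
last_any = [(last_click, c) for c in sorted(conversions) if c > last_click]

# any-any: every click, paired with every conversion after it.
any_any = [(k, c) for k in sorted(clicks) for c in sorted(conversions) if c > k]
```

The same events yield one pair, two pairs, and nine pairs respectively, which is why the funnel type has to be stated explicitly.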

Slide 27

Slide 27 text

16 types of funnels
Any combination of:

Table 1 type    Table 2 type
First (ever)    First (ever)
Last (ever)     Last (ever)
Any (all)       Any (all)
Lastbefore      Firstafter
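The count of 16 is simply four table 1 options times four table 2 options. A short Python enumeration makes that explicit. The lowercase, hyphen-joined labels follow the pattern seen on the earlier slides (first-firstafter, last-any, any-any); whether the package accepts every string spelled exactly this way is an assumption.

```python
from itertools import product

table1_types = ["first", "last", "any", "lastbefore"]
table2_types = ["first", "last", "any", "firstafter"]

# Every pairing of a table 1 type with a table 2 type.
funnel_types = [f"{a}-{b}" for a, b in product(table1_types, table2_types)]
```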

Slide 28

Slide 28 text

Demo

Slide 29

Slide 29 text

Analyzing Stack Overflow R questions
https://www.kaggle.com/stackoverflow/rquestions/downloads/rquestions.zip/3

Slide 30

Slide 30 text

How many people who ask a question later answer one?

Slide 31

Slide 31 text

How long does it take for people to answer their first question?

Slide 32

Slide 32 text

How long does it take for people to answer their first question?

Slide 33

Slide 33 text

What percent answer within a week of asking their first question?
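Once each user's first question is paired with their first answer afterward, the "within a week" question reduces to a gap filter. The sketch below is plain Python on hypothetical rows, not the demo's Stack Overflow data. It assumes the percentage is taken over all users who asked a question and that a gap of exactly seven days counts.

```python
from datetime import datetime, timedelta

# Hypothetical funnel rows: (user, first question, first answer afterward or None).
funnel = [
    ("a", datetime(2018, 1, 1), datetime(2018, 1, 3)),
    ("b", datetime(2018, 1, 1), datetime(2018, 2, 1)),
    ("c", datetime(2018, 1, 5), None),
    ("d", datetime(2018, 1, 10), datetime(2018, 1, 17)),
]

one_week = timedelta(days=7)

# Users whose first answer came no more than seven days after their first question.
within_week = [
    user
    for user, asked_at, answered_at in funnel
    if answered_at is not None and answered_at - asked_at <= one_week
]

percent_within_week = 100 * len(within_week) / len(funnel)
```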

Slide 34

Slide 34 text

Who answers a question before asking one?

Slide 35

Slide 35 text

Conclusion

Slide 36

Slide 36 text

Funneljoin goals: from “impossible” to possible

Slide 37

Slide 37 text

Funneljoin goals: from time-consuming & error-prone to quick & easy

Slide 38

Slide 38 text

Funneljoin goals: from limited creativity to asking & answering new questions

Slide 39

Slide 39 text

Learn more
https://robinsones.github.io/funneljoin/
https://hookedondata.org/introducing-the-funneljoin-package/

Slide 40

Slide 40 text

Thank you!
hookedondata.org
@robinson_es
github.com/robinsones/funneljoin
Datascicareer.com