
Fitting-Humans-Stories-in-List-Columns_eRum


eRum 2018 Talk


OmaymaS

May 16, 2018

Transcript

  1. FITTING HUMANS STORIES IN LIST COLUMNS: Cases from an Online Recruitment Platform. Omayma Said, @OmaymaS
  2. The Leading Job Site in EGYPT

  3. 19th Century Adolphe Quetelet

  4. THE AVERAGE MAN (L’homme Moyen) 19th Century Adolphe Quetelet

  5. THE AVERAGE MAN. Physical: Weight, Height (Body Mass Index)

  6. THE AVERAGE MAN. Social: Marriage

  7. THE AVERAGE MAN. Moral: Crimes

  8. For Quetelet: THE AVERAGE MAN = PERFECTION

  9. “If an individual at any given epoch of society possessed all the qualities of the AVERAGE MAN, he would represent all that is great, good, or beautiful.” Adolphe Quetelet
  10. Who Is The “AVERAGE MAN” in Your Society?

  11. Are You Just a Deviant from The “AVERAGE MAN”?

  12. Many Disagree!

  13. Now...

  14. Now... Tremendous Growth of Data

  15. Misuse of SUMMARY STATISTICS

  16. Misuse of SUMMARY STATISTICS

  17. None
  18. Misuse of SUMMARY STATISTICS

  19. None
  20. None
  21. None
  22. The Leading Job Site in EGYPT

  23. None
  24. What Do We Optimize For? Matching Jobs & Job Seekers: Quality, Quantity, Relevance
  25. Let’s talk about DATA, KPIs, METRICS

  26. Me: “The average job seeker applies for N jobs per month”
  27. Me: “The average number of applications per job this month is GREAT”
  28. What AVERAGE Do You Measure?

  29. Who is The AVERAGE Job Seeker?
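
A minimal sketch of why a single average can mislead, using made-up application counts (the numbers are purely illustrative and not taken from the talk). One very active user is enough to pull the mean far away from what most job seekers actually do:

```r
# Hypothetical monthly application counts for ten job seekers (made-up numbers).
applications <- c(1, 1, 2, 2, 2, 3, 3, 4, 5, 77)

mean_apps   <- mean(applications)    # pulled up by the single heavy user
median_apps <- median(applications)  # closer to what a typical user does

# Share of job seekers who apply less often than "the average job seeker".
share_below_mean <- mean(applications < mean_apps)

print(c(mean = mean_apps, median = median_apps, share_below_mean = share_below_mean))
```

In this toy sample, nine of the ten job seekers fall below the mean, so "the average job seeker" describes almost nobody.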

  30. Can We Tell Better STORIES About Our Users?

  31. None
  32. We can tell better stories with… Effective Data Analysis + Contextual Understanding
  33. Effective Data Analysis + Contextual Understanding (Culture, Socioeconomic Status, Market Dynamics)
  34. Effective Data Analysis (Mindset, Workflow, Framework/Tools) + Contextual Understanding
  35. Effective Data Analysis (Mindset, Workflow, Framework/Tools) + Contextual Understanding (Culture, Socioeconomic Status, Market Dynamics)
  36. Effective Data Analysis + Contextual Understanding = Better Stories
  37. Effective Data Analysis + Contextual Understanding = Actionable Insights
  38. Framework/Tools: Tidyverse (https://speakerdeck.com/hadley/tidyverse) + Compatible Packages

  39. The Tidyverse. Let’s focus on: Main Concepts

  40. Three Main Concepts: Tidy Data (by: @_inundata & @jcheng)

  41. Three Main Concepts: Tidy Data [tibble, tidyr, dplyr, and friends]. A variable in a column, an observation in a row. Tidy your data, and here you go!
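
A small sketch of the "variable in a column, observation in a row" idea. The wide input table and its numbers are hypothetical, and `pivot_longer()` assumes tidyr 1.0.0 or later (the talk predates that release, so the original analysis would have used other verbs):

```r
library(dplyr)
library(tibble)
library(tidyr)

# Hypothetical untidy input: one column per month (made-up numbers).
wide <- tribble(
  ~user,  ~`2017-01`, ~`2017-02`, ~`2017-03`,
  "Sara", 1,          0,          2,
  "Omar", 2,          0,          0
)

# Tidy form: each variable (user, month, n_applications) in its own column,
# each user-month observation in its own row.
tidy <- wide %>%
  pivot_longer(cols = -user, names_to = "month", values_to = "n_applications")

print(tidy)
```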
  42. Data comes from different SOURCES And more...

  43. Data comes in different FORMATS And more...

  44. Data comes in different FORMATS: Read, Tidy, DATAFRAME (TIBBLE)

  45. Tidy Data

      user  job_id  job_title                 company    application_date
      Sara  A1234   Software Developer        Company A  2017-01-02
      Sara  A1568   Senior Software Engineer  Company B  2017-03-02
      Sara  A1590   Software Engineer         Company C  2017-03-03
      …     …       …                         …          …
      Omar  A1234   Software Developer        Company A  2017-01-03
      Omar  A1580   Android Developer         Company C  2017-01-20
      …     …       …                         …          …
  46. Three Main Concepts: Nested Data

  47. Three Main Concepts: Nested Data [tidyr]. One row per group instead of one row per observation.
  48. Nested Data

      user  job_id  job_title                 company    application_date
      Sara  A1234   Software Developer        Company A  2017-01-02
      Sara  A1568   Senior Software Engineer  Company B  2017-03-02
      Sara  A1590   Software Engineer         Company C  2017-03-03
      …     …       …                         …          …
      Omar  A1234   Software Developer        Company A  2017-01-03
      Omar  A1580   Android Developer         Company C  2017-01-20
      …     …       …                         …          …

      user_data %>% group_by(user) %>% nest(.key = "applications")

      user  applications
      Sara  <tibble [3 x 4]>
      Omar  <tibble [2 x 4]>
      …     …
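
The nesting step can be reproduced on the five sample rows shown on the slide. This is a sketch, not the author's exact code: it assumes tidyr 1.0.0 or later, where the list column is named directly in the `nest()` call, and it stores `application_date` as plain text for brevity:

```r
library(dplyr)
library(tibble)
library(tidyr)

# The sample rows from the slide, typed in by hand.
user_data <- tribble(
  ~user,  ~job_id, ~job_title,                 ~company,    ~application_date,
  "Sara", "A1234", "Software Developer",       "Company A", "2017-01-02",
  "Sara", "A1568", "Senior Software Engineer", "Company B", "2017-03-02",
  "Sara", "A1590", "Software Engineer",        "Company C", "2017-03-03",
  "Omar", "A1234", "Software Developer",       "Company A", "2017-01-03",
  "Omar", "A1580", "Android Developer",        "Company C", "2017-01-20"
)

# One row per user; all other columns are packed into a list column of tibbles.
nested <- user_data %>%
  nest(applications = c(job_id, job_title, company, application_date))

print(nested)

# unnest() reverses the operation and restores one row per application.
restored <- nested %>% unnest(applications)
```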
  49. Nested Data

      user  job_id  job_title                 company    application_date
      Sara  A1234   Software Developer        Company A  2017-01-02
      Sara  A1568   Senior Software Engineer  Company B  2017-03-02
      Sara  A1590   Software Engineer         Company C  2017-03-03
      …     …       …                         …          …
      Omar  A1234   Software Developer        Company A  2017-01-03
      Omar  A1580   Android Developer         Company C  2017-01-20
      …     …       …                         …          …

      job_data %>% group_by(job_id) %>% nest(.key = "applications")

      job_id  applications
      A1234   <tibble [2 x 4]>
      A1568   <tibble [30 x 4]>
      A1590   <tibble [100 x 4]>
      A1580   <tibble [120 x 4]>
  50. Three Main Concepts: Functional Programming

  51. Three Main Concepts: Functional Programming [purrr]. Handle iteration problems powerfully and emphasize the actions rather than the objects.
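
A brief sketch of what "emphasize the actions" can look like with purrr: instead of writing a loop over jobs, the action (`nrow`, or a small formula) is mapped over a list column. The three-row input is hypothetical, and `applicants` is an illustrative column that does not appear in the talk:

```r
library(dplyr)
library(purrr)
library(tibble)
library(tidyr)

# Hypothetical applications, nested into one row per job.
job_data <- tribble(
  ~job_id, ~user,
  "A1234", "Sara",
  "A1234", "Omar",
  "A1580", "Omar"
) %>%
  nest(applications = user)

# map_int()/map_chr() apply one action to every nested tibble and
# return a plain integer/character vector, one element per job.
job_data <- job_data %>%
  mutate(
    app_count  = map_int(applications, nrow),
    applicants = map_chr(applications, ~ paste(.x$user, collapse = ", "))
  )

print(job_data)
```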
  52. Let’s store models in columns

      job_id  applications         app_count
      A5638   <tibble [362 x 27]>  362
      A8957   <tibble [110 x 27]>  110
      …       …                    …

      job_app_data <- job_app_data %>%
        mutate(glm_model = map(applications, ~ glm(viewed ~ app_day, data = .x, family = binomial)))
  53. Let’s store models in columns

      job_app_data <- job_app_data %>%
        mutate(glm_model = map(applications, ~ glm(viewed ~ app_day, data = .x, family = binomial)))

      job_id  applications         app_count  glm_model
      A5638   <tibble [362 x 27]>  362        <S3: glm>
      A8957   <tibble [110 x 27]>  110        <S3: glm>
      …       …                    …          …
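
A self-contained sketch of the models-in-columns pattern on simulated data. The simulated `app_day` and `viewed` values, the simulation parameters, and the extra `app_day_coef` column are assumptions made for illustration; only the `glm()` call mirrors the slide:

```r
library(dplyr)
library(purrr)
library(tibble)
library(tidyr)

set.seed(42)

# Simulated applications for two jobs (made-up data, not platform data):
# the chance that an application is viewed drops with the day it arrives.
app_data <- tibble(
  job_id  = rep(c("A5638", "A8957"), each = 100),
  app_day = sample(1:30, 200, replace = TRUE),
  viewed  = rbinom(200, size = 1, prob = plogis(1 - 0.1 * app_day))
)

job_app_data <- app_data %>%
  nest(applications = c(app_day, viewed)) %>%
  mutate(
    app_count    = map_int(applications, nrow),
    # One logistic regression per job, stored next to the data it was fit on.
    glm_model    = map(applications, ~ glm(viewed ~ app_day, data = .x, family = binomial)),
    # Pull a single number back out of each stored model.
    app_day_coef = map_dbl(glm_model, ~ coef(.x)[["app_day"]])
  )

print(job_app_data)
```

Keeping each model in the same row as its job means later steps (coefficients, predictions, diagnostics) stay aligned with the right job without any manual bookkeeping.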
  54. Iterate and answer more questions

      user  applications       preferences
      Sara  <tibble [2 x 10]>  <tibble [4 x 10]>
      Omar  <tibble [2 x 15]>  <tibble [2 x 10]>
      …     …                  …

      user_data <- user_data %>%
        mutate(common_jobs = map2(applications, preferences,
                                  ~ intersect(.x[["job_title"]], .y[["job_title"]])))
  55. Iterate and answer more questions

      user_data <- user_data %>%
        mutate(common_jobs = map2(applications, preferences,
                                  ~ intersect(.x[["job_title"]], .y[["job_title"]])))

      user  applications       preferences        common_jobs
      Sara  <tibble [2 x 10]>  <tibble [4 x 10]>  <chr [2]>
      Omar  <tibble [2 x 15]>  <tibble [2 x 10]>  <chr [0]>
      …     …                  …                  …
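
A runnable sketch of the `map2()` step. The job titles are invented so that the result matches the shape on the slide (two common titles for Sara, none for Omar), the nested tibbles are reduced to a single `job_title` column, and `n_common` is an illustrative extra column:

```r
library(dplyr)
library(purrr)
library(tibble)

# Two list columns per user: jobs applied to, and stated preferences.
user_data <- tibble(
  user = c("Sara", "Omar"),
  applications = list(
    tibble(job_title = c("Software Developer", "Software Engineer")),
    tibble(job_title = c("Software Developer", "Android Developer"))
  ),
  preferences = list(
    tibble(job_title = c("Software Engineer", "Software Developer",
                         "Data Analyst", "Team Lead")),
    tibble(job_title = c("iOS Developer", "Product Manager"))
  )
)

# map2() walks both list columns in parallel, one user at a time.
user_data <- user_data %>%
  mutate(
    common_jobs = map2(applications, preferences,
                       ~ intersect(.x[["job_title"]], .y[["job_title"]])),
    n_common    = map_int(common_jobs, length)
  )

print(user_data)
```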
  56. Let’s Look Closer!

  57. Problem: Shortage in applications for certain Software Development jobs. Overall growth and good KPIs
  58. Problem: Shortage in applications for certain Software Development jobs. Dissatisfied Employers
  59. Problem: Shortage in applications for certain Software Development jobs. Flagged by different sources
  60. Problem: Shortage in applications for certain Software Development jobs. Masked by high-level metrics
  61. None
  62. Hypotheses: Talent Shortage. What if we just have a small pool of job seekers who are interested in the affected jobs?
  63. Hypotheses: Irrelevant Jobs. Maybe employers are not catching up with the global trends or job seekers’ aspirations!
  64. Hypotheses: Hidden Jobs. What if some jobs do not get enough exposure in the search/recommendation pages?
  65. 1st Investigation: The Job’s Side

  66. The Job’s Side: What about applications details per job?

  67. The Job’s Side: Job applications details

  68. The Job’s Side: What about iOS job applications?

  69. iOS Developers Jobs: Job Applications Growth over time

  70. iOS Developers Jobs: What happens to job posts on day X? Day 7
  71. iOS Developers Jobs: What is special about these jobs? Mobile Developer (iOS, Android)
  72. iOS Developers Jobs: What about the rest?

  73. More with Shiny... *Sample of Wuzzuf Job Posts

  74. 2nd Investigation: The Job Seeker’s Side

  75. The Job Seeker’s Side: How do job seekers fill their profiles? [tidytext]
  76. The Job Seeker’s Side: How do job seekers fill their profiles? Details of job seeker’s keywords
  77. The Job Seeker’s Side: What about the repetition in the extracted keywords?
  78. The Job Seeker’s Side: What about the repetition in the extracted keywords? Summaries from job seeker’s keywords
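
The transcript does not show how the keywords were extracted or summarised, so the following is only a guess at the general shape of such a step with tidytext. The `profiles` table, its `keywords` column, and the text in it are all hypothetical:

```r
library(dplyr)
library(tibble)
library(tidytext)

# Hypothetical free-text keyword fields from two profiles (made-up text).
profiles <- tribble(
  ~user,  ~keywords,
  "Sara", "ios swift ios developer ios",
  "Omar", "android java android"
)

# unnest_tokens() splits the text into one lowercase word per row;
# count() then shows how often each user repeats the same keyword.
keyword_counts <- profiles %>%
  unnest_tokens(output = keyword, input = keywords) %>%
  count(user, keyword, sort = TRUE)

print(keyword_counts)
```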
  79. The Job Seeker’s Side: Which jobs match each user’s profile? [solrium]
  80. The Job Seeker’s Side: Which jobs match each user’s profile?

  81. The Job Seeker’s Side: Which jobs match each user’s profile? Recommended Jobs Details
  82. What ACTIONS Did This Analysis Trigger?

  83. Talent Shortage: Recommended Actions
      - Acquire more senior developers
      - Activate the existing developers
      - Support the community
  84. Irrelevant Jobs: Recommended Actions
      - Advise employers about the market
      - Revisit preference-based matching
  85. Hidden Jobs: Recommended Actions
      - Revisit text fields indexing
      - Tune field weights for scoring
      - Improve mail recommendation
  86. Main Concepts: Tidy Data, Nested Data, Functional Programming. Effective Data Analysis + Contextual Understanding = Actionable Insights. @OmaymaS
  87. FITTING HUMANS STORIES IN LIST COLUMNS: Cases from an Online Recruitment Platform. Omayma Said, @OmaymaS