
Insight Demo

Ali Rohani

June 22, 2017

Transcript

  1. Therein lies the problem

    Successful applicant – job post matching
    Employment Agency •  Higher net income •  Increased popularity •  Higher profile
  2. Therein lies the problem

    Successful applicant – job post matching
    Employment Agency •  Higher net income •  Increased popularity •  Higher profile
    Consulting Project: A method to rate applicants for different job posts
  3. Question

    Does the applicant have a good chance for the job application?
    Assumption: Applicants with the same profile have an equal chance on the same job application
  4. The need for a model

    A model to rate the applicants for the job posts
    Requirements: •  Scale as the company grows over time •  Capture the independence of each job post
  5. The need for a model

    A model to rate the applicants for the job posts
    Requirements: •  Scale as the company grows over time •  Capture the independence of each job post
    •  Logistic Regression •  Proprietary Rating model •  Naïve Bayes
  7. Understanding the data before modeling

    250k+ tags: 2-5 years experience, MS office, BA degree, …
    Study the data •  All categorical data •  2000 distinct features
  8. Preparing the data for modeling

    250k+ tags: 2-5 years experience, MS office, BA degree, …
    Study the data •  All categorical data •  2000 distinct features
    Cleaning the data •  Feature Engineering •  Creating applicants’ tag vectors •  Clean up redundant tags
    Reducing feature space: 20X reduction in feature space size
  9. Building the model

    250k+ tags: 2-5 years experience, MS office, BA degree, …
    Study the data •  All categorical data •  2000 distinct features
    Cleaning the data •  Feature Engineering •  Creating applicants’ tag vectors •  Clean up redundant tags
    Reducing feature space: 20X reduction in feature space size
    Compute similarity matrices (cosine similarity) •  User-User •  Job-Job
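The tag-vector and similarity-matrix steps on this slide could be sketched roughly as follows. This is only an illustration: the tag vocabulary, the applicant IDs, and the helper names `tag_matrix` and `cosine_similarity_matrix` are hypothetical, and the deck does not show the actual implementation. The same function would apply to job posts to obtain a Job-Job matrix.

```python
import numpy as np

# Hypothetical vocabulary and applicants; the deck's real data is not shown.
TAGS = ["2-5 years experience", "MS office", "BA degree", "SQL"]
applicant_tags = {
    "a1": {"2-5 years experience", "MS office"},
    "a2": {"2-5 years experience", "MS office", "BA degree"},
    "a3": {"SQL"},
}

def tag_matrix(tag_sets, vocabulary):
    """Build a binary (rows x tags) matrix: 1 where the row carries the tag."""
    index = {tag: j for j, tag in enumerate(vocabulary)}
    matrix = np.zeros((len(tag_sets), len(vocabulary)))
    for i, tags in enumerate(tag_sets):
        for tag in tags:
            matrix[i, index[tag]] = 1.0
    return matrix

def cosine_similarity_matrix(matrix):
    """Pairwise cosine similarity between the rows of `matrix`."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # rows with no tags get similarity 0, not NaN
    unit = matrix / norms
    return unit @ unit.T

names = sorted(applicant_tags)
X = tag_matrix([applicant_tags[n] for n in names], TAGS)
user_user = cosine_similarity_matrix(X)  # User-User similarity matrix
```

With binary tag vectors, the cosine similarity of two applicants is the number of shared tags divided by the geometric mean of their tag counts, so applicants with no tags in common score 0.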
  10. Building the model

    250k+ tags: 2-5 years experience, MS office, BA degree, …
    Study the data •  All categorical data •  2000 distinct features
    Cleaning the data •  Feature Engineering •  Creating applicants’ tag vectors •  Clean up redundant tags
    Reducing feature space: 20X reduction in feature space size
    Compute similarity matrices (cosine similarity) •  User-User •  Job-Job
    Measuring the chances of success or failure (similarity weighted average) •  Previous application history on the job •  Similarity of the applicants
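One plausible reading of the "similarity weighted average" step is sketched below. The exact formula is not given in the deck, so this is an assumption: past applicants to a job are weighted by their similarity to the new applicant, and their outcomes (1 for success, 0 for failure) are averaged. The function name `success_score` and the sample numbers are hypothetical.

```python
import numpy as np

def success_score(similarities, outcomes):
    """Similarity-weighted average of past outcomes for one job post.

    similarities: similarity of the new applicant to each past applicant
    outcomes:     1 if that past application succeeded, 0 otherwise
    Returns None when there is no similar past applicant to learn from.
    """
    sims = np.asarray(similarities, dtype=float)
    outs = np.asarray(outcomes, dtype=float)
    total = sims.sum()
    if total == 0:
        return None
    return float(sims @ outs / total)

# Hypothetical history: two similar applicants succeeded, a dissimilar one failed.
score = success_score([0.9, 0.5, 0.1], [1, 1, 0])
```

Because each job post is scored only from its own application history, a sketch like this keeps job posts independent of one another, which matches the requirement stated earlier in the deck.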
  11. Validating the model

    Are the predictions acceptable?
    Need-based quality metrics •  Accuracy •  Precision •  Recall
    Method: Multiple holdouts
  13. Validating the model

    Are the predictions acceptable?
    Need-based quality metrics •  Accuracy •  Precision •  Recall
    Method: Multiple holdouts
    Company’s success rate: 38%
    Model precision: 76%
  14. Validating the model

    Are the predictions acceptable?
    Need-based quality metrics •  Accuracy •  Precision •  Recall
    Method: Multiple holdouts
    Company’s success rate: 38%
    Model precision: 76%
    2X Enhancement
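The validation metrics and the "multiple holdouts" method could be sketched along these lines. The helper names, the split sizes, and the random seed are assumptions for illustration; only the 38% and 76% figures come from the slide.

```python
import random

def precision_recall_accuracy(y_true, y_pred):
    """Compute precision, recall and accuracy for binary labels (1 = success)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return precision, recall, accuracy

def holdout_splits(n_items, n_splits, test_fraction, seed=0):
    """Yield (train_indices, test_indices) for several random holdouts."""
    rng = random.Random(seed)
    n_test = max(1, int(n_items * test_fraction))
    for _ in range(n_splits):
        shuffled = list(range(n_items))
        rng.shuffle(shuffled)
        yield shuffled[n_test:], shuffled[:n_test]

# Figures quoted on the slide: precision relative to the baseline success rate.
baseline_success_rate = 0.38
model_precision = 0.76
lift = model_precision / baseline_success_rate  # the "2X Enhancement"
```

In use, the model would be fit on each training split and scored on the matching test split, with the metrics averaged across holdouts so that a single lucky split does not dominate the reported numbers.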
  15. Deliverables and Influence

    Recommend job posts with a high chance of success
    Rate & rank the applicants for the job post
  16. Deliverables and Influence

    Recommend job posts with a high chance of success
    Rate & rank the applicants for the job post
    $1 million more revenue in the NYC office alone