
Detecting Fraudulent Skype Users via Machine Learning


This is a short presentation I gave about fraud detection on Skype. It is based on an excellent paper by a team from Microsoft Research (not my own research!).

Research Paper: http://research.microsoft.com/pubs/205472/aisec10-leontjeva.pdf

My blog post: http://www.dataschool.io/detecting-fraudulent-skype-users-via-machine-learning/


Kevin Markham

March 13, 2014


  1. Detecting Fraudulent Skype Users via Machine Learning
     Presentation by Kevin Markham, March 17, 2014
     Based on the research paper: “Early Security Classification of Skype Users via Machine Learning”
     http://research.microsoft.com/pubs/205472/aisec10-leontjeva.pdf
     Paper and figures are copyright 2013 ACM
  2. What is Skype?
     • Tool for:
       – Voice-over-IP calls
       – Webcam videos
       – Instant messaging
     • Released in 2003; Microsoft bought it in 2011
     • At least 250 million monthly users
  3. Fraud on Skype
     • Credit card fraud
     • Online payment fraud
     • Spam instant messages
     • etc.
  4. Detecting Fraud on Skype
     Skype already employs techniques for detecting fraud:
     • “Majority of fraudulent users are detected within one day”
     Some challenges in fraud detection:
     • Legitimate accounts get hijacked and don’t necessarily “look” fraudulent
     • Sparse data
  5. Improving Fraud Detection
     Why is it worth improving?
     • Manual fraud detection is very expensive
     Who wrote this paper?
     • A team from Microsoft Research
     What was their goal?
     • “Detect stealthy fraudulent users” that fool Skype’s existing defenses for a long period of time
  6. Classification
     • Classification problem: predicting whether a user is fraudulent (yes or no)
     • Data consists of features (also called “variables” or “predictors”) and a response
     • Contrasts with a regression problem: predicting a continuous response, like a stock price
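As a minimal illustration of the distinction on this slide (not from the paper — the feature columns and values are invented for this example):

```python
# Classification: the response is a category (fraudulent: yes or no).
# Hypothetical feature columns: [account age in days, chat days last month,
# number of contacts] — chosen for illustration only.
features = [
    [700, 20, 150],  # long-lived, moderately active account
    [3, 28, 2],      # brand-new account sending many messages
]
labels = [0, 1]      # 0 = legitimate, 1 = fraudulent

# Regression, by contrast: the response is a continuous number.
stock_prices = [512.3, 515.8]
```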
  7. Data Used in the Study
     • Anonymized snapshot provided by Skype
     • “Does not contain information about individual calls and their contents”
  8. Classification Workflow

  9. Feature Type 1: Profile Information
     • Gender
     • Age
     • Country
     • OS platform
     • etc.
  10. Feature Type 2: Skype Product Usage
     • Activity logs:
       – Connected days
       – Audio call days
       – Video call days
       – Chat days
     • Data is not “rich”:
       – Only indicates the number of days per month that the user performed that activity
       – Does not distinguish which pair of users communicated, the number of calls per day, etc.
  11. Feature Type 3: Local Social Activity
     • Activity logs (graph data):
       – Adding a user
       – Being added by a user
       – Deleting a user
       – Being deleted by a user
     • Number of connections in their contact list
     • Acceptance rate of outbound friend requests
  12. Feature Type 4: Global Social Activity
     • “PageRank” and “local clustering coefficient” computed for each user
  13. Classification Workflow
     • Pre-processing is unnecessary for profile information, but necessary for the other feature types
  14. Pre-processing Activity Logs
     • Why?
       – Activity logs are time series data
       – It doesn’t make sense to use every data point as a feature
       – It makes more sense to “compress” the data into a single number
     • How?
       – For a given feature (e.g., audio calls), build a model of what “normal” user activity looks like and another model of what fraudulent activity looks like
       – Score each user based upon which model they are closer to
       – This is called computing “log-likelihood ratios”
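A toy sketch of the compression step above. The paper's actual per-class models are not described on this slide; here, purely as a stand-in assumption, each class's monthly activity counts are modeled as Poisson distributions with hypothetical rates (20 active days/month for normal users, 3 for fraudsters):

```python
import math

def poisson_logpmf(k, lam):
    # log P(k | lambda) for a Poisson distribution
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def log_likelihood_ratio(activity, lam_normal, lam_fraud):
    """Compress a user's time series of monthly activity counts into one
    number: positive means the fraud model fits better, negative means
    the normal model fits better."""
    return sum(poisson_logpmf(k, lam_fraud) - poisson_logpmf(k, lam_normal)
               for k in activity)

# Hypothetical users: audio-call days over three months
normal_user = [18, 22, 19]
fraud_user = [2, 4, 1]
print(log_likelihood_ratio(normal_user, 20.0, 3.0))  # negative
print(log_likelihood_ratio(fraud_user, 20.0, 3.0))   # positive
```

The single ratio per activity type then becomes one feature for the classifier, replacing the raw time series.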
  15. Computing Global Social Scores
     • PageRank:
       – Invented by Google
       – Gives users a high score if they have many connections, especially connections from other high-scoring users
     • Local clustering coefficient:
       – Measure of how well your connections are connected to one another
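A minimal sketch of both scores on a tiny hypothetical contact graph (not the paper's implementation; standard textbook definitions, power-iteration PageRank with damping 0.85):

```python
def pagerank(graph, damping=0.85, iters=50):
    # graph: node -> set of neighbors; every node here has at least one link
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    for _ in range(iters):
        new = {v: (1 - damping) / n for v in graph}
        for v, nbrs in graph.items():
            share = damping * rank[v] / len(nbrs)  # split rank among neighbors
            for u in nbrs:
                new[u] += share
        rank = new
    return rank

def clustering_coefficient(graph, v):
    # fraction of a user's contact pairs that are also contacts of each other
    nbrs = list(graph[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in graph[nbrs[i]])
    return 2.0 * links / (k * (k - 1))

contacts = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
print(pagerank(contacts))                     # "a" scores highest: it is the hub
print(clustering_coefficient(contacts, "a"))  # 1 of 3 neighbor pairs linked
```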
  16. Classification Workflow

  17. Choosing a Classifier
     • Trained several classifiers: Random Forests, support vector machines, logistic regression
     • Estimated prediction accuracy using cross-validation
     • Chose Random Forests because it had the best initial performance
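A sketch of this model-selection step using scikit-learn on synthetic data (the real Skype dataset is private; `make_classification` stands in for it, and the hyperparameters shown are illustrative, not the paper's):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for the (private) Skype feature matrix and labels
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

candidates = {
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "support vector machine": SVC(),
    "logistic regression": LogisticRegression(max_iter=1000),
}
for name, clf in candidates.items():
    scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross-validation
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```

Cross-validation gives each candidate an accuracy estimate on held-out folds, so the comparison is not biased by training-set fit.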
  18. Rating Model Accuracy
     • ROC curve: plots the “true positive” rate vs. the “false positive” rate
     • An ideal classifier hugs the top left corner
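A bare-bones sketch of how ROC points are computed: sweep a decision threshold over the classifier's scores and record the (false positive rate, true positive rate) pair at each cut-off. This toy version ignores tied scores; the scores and labels are invented:

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points as the decision threshold is lowered.
    labels: 1 = fraudulent, 0 = legitimate. Assumes distinct scores."""
    pairs = sorted(zip(scores, labels), reverse=True)  # most suspicious first
    p = sum(labels)
    n = len(labels) - p
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _score, label in pairs:
        if label:
            tp += 1
        else:
            fp += 1
        points.append((fp / n, tp / p))
    return points

# Hypothetical fraud scores for four users, two of them actually fraudulent
print(roc_points([0.9, 0.8, 0.4, 0.3], [1, 1, 0, 0]))
```

A curve that reaches a high TPR while FPR is still near zero "hugs the top left corner"; the example above is a perfect separation, so its points climb to TPR 1.0 before any false positive appears.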
  19. Rating Model Accuracy (cont’d)
     • Best result is obtained by using all four feature types
     • At a false positive rate of 5%, the true positive rate was 68%
     • The acceptable false positive rate is a business decision
  20. Projected Model Effects

  21. Performance on Different Fraud Types
     • Fraud types are defined by Skype but are not public
     • Type II is the most common, and the classifier works best on that type
  22. Possible Model Improvements
     • Optimize separate models for each fraud type
     • Attempt to detect points in time when accounts are hijacked
     • Prevent fraudsters from evading the model
  23. Other Possible Applications
     • Predicting credit card fraud
     • Predicting failure in data center disks
     • Any environment in which user behavior can be monitored and fraudulent behavior “looks different” from normal behavior
  24. Thank You!
     Research paper: http://research.microsoft.com/pubs/205472/aisec10-leontjeva.pdf
     My blog post: http://www.dataschool.io/detecting-fraudulent-skype-users-via-machine-learning/