Detecting Fraudulent Skype Users via Machine Learning

This is a short presentation I gave about fraud detection on Skype. It is based on an excellent paper by a team from Microsoft Research (not my own research!)

Research Paper: http://research.microsoft.com/pubs/205472/aisec10-leontjeva.pdf

My blog post: http://www.dataschool.io/detecting-fraudulent-skype-users-via-machine-learning/


Kevin Markham

March 13, 2014


  1. Detecting Fraudulent Skype Users via Machine Learning
     Presentation by Kevin Markham, March 17, 2014
     Based on the Research Paper: “Early Security Classification of Skype Users via Machine Learning”
     http://research.microsoft.com/pubs/205472/aisec10-leontjeva.pdf
     Paper and figures are copyright 2013 ACM
  2. What is Skype?
     • Tool for:
       – Voice-over-IP calls
       – Webcam videos
       – Instant messaging
     • Released in 2003, bought by Microsoft in 2011
     • At least 250 million monthly users
  3. Fraud on Skype
     • Credit card fraud
     • Online payment fraud
     • Spam instant messages
     • etc.
  4. Detecting Fraud on Skype
     Skype already employs techniques for detecting fraud:
     • “Majority of fraudulent users are detected within one day”
     Some challenges in fraud detection:
     • Legitimate accounts get hijacked and don’t necessarily “look” fraudulent
     • Sparse data
  5. Improving Fraud Detection
     Why is it worth improving?
     • Manual fraud detection is very expensive
     Who wrote this paper?
     • Team from Microsoft Research
     What was their goal?
     • “Detect stealthy fraudulent users” that fool Skype’s existing defenses for a long period of time
  6. Classification
     • Classification problem: Predicting whether a user is fraudulent (yes or no)
     • Data consists of features (or “variables” or “predictors”) and a response
     • Contrasts with regression problem: Predicting a continuous response like stock price
  7. Data Used in the Study
     • Anonymized snapshot provided by Skype
     • “Does not contain information about individual calls and their contents”
  8. Classification Workflow

  9. Feature Type 1: Profile Information
     • Gender
     • Age
     • Country
     • OS platform
     • etc.
  10. Feature Type 2: Skype Product Usage
      • Activity logs:
        – Connected days
        – Audio call days
        – Video call days
        – Chat days
      • Data is not “rich”:
        – Only indicates the number of days per month that the user performed that activity
        – Does not distinguish which pair of users communicated, number of calls per day, etc.
  11. Feature Type 3: Local Social Activity
      • Activity logs (graph data):
        – Adding a user
        – Being added by a user
        – Deleting a user
        – Being deleted by a user
      • Number of connections in their list
      • Acceptance rate of outbound friend requests
  12. Feature Type 4: Global Social Activity
      • “PageRank” and “local clustering coefficient” computed for each user
  13. Classification Workflow
      • Pre-processing is unnecessary for profile info, but necessary for other feature types
  14. Pre-processing Activity Logs
      • Why?
        – Activity logs are time series data
        – Doesn’t make sense to use every data point as a feature
        – Makes more sense to “compress” the data into a single number
      • How?
        – For a given feature (e.g., audio calls), build a model of what “normal” user activity looks like and another model of what fraudulent activity looks like
        – For each user, score them based upon which model they are closer to
        – This is called computing “log-likelihood ratios”
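The scoring idea above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's method: it assumes each "model" is just a smoothed histogram over the number of active days per month, and the function names (`fit_histogram`, `log_likelihood_ratio`) and all of the training data are invented for the example. A positive score means the user looks more like the fraud model, a negative score more like the normal model.

```python
import math
from collections import Counter

def fit_histogram(day_counts, max_days=31, alpha=1.0):
    """Estimate P(active days in a month) with add-alpha smoothing (an assumed model)."""
    counts = Counter(day_counts)
    total = len(day_counts) + alpha * (max_days + 1)
    return [(counts[d] + alpha) / total for d in range(max_days + 1)]

def log_likelihood_ratio(user_months, fraud_model, normal_model):
    """Sum over months of log P(x | fraud) - log P(x | normal)."""
    return sum(
        math.log(fraud_model[x]) - math.log(normal_model[x]) for x in user_months
    )

# Invented toy data: audio call days per month for known normal and fraudulent users.
normal_history = [10, 12, 8, 15, 11, 9, 14, 10]
fraud_history = [0, 1, 0, 2, 30, 0, 1, 29]

normal_model = fit_histogram(normal_history)
fraud_model = fit_histogram(fraud_history)

# Each user's monthly time series is compressed into a single number.
steady_score = log_likelihood_ratio([11, 10, 12], fraud_model, normal_model)
bursty_score = log_likelihood_ratio([0, 30, 1], fraud_model, normal_model)
print(steady_score, bursty_score)
```

The single score per activity type can then be used as one feature for the classifier, in place of the raw monthly values.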
  15. Computing Global Social Scores
      • PageRank:
        – Invented by Google's founders
        – Give users a high score if they have many connections and if they have connections from other high-scoring users
      • Local clustering coefficient:
        – Measure of how well your connections are connected to one another
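Both scores can be computed on a toy contact graph with plain Python. The sketch below is illustrative only: the damping factor of 0.85, the fixed iteration count, the four-user graph, and the choice to treat every contact as a link in both directions are assumptions made for the example, not details taken from the paper.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """Power-iteration PageRank on a dict {node: [nodes it links to]}."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in out_links.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
        rank = new_rank
    return rank

def local_clustering(neighbors, node):
    """Fraction of pairs of a node's neighbors that are themselves connected."""
    friends = list(neighbors[node])
    k = len(friends)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if friends[j] in neighbors[friends[i]]
    )
    return 2.0 * links / (k * (k - 1))

# Invented toy contact graph (undirected): "a" knows everyone, "b" and "c" know each other.
contacts = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
out_links = {user: sorted(friends) for user, friends in contacts.items()}

ranks = pagerank(out_links)
print(max(ranks, key=ranks.get))        # the best-connected user
print(local_clustering(contacts, "a"))  # only 1 of a's 3 neighbor pairs is connected
```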
  16. Classification Workflow

  17. Choosing a Classifier
      • Trained several classifiers: Random Forests, support vector machines, logistic regression
      • Estimated prediction accuracy using cross-validation
      • Chose Random Forests because it had the best initial performance
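The selection procedure above (train several candidates, estimate accuracy with cross-validation, keep the best) can be sketched without any libraries. To keep the example self-contained, two very simple stand-in classifiers are used in place of Random Forests, support vector machines, and logistic regression; the stand-ins, the helper names, and the data are all invented for illustration.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k roughly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_val_accuracy(make_model, X, y, k=5):
    """Average held-out accuracy of a model over k folds."""
    accuracies = []
    for held_out in k_fold_indices(len(X), k):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = make_model([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(X[i]) == y[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k

def majority_class(X_train, y_train):
    """Stand-in baseline: always predict the most common training label."""
    label = max(set(y_train), key=y_train.count)
    return lambda x: label

def nearest_centroid(X_train, y_train):
    """Stand-in classifier: predict the class whose mean feature vector is closest."""
    centroids = {}
    for label in set(y_train):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x):
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )

    return predict

# Invented data: 20 normal users (label 0) and 10 fraudulent users (label 1).
X = [[10 + i % 3, 0.8] for i in range(20)] + [[1 + i % 3, 0.2] for i in range(10)]
y = [0] * 20 + [1] * 10

candidates = {"majority_class": majority_class, "nearest_centroid": nearest_centroid}
scores = {name: cross_val_accuracy(model, X, y) for name, model in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Each model is always scored on users it did not see during training, which is what makes the comparison between candidates fair.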
  18. Rating Model Accuracy
      • ROC curve: Plots “true positive” rate vs “false positive” rate
      • Ideal classifier hugs top left corner
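As a rough illustration of how the points on a ROC curve are produced, the sketch below sweeps a threshold over classifier scores and records the false positive rate and true positive rate at each step. The scores and labels are invented, and the helper names (`roc_points`, `auc`) are hypothetical.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last of any tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:])
    )

# Invented fraud scores (higher = more suspicious) and true labels (1 = fraudulent).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
print(points)
print(auc(points))

# A classifier that ranks every fraudulent user above every normal one
# passes through the top left corner (0, 1) and has area 1.0.
perfect = roc_points([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
print(auc(perfect))
```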
  19. Rating Model Accuracy (cont’d)
      • Best result is obtained by using all four feature types
      • At a false positive rate of 5%, true positive rate was 68%
      • Acceptable false positive rate is a business decision
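Once the business has settled on an acceptable false positive rate, the operating threshold follows from the scores of legitimate users. The sketch below shows one way to do that; the function name `tpr_at_fpr` and the data are invented, so the resulting numbers are not the paper's results.

```python
def tpr_at_fpr(scores, labels, max_fpr):
    """Pick a score threshold whose false positive rate stays within max_fpr.

    Users scoring strictly above the threshold are flagged as fraudulent.
    Returns (threshold, true positive rate at that threshold).
    """
    negatives = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
    positives = [s for s, y in zip(scores, labels) if y == 1]
    allowed = int(max_fpr * len(negatives))  # legitimate users we may wrongly flag
    threshold = negatives[allowed] if allowed < len(negatives) else float("-inf")
    flagged = sum(s > threshold for s in positives)
    return threshold, flagged / len(positives)

# Invented scores: 20 legitimate users (one scores unusually high) and 5 fraudulent users.
legit_scores = [0.95] + [0.02 * i for i in range(1, 20)]
fraud_scores = [0.9, 0.8, 0.7, 0.5, 0.3]
scores = legit_scores + fraud_scores
labels = [0] * len(legit_scores) + [1] * len(fraud_scores)

threshold, tpr = tpr_at_fpr(scores, labels, max_fpr=0.05)
print(threshold, tpr)  # with a 5% budget, 1 of the 20 legitimate users may be flagged

strict_threshold, strict_tpr = tpr_at_fpr(scores, labels, max_fpr=0.0)
print(strict_threshold, strict_tpr)  # allowing no false positives catches fewer fraudsters
```

Tightening the false positive budget raises the threshold and lowers the true positive rate, which is the trade-off the slide describes as a business decision.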
  20. Projected Model Effects

  21. Performance on Different Fraud Types
      • Fraud types are defined by Skype but are not public
      • Type II is most common, and the classifier works best on that type
  22. Possible Model Improvements
      • Optimize separate models for each fraud type
      • Attempt to detect points in time when accounts are hijacked
      • Prevent fraudsters from evading the model
  23. Other Possible Applications
      • Predicting credit card fraud
      • Predicting failure in data center disks
      • Any environment in which user behavior can be monitored and fraudulent behavior “looks different” from normal behavior
  24. Thank You!
      Research Paper: http://research.microsoft.com/pubs/205472/aisec10-leontjeva.pdf
      My blog post: http://www.dataschool.io/detecting-fraudulent-skype-