Learning from Biometric Fingerprints to prevent Cyber Security Threats

User identification is paramount to the security of a web system.
However, the classical information used to identify users (e.g. user credentials or
browsing history) is not reliable enough. Biometric
markers (e.g. keystroke patterns), on the other hand, are a viable
reference for correctly identifying users. These data can be
collected easily, with no disruptive change in user experience, and can be
derived from users' interaction with the system.
In this talk I will present how Machine and Deep Learning techniques can be used to learn unique identification markers from users' keystroke
behavioural patterns, with the aim of preventing user-account
hijacking fraud.

Valerio Maggio

January 31, 2018

Transcript

  1. LEARNING FROM BIOMETRICS to prevent #CyberSecurity threats

    Valerio Maggio • @leriomaggio • Data Scientist & Pythonista @ FBK
  2. SORRY, WHO?

    • Post Doc Researcher • Background in CS • Interested in Machine & Deep Learning • Core in Biomedicine & Environment • Applied Machine Learning (a.k.a. Data Science) • We're looking for students for Internship & (PhD) Thesis here: https://mpbalab.fbk.eu
  3. SORRY, WHO?

    • Geek & Nerd • Fellow Pythonista since 2006 • this is a better me !-) • 100K points if you get this pun !-) • github.com/leriomaggio
  4. WHAT THE CLOUDS SAY
  5. WHAT THE CLOUDS KEEP SAYING…
  6. WHAT THE CLOUDS STILL SAY…
  7. WHAT THE CLOUDS FINALLY SAY!

    Learning from Data for future predictions
  8. SUPERVISED SETTING

    • Input data are accompanied by labels the ML model can learn from • In other words, labels are the reference the model uses to estimate the expected outcomes
  9. UNSUPERVISED SETTING

    • No label is provided • Learning directly from data • e.g. Clustering
  10. WHAT THE CLOUDS FINALLY SAY!
  11. APPLIED ML IN 5 STEPS

    • Collect the Data
    1. Look at the Data & Clean the Data
    2. Prepare the data
    3. Train your model(s)
    4. Predict with your best model on unseen data (namely: data NOT used in training)
    5. Deploy your system in production
  12. KEYSTROKE DYNAMICS

    Keystroke dynamics consists of analysing the way a user types by monitoring keyboard inputs thousands of times per second and processing these data through an algorithm, which then defines a pattern for future comparison.
    Identifying an individual based on their way of typing on a physical or virtual keyboard.
  13. KEYSTROKE DYNAMICS

    • Time between two key presses • Time between one press and one release • Time between one release and one press • Time between two key releases
    Intuition: users have unique ways of typing on keyboards (i.e. typing patterns)
  14. KEYSTROKE DYNAMICS

    • Time between two key presses: Down-Down Time
    • Time between one press and one release: Dwell Time
    • Time between one release and one press: Flight Time
    • Time between two key releases: Up-Up Time
  15. DATA COLLECTION

    • Time between two key presses: Down-Down Time
    • Time between one press and one release: Dwell Time
    • Time between one release and one press: Flight Time
    • Time between two key releases: Up-Up Time
    Dataset statistics: • 50 different users • 450+ patterns each
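
The four timings named on the keystroke-dynamics slides can be computed from press and release timestamps. The sketch below is only an illustration: the `KeyEvent` record, the millisecond unit and the sample values are assumptions, not the data format used in the talk.

```python
from typing import NamedTuple


class KeyEvent(NamedTuple):
    """Hypothetical record for one keystroke (timestamps in milliseconds)."""
    key: str
    down: int  # time the key was pressed
    up: int    # time the key was released


def timing_features(events):
    """Return the four timings for each pair of consecutive keystrokes."""
    features = []
    for prev, curr in zip(events, events[1:]):
        features.append({
            "keys": prev.key + curr.key,
            "down_down": curr.down - prev.down,  # press to next press
            "dwell": prev.up - prev.down,        # press to release of the same key
            "flight": curr.down - prev.up,       # release to next press
            "up_up": curr.up - prev.up,          # release to next release
        })
    return features


# Made-up sample: the user types "h" and then "i"
events = [KeyEvent("h", 0, 95), KeyEvent("i", 140, 220)]
features = timing_features(events)
print(features)
```

In this pairwise form the dwell time of the last key in a sequence is not reported; a real collector would probably keep it as well.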
  16. STEP 1: LOOK AT THE DATA AND CLEAN THEM
  17. STEP 2: PREPARE THE DATA

    TRAIN-TEST CUT
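
A train-test cut like the one named on slide 17 might be sketched as follows. The transcript does not say how the split was done, so the per-user shuffling, the 20% test fraction, the fixed seed and the function name `train_test_cut` are all assumptions made for illustration.

```python
import random


def train_test_cut(patterns_by_user, test_fraction=0.2, seed=42):
    """Split each user's patterns into disjoint train and test lists.

    Splitting per user (an assumption) keeps every user represented
    in both sets.
    """
    rng = random.Random(seed)  # fixed seed so the cut is reproducible
    train, test = {}, {}
    for user, patterns in patterns_by_user.items():
        shuffled = list(patterns)
        rng.shuffle(shuffled)
        n_test = max(1, round(len(shuffled) * test_fraction))
        test[user] = shuffled[:n_test]
        train[user] = shuffled[n_test:]
    return train, test


# Made-up data: integers stand in for keystroke patterns
data = {"user_a": list(range(10)), "user_b": list(range(10, 25))}
train, test = train_test_cut(data)
print({user: (len(train[user]), len(test[user])) for user in data})
```

The test patterns are then held back until step 4, so the model is evaluated only on data it has not seen during training.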
  18. STEP 3-4: TRAIN AND TEST ML MODEL
  19. DEEPKS 1. AUTOENCODER

    Deep AutoEncoder (Encoder, Decoder) • Trained on genuine keystroke patterns • Unsupervised Machine (Deep) Learning
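
The idea of training an autoencoder on genuine patterns only, then flagging inputs it reconstructs badly, can be shown with a toy model. The slide does not give the DEEPKS architecture, so the sketch below swaps in a single-unit linear autoencoder with tied weights in plain Python; the sample patterns, the learning rate and the `THRESHOLD` value are made up for illustration.

```python
def encode(w, x):
    """Encoder: project the pattern onto a single hidden unit."""
    return sum(wi * xi for wi, xi in zip(w, x))


def decode(w, z):
    """Decoder: map the 1-D code back to feature space (tied weights)."""
    return [z * wi for wi in w]


def reconstruction_error(w, x):
    """Squared error between a pattern and its reconstruction."""
    x_hat = decode(w, encode(w, x))
    return sum((xi - ri) ** 2 for xi, ri in zip(x, x_hat))


def train(patterns, w_init, lr=0.1, epochs=500):
    """Full-batch gradient descent on the mean reconstruction error."""
    w = list(w_init)
    n = len(patterns)
    for _ in range(epochs):
        norm2 = sum(wi * wi for wi in w)
        grad = [0.0] * len(w)
        for x in patterns:
            z = encode(w, x)
            for j, (wj, xj) in enumerate(zip(w, x)):
                # derivative of ||x - z*w||^2 with respect to w_j, where z = w.x
                grad[j] += (-4 * z * xj + 2 * z * xj * norm2 + 2 * z * z * wj) / n
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


# Hypothetical genuine patterns, (dwell, flight) in seconds; no labels are used
genuine = [(0.30, 0.40), (0.36, 0.48), (0.42, 0.56),
           (0.48, 0.64), (0.54, 0.72), (0.60, 0.80)]
impostor = (0.90, 0.10)  # hypothetical pattern from a different typist

w = train(genuine, w_init=[0.5, 0.5])
THRESHOLD = 0.01  # assumed cut-off; in practice tuned on held-out genuine data
genuine_errors = [reconstruction_error(w, x) for x in genuine]
impostor_error = reconstruction_error(w, impostor)
is_alarm = impostor_error > THRESHOLD
print(max(genuine_errors) < THRESHOLD, is_alarm)
```

A deep autoencoder replaces the single linear unit with stacked non-linear layers, but the scoring step stays the same: a high reconstruction error suggests the pattern does not match the genuine user.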
  20. SAMPLE SIZE TEST

    Q: How many patterns would I need to be confident about the accuracy of the model?
  21. STEP 5: DEPLOY YOUR SOLUTION
  22. [Architecture diagram]

    Components: Data Collector, Feature Detection, Feature Extraction, Feature Database, Model Training Service, Models Database, Model Service, Orchestration, Alarms Database, Alarms Dashboard, SOC
    Flows (numbered 1-12 on the slide): Raw Data, Features, Features + Labels, Labels, Models, Prediction Request, Score, Alarm, Confirmation/Rejection
  23. EUROSCIPY 2018

    Fondazione Bruno Kessler | Associazione Python Italia | University of Trento
    Northern Italy | Trentino Region
    Tentative dates: Aug. 28 - Sept. 01 2018
    Stay posted on euroscipy.org