Analyzing Human Computer Interaction (Johannes Schneider, University of Liechtenstein)

In this talk we discuss three case studies of human computer interaction and their analysis using recorded mouse movements and keystrokes. First, we look at logs covering more than 25,000 hours of software developer activity in an industrial company. The main goal is to assess whether developers are using developer tools effectively. Second, we use human computer interaction data to identify plagiarism in an academic setting. Third, we discuss how websites, in particular online shops, might extend their use of mouse and key logging to get a better understanding of customers.

Presentation given at the 4ländereck Data Science Meetup: https://www.meetup.com/4laendereck-Data-Science-Meetup/events/238393988/

Transcript

  1. 2 CV Johannes Schneider
    – MSc + PhD in Computer Science
    – MSc of Adv. Studies in Management, Tech. & Innovation
    – Corporate Research
  2. 3 Human Computer Interaction
    1. Analyzing executed “Commands” in Industry > Are software developers using tools effectively?
    2. … in Academia > Detecting Plagiarism
    3. Analyzing Mouse Movements and Keystrokes > Online Shopping: What product features matter for a customer?
  3. 6 Data Preprocessing
    – Remove time, remove (some) repetitions, reduce granularity
    – Command sequence shown on the slide: View.OnChangeScrollInfo Edit View.OnChangeCaretLine Build.BuildBegin Build.BuildSolution Build.BuildDone View.OnChangeCaretLine Edit Build.BuildBegin Build.BuildSolution Build.BuildDone View.OnChangeCaretLine
  4. 7 Pattern Mining
    – Frequent pattern mining
    – Run on a cluster using Hadoop
    – ~ 80,000 patterns
    40 view.find_symbol_results edit.findallreferences view.find_symbol_results view.find_symbol_results edit
    30 view.find_symbol_results edit.findallreferences view.find_symbol_results view.find_symbol_results view.solution_ex
    33 edit.gotodefinition edit.findallreferences view.find_symbol_results view.find_symbol_results view.find_symbol_resu
    37 view.solution_explorer edit.findallreferences view.find_symbol_results view.find_symbol_results view.find_symbol_
    25 edit edit.findallreferences view.find_symbol_results view.find_symbol_results view.find_symbol_results
    43 view.find_and_replace view.find_results_ view.find_and_replace view.find_results_ edit.gotodefinition
  5. 8 Pattern Filtering
    – E.g. remove short patterns (< 8 commands)
    – ~ 17,000 patterns
  6. 9 Pattern Clustering
    – Distance metric: Two patterns are similar if > both have the same long subsequence or contain a rare command
    view.find_symbol_results edit.findallreferences view.find_symbol_results view.find_symbol_results edit
    view.find_symbol_results edit.findallreferences view.find_symbol_results view.find_symbol_results view.solution_explo
    edit.gotodefinition edit.findallreferences view.find_symbol_results view.hexadecimal view.find_symbol_results
    view.solution_explorer edit debug.start edit view.hexadecimal
  7. 10 Pattern Clustering
    – K-Means > Cluster represented by the average of all points in the cluster
    – K-Medoid > Cluster represented by a data point near the cluster center
  8. 11 Expert Analysis of Clusters
    – Manual investigation
    – Developer interviews to verify findings
  9. 14 Usage Smells for Debugging
    • Repetitive behavior
    • E.g. ~100 step commands instead of 1 “run to cursor”
    • ~30s vs. 3s
    • Due to lack of knowledge
    • The less knowledgeable, the more repetitive
  10. 15 White-collar Productivity
    – Not all people use tools well
    – Train people to leverage the IDE > Recommendation of commands
  11. 16 Similar work at our institute
    – Can we improve our software based on customer interaction logs?
  12. 17 Detecting Plagiarism
    – Detecting copied (modified) parts in theses, reports…
    – Standard approach: Use final work
    – Novel: Use creation process > “Honest” creation process ≠ Plagiarism process > Less work: Less restructuring, less navigation, less debugging… Certain commands are used more or less often
  13. 20 Plagiarism Detection
    – Find logs that are outliers > Assumption: Majority are honest students, Outliers = Plagiarists > E.g. lots of copy & paste, but little typing
    – Find (modified) copies of logs > Logs were copied and altered a bit
  14. 22 Plagiarism Detection
    – Histogram of used commands > Outlier: Very low or very frequent usage of some commands > Copies: Almost exact match of frequencies
  15. 23 The Smart Web Shop
    – Online sales in the US: ~ 400 000 000 000 $
    – Amazon: 35% of sales are recommended products
  16. 24 Customer buying process
    – Customer chooses products based on features that matter most to him
    – Bestbuy.com, Amazon.com
  17. 25 How much do specific product features matter for a customer?
    1) Ask customer explicitly
    2) Infer implicitly using a special website design
    – Measure the time a customer hovers over an attribute
  18. 26 Does this work?
    – Experiment: ~ 50 participants
    – Choose among smartphones and cars
    – Ask for self-reported feature preferences
  19. 27 There is a correlation… …between hovering time and attribute weight
    – Next step: Derive recommendations and make customers pay more…
  20. 28 Other mouse tracking studies (at our institute)
    – Insurance fraud > Theory: If you tell a lie you need to think of another story > People type slower, move the mouse slower, perform more clicks…
    – Detecting emotions
  21. 29 Work with us!
    - Tell us about your data, problems and ideas!
    - Small to big projects with funding opportunities
  22. 30 Working with Uni.Li – Cooperation models
    – Small feasibility study > 1 - 5 days
    – Innovation Lab > Groups of 3-4 students > 3-4 weeks
    – Master Thesis (Bachelor) > 4-5 months
    – Research grants, e.g. KTI > PhD, 1 - 2 years
  23. 31 Working with Uni.Li – Cooperation models (as on slide 30)
    Thank you! Questions?
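
As a rough illustration of the preprocessing step on slide 6 (remove time, remove some repetitions, reduce granularity), the following Python sketch cleans a toy command log. The (timestamp, command) tuple layout, the `preprocess` function, the `max_run` parameter, and the rule that folds every `Edit.*` command into `Edit` are assumptions made for this example, not details taken from the talk.

```python
from itertools import groupby


def preprocess(events, max_run=1):
    """Turn (timestamp, command) events into a cleaned command sequence.

    Hypothetical helper: the talk names the three steps but not how
    they were implemented.
    """
    # Step 1, remove time: keep only the command name.
    commands = [command for _timestamp, command in events]

    # Step 2, reduce granularity: fold every "Edit.*" command into "Edit"
    # (an assumed rule, chosen only for illustration).
    coarse = ["Edit" if c.startswith("Edit") else c for c in commands]

    # Step 3, remove (some) repetitions: keep at most max_run
    # consecutive copies of the same command.
    cleaned = []
    for command, run in groupby(coarse):
        cleaned.extend([command] * min(max_run, len(list(run))))
    return cleaned


# Toy log; timestamps and the exact command mix are made up.
events = [
    (0.0, "View.OnChangeCaretLine"),
    (0.5, "View.OnChangeCaretLine"),
    (1.0, "Edit.LineDown"),
    (1.2, "Edit.Delete"),
    (2.0, "Build.BuildBegin"),
]
print(preprocess(events))
```

Raising `max_run` keeps short repetitions in the output, which may matter if repetition itself is the signal of interest, as in the usage smells on slide 14.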
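
The histogram idea on slide 22 (outliers have unusual command frequencies, copies have almost identical ones) could be sketched as follows. This is a minimal example under stated assumptions: the L1 distance, the per-command median as the "typical" profile, both thresholds, and the names `histogram`, `l1` and `flag` are illustrative choices, not the method used in the talk.

```python
from collections import Counter
from statistics import median


def histogram(log, vocabulary):
    """Relative frequency of each command in one student's log."""
    counts = Counter(log)
    total = len(log) or 1
    return [counts[command] / total for command in vocabulary]


def l1(a, b):
    """Sum of absolute differences between two histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))


def flag(logs, outlier_threshold=0.5, copy_threshold=0.05):
    """Return (outliers, copies); both thresholds are hypothetical."""
    vocabulary = sorted({c for log in logs.values() for c in log})
    hists = {name: histogram(log, vocabulary) for name, log in logs.items()}

    # "Typical" profile: per-command median over all logs, relying on
    # the slide's assumption that the majority of students are honest.
    typical = [median(h[i] for h in hists.values())
               for i in range(len(vocabulary))]

    outliers = sorted(n for n, h in hists.items()
                      if l1(h, typical) > outlier_threshold)

    names = sorted(hists)
    copies = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
              if l1(hists[a], hists[b]) < copy_threshold]
    return outliers, copies


# Toy logs; the command names are made up for this example.
logs = {
    "a": ["type"] * 8 + ["copy", "paste"],
    "b": ["type"] * 9 + ["copy"],
    "c": ["copy", "paste"] + ["type"] * 8,   # same frequencies as "a"
    "d": ["copy", "paste"] * 5,              # copy & paste, no typing
}
print(flag(logs))
```

Because histograms ignore command order, a log that was copied and then reordered still matches its source, while a log altered by adding or removing commands may need a looser `copy_threshold`.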