Popular not Significant

What is the difference between being popular and being significant in data analysis?

Pablo Musa

May 18, 2017

Transcript

  1. Pablo Musa • MSc. Computer Science • Backend Developer • Software Architect • Infra Lover • 2 years Hadoop DevOps • 3 years Elastic Enthusiast
  2. Motivation • Marketing, Recommendation, Analysis, ... • It is all about understanding the "audience": the group, the niche • Sometimes understanding the group can be hard • In which city should I focus my marketing campaign? • A user watched "Mar Adentro"; what should we recommend next?
  3. Popularity • Are blue cars more common in London or Birmingham? • We will probably find more blue cars in London. However, that is not because Londoners like them more than others do, but because London has a much larger population. • Are there more people accessing "elastic.co" in London or Birmingham?
  4. Popularity • A user watched "Mar Adentro"; what movies should we recommend next? • We can look at the movies most commonly watched by users who also watched "Mar Adentro" and use them as recommendations. Will that be any good?
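
The popularity-based approach on this slide can be sketched in a few lines. This is only an illustration: the viewing histories, the titles other than "Mar Adentro", and the helper `most_common_also_watched` are all hypothetical and do not come from the talk's dataset.

```python
from collections import Counter

# Hypothetical viewing histories (invented for illustration):
# user id -> set of movie titles that user watched.
histories = {
    "u1": {"Mar Adentro", "Blockbuster A", "Spanish Drama B"},
    "u2": {"Mar Adentro", "Blockbuster A"},
    "u3": {"Mar Adentro", "Blockbuster A", "Spanish Drama B"},
    "u4": {"Blockbuster A"},
    "u5": {"Blockbuster A", "Comedy C"},
}


def most_common_also_watched(histories, seed, top_n=3):
    """Count how often each other movie appears among users who watched `seed`."""
    counts = Counter()
    for watched in histories.values():
        if seed in watched:
            counts.update(watched - {seed})
    return counts.most_common(top_n)


top = most_common_also_watched(histories, "Mar Adentro")
print(top)
```

In this toy data "Blockbuster A" comes first, but every user watched it, so it would top the list for any seed movie. That is the weakness the slide's closing question points at.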
  5. Popular • Are blue cars more common in London or Birmingham? • We can look into the most common movies that...? • Common: occurring, found, or done often; prevalent.
  6. Significant • Are blue cars more significant in London or Birmingham? • We can look into the most significant movies that...? • Significant: sufficiently great or important to be worthy of attention; noteworthy.
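
One way to read the two definitions is as a raw count versus a share relative to the size of the group. The sketch below uses invented registration numbers, not real data, and the names `cities` and `blue_share` are hypothetical.

```python
# Hypothetical registrations (invented numbers): city -> (blue cars, all cars).
cities = {
    "London": (120_000, 2_400_000),
    "Birmingham": (30_000, 400_000),
}


def blue_share(city):
    """Fraction of a city's cars that are blue."""
    blue, total = cities[city]
    return blue / total


# "Popular": the city with the most blue cars in absolute terms.
most_popular = max(cities, key=lambda c: cities[c][0])
# Closer to "significant": the city where blue is over-represented.
highest_share = max(cities, key=blue_share)

print(most_popular, highest_share)
```

With these made-up figures London has the larger count while Birmingham has the larger share, so the two questions give different answers.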
  7. [Figure] Watched: 999 users watched • Recommended: 708 also watched • 701 also watched • 699 also watched
  8. [Figure] Watched: 999 users watched • Recommended: 371 also watched • 263 also watched • 354 also watched
  9. [Figure] Watched: 999 users watched • Recommended: 371 also watched (bg:2437, sc:7.45) • 263 also watched (bg:1311, sc:7.04) • 354 also watched (bg:2619, sc:6.28)
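
The bg and sc values on this slide look like the background count and score that a significant terms aggregation reports. The sketch below is an assumption-laden reconstruction, not the author's computation: it assumes the scoring is the JLH heuristic (absolute change times relative change), that the counts and bg values pair up in the order shown, and that the background set is roughly 138,493 users (the user count usually quoted for MovieLens 20M). The function name `jlh_score` is hypothetical.

```python
def jlh_score(fg_count, fg_total, bg_count, bg_total):
    """JLH-style score: (fg% - bg%) * (fg% / bg%).

    Returns 0.0 when the term is not over-represented in the foreground.
    """
    fg_pct = fg_count / fg_total
    bg_pct = bg_count / bg_total
    if bg_pct == 0 or fg_pct <= bg_pct:
        return 0.0
    return (fg_pct - bg_pct) * (fg_pct / bg_pct)


FG_TOTAL = 999       # users who watched the seed movie (from the slide)
BG_TOTAL = 138_493   # ASSUMPTION: size of the whole user base

# (also-watched count, bg count), paired in slide order (an assumption).
candidates = [(371, 2437), (263, 1311), (354, 2619)]
scores = [jlh_score(fg, FG_TOTAL, bg, BG_TOTAL) for fg, bg in candidates]

for (fg, bg), score in zip(candidates, scores):
    print(f"also watched={fg} bg={bg} score={score:.2f}")
```

Under these assumptions the scores come out close to the sc values on the slide and in the same order, which suggests that a movie with fewer co-watchers (263) can outrank one with more (354) when it is rarer in the background.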
  10. Another way to understand the Coldplay effect • q=[mecano, amaral, pereza, coldplay], s=all • q=[mecano, amaral, pereza, coldplay], s=500
  11. Another way to understand the Coldplay effect • q=[mecano, amaral, pereza, coldplay], s=all • q=[mecano, amaral, pereza, coldplay], s=500 • Labels: NOISE, SIGNAL
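
A comparison like the one on this slide could be set up by running significant terms either on every matching document or inside a sampler aggregation. The sketch below only builds the request bodies. It assumes that "s" on the slide is the sample size, and the field name `artist`, the aggregation names, and the helper `build_request` are hypothetical; the talk's actual queries are not shown in the transcript.

```python
import json


def build_request(artists, sample_size=None, field="artist"):
    """Build a search body: significant terms over users matching `artists`.

    With `sample_size` set, the aggregation runs only on the top-scoring
    documents per shard (sampler aggregation, `shard_size` parameter).
    """
    # A bool/should query scores documents higher the more artists they match,
    # so the sampler keeps the users closest to the whole query.
    query = {"bool": {"should": [{"match": {field: a}} for a in artists]}}
    significant = {"significant_terms": {"field": field}}
    if sample_size is None:
        aggs = {"related": significant}
    else:
        aggs = {
            "sample": {
                "sampler": {"shard_size": sample_size},
                "aggs": {"related": significant},
            }
        }
    return {"size": 0, "query": query, "aggs": aggs}


artists = ["mecano", "amaral", "pereza", "coldplay"]
all_docs = build_request(artists)       # corresponds to s=all (assumed)
sampled = build_request(artists, 500)   # corresponds to s=500 (assumed)
print(json.dumps(sampled, indent=2))
```

The intuition is that a globally popular artist matches a very large number of users on its own, while the sample keeps only the users who match most of the query.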
  12. References • https://www.elastic.co/guide/en/elasticsearch/reference/5.4/search-aggregations-bucket-terms-aggregation.html • https://www.elastic.co/guide/en/elasticsearch/reference/5.4/search-aggregations-bucket-significantterms-aggregation.html • https://www.elastic.co/guide/en/elasticsearch/reference/5.4/search-aggregations-bucket-sampler-aggregation.html • Datasets • https://grouplens.org/datasets/movielens/20M/ • http://www.dtic.upf.edu/~ocelma/MusicRecommendationDataset/lastfm-360K.html • http://data.dft.gov.uk/
  13. Thanks!! * JONTB - 16:30 - Mollete Hall - Managing your Black Friday Logs • Pablo Musa @pablitomusa