Slide 1

Slide 1 text

So, What Does a Data Scientist Do?
A Data Scientist in the Music Industry
Dr Jameel Syed
March 2012
http://jasyed.com/datascience/

Slide 2

Slide 2 text

Overview
– Musicmetric CTO
– InforSense founding member
  • PhD in Workflows for Life Sciences Analysis
– Co-organiser, Big Data London meetup

Slide 3

Slide 3 text

Some questions...

Slide 4

Slide 4 text

Music has moved online
• The world has changed
  – Do you buy vinyl/tapes/CDs of music?
  – Do you buy music downloads?
  – Do you download illegal content from BitTorrent?
  – Do you listen to music on YouTube?
  – Do you “like” bands on Facebook?
  – Do you subscribe to Spotify?
  – Do you listen on the radio to the weekly charts on a Sunday afternoon?
• What’s happening online?

Slide 5

Slide 5 text

How popular am I?

Slide 6

Slide 6 text

Who are my fans?

Slide 7

Slide 7 text

Where are my fans?

Slide 8

Slide 8 text

What is the press saying?

Slide 9

Slide 9 text

Who is popular?

Slide 10

Slide 10 text

A Data Scientist in the Music Industry
• Raw Data -> Derived Data -> Insight
  – Who is popular right now/in the immediate future?
  – What was the effect of appearing at a festival?
  – Which artists are (becoming) popular with listeners with certain demographics (in a region)?
• Data processing, machine learning & statistical methods
  – Sentiment analysis
  – Named Entity Recognition
  – Ranking
  – Segmentation
• One-offs
  – Infographics and microsites for events
  – Brand alignment via demographics
  – Music Hack Days
• Product
  – Daily charts
  – Sentiment scoring web crawled reviews
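The "Raw Data -> Derived Data -> Insight" flow and the "Daily charts" product on this slide could be sketched roughly as below. This is only an illustration: the event tuples, the artist names, and the function names `daily_counts` and `daily_chart` are made up for the example, and ranking by a plain event count is an assumption, not the method actually used at Musicmetric.

```python
from collections import Counter
from datetime import date

# Hypothetical raw events: (day, artist, source). Illustrative only, not real data.
RAW_EVENTS = [
    (date(2012, 3, 1), "Artist A", "youtube_play"),
    (date(2012, 3, 1), "Artist A", "facebook_like"),
    (date(2012, 3, 1), "Artist B", "youtube_play"),
    (date(2012, 3, 1), "Artist B", "youtube_play"),
    (date(2012, 3, 1), "Artist B", "download"),
    (date(2012, 3, 1), "Artist C", "youtube_play"),
    (date(2012, 3, 2), "Artist A", "youtube_play"),
    (date(2012, 3, 2), "Artist C", "youtube_play"),
    (date(2012, 3, 2), "Artist C", "youtube_play"),
]


def daily_counts(events):
    """Derived data: number of raw events per (day, artist)."""
    return Counter((day, artist) for day, artist, _source in events)


def daily_chart(events, day, top_n=3):
    """Insight: artists ranked by event count on one day (ties broken by name)."""
    counts = daily_counts(events)
    todays = [(artist, n) for (d, artist), n in counts.items() if d == day]
    return sorted(todays, key=lambda item: (-item[1], item[0]))[:top_n]


if __name__ == "__main__":
    for rank, (artist, n) in enumerate(daily_chart(RAW_EVENTS, date(2012, 3, 1)), 1):
        print(rank, artist, n)
```

A real chart would presumably weight sources differently and smooth over several days; the point here is only the three stages: raw events, a derived per-day count, and a ranked answer to "who is popular right now?".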

Slide 11

Slide 11 text

What is a Data Scientist?

Slide 12

Slide 12 text

Have we been here before?
• Statistician
• Data Analyst
• Quantitative analyst
• Bioinformatician
• Data Miner
• Business Intelligence consultant
• Computational physicist

Slide 13

Slide 13 text

A Life Sciences digression...

Slide 14

Slide 14 text

What’s new?
• Data provides the opportunity
  – Old: Collect and store data presupposing how it will be used
  – New: Collect raw data & explore which derivations are interesting; integrating data from multiple online sources.
  – Big Data technology to cope with data volume
• Programming is essential
  – APIs
  – Heterogeneous environment(s)
• Method of presentation
  – Infographics
  – Interactive (web) applications
  – (Raw data)

Slide 15

Slide 15 text

Data Scientist
• “Jack of all trades”
  – “Hacker” mentality: learn new technology and approaches for a project on short notice
  – Creative self-starters
  – Work alongside other experts (data, domain, software engineering)

Slide 16

Slide 16 text

A Data Scientist is good at knitting?
• Not building from scratch, knitting together pre-existing parts
• Data
  – Databases (relational/NoSQL)
  – Files
  – APIs
• Algorithms
  – Open source libraries
  – Off the shelf tools
• Compute
  – Linux
  – AWS?
• Languages
  – Many, especially “scripting” languages
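As a rough illustration of "knitting together pre-existing parts" in a scripting language, the sketch below joins a relational database (an in-memory SQLite table), a file (CSV text), and an off-the-shelf algorithm (the standard library's `statistics.mean`). The table layout, the CSV columns, the numbers, and the "plays per fan" measure are all assumptions invented for the example, not taken from the talk.

```python
import csv
import io
import sqlite3
import statistics

# Hypothetical file contents: plays per artist (illustrative numbers only).
PLAYS_CSV = "artist,plays\nArtist A,6000\nArtist B,4500\n"


def load_fans_from_db():
    """Data source 1: a relational database (in-memory SQLite for the sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fans (artist TEXT, fans INTEGER)")
    conn.executemany(
        "INSERT INTO fans VALUES (?, ?)",
        [("Artist A", 1200), ("Artist B", 300)],
    )
    rows = dict(conn.execute("SELECT artist, fans FROM fans").fetchall())
    conn.close()
    return rows


def load_plays_from_file(text):
    """Data source 2: a CSV file (read from a string here)."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["artist"]: int(row["plays"]) for row in reader}


def plays_per_fan(fans, plays):
    """Knit the two sources together on the artist name."""
    return {artist: plays[artist] / fans[artist] for artist in fans if artist in plays}


if __name__ == "__main__":
    ratios = plays_per_fan(load_fans_from_db(), load_plays_from_file(PLAYS_CSV))
    print(ratios)
    # Off-the-shelf algorithm: mean from the standard library.
    print(statistics.mean(ratios.values()))
```

None of the pieces is built from scratch; the only code written is the glue between them, which is the point of the slide.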