The solr Power

Motivations for using solr as a NoSQL backend

tarequeh

March 10, 2012

Transcript

  1. What about it?
     • We always associate solr with searching
     • solr can also serve as your non-relational data layer
  2. Hmmm why not?
     • Hey solr is already part of my stack
     • I love solr
     • It’s fast, scalable and there are some great python interfaces out there
  3. When would you consider it?
     • You have a DB that’s frequently read and infrequently written
     • You want robust search & filtering on your data
     • You want to leverage the faceting feature
     • You want an awesome scalable data layer
  4. What’s not so cool?
     • Doesn’t support transactions
     • Not all SQL queries can be translated into solr queries
     • Generating indices can take a long time
     • Index optimization can take a long time
  5. But..
     • You don’t have to give up your relational data layer
     • Create a non-relational layer on top of your relational data layer
     • Get the best of both worlds
  6. Why did we choose solr?
     • We deal with medical survey data
     • Say:
       – About 300 multiple choice questions
       – Responses can be multi-dimensional
       – 7000+ different answer choices per question
       – 2000+ respondents per survey
       – 15+ surveys and growing
  7. What a survey question looks like
     When were you diagnosed with the following types of Arthritis?

                            Osteoarthritis  Rheumatoid Arthritis  Traumatic Arthritis  Psoriatic Arthritis  Other
     Less than a year ago   ☑               ☐                     ☐                    ☐                    ☐
     More than a year ago   ☐               ☐                     ☑                    ☐                    ☐
  8. Storing a single response
     When were you diagnosed with the following types of Arthritis?

                            Osteoarthritis  Rheumatoid Arthritis  Traumatic Arthritis  Psoriatic Arthritis  Other
     Less than a year ago   1               0                     0                    0                    0
     More than a year ago   0               0                     1                    0                    0
  9. Aggregating over 2000 responses
     When were you diagnosed with the following types of Arthritis?

                            Osteoarthritis  Rheumatoid Arthritis  Traumatic Arthritis  Psoriatic Arthritis  Other
     Less than a year ago   63              155                   19                   27                   268
     More than a year ago   190             46                    8                    213                  325
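The step from slide 8 to slide 9 is a cell-by-cell sum of 0/1 grids. A minimal sketch in plain Python, assuming each response is stored as a list of rows; the label names and the second response are made up for illustration and are not from the deck:

```python
from collections import Counter

# Hypothetical labels for the rows and columns of the example question.
ROWS = ["less_than_a_year_ago", "more_than_a_year_ago"]
COLS = ["osteoarthritis", "rheumatoid", "traumatic", "psoriatic", "other"]

def aggregate(responses):
    """Sum 0/1 grids cell by cell; each response is a list of rows."""
    totals = Counter()
    for grid in responses:
        for row_label, row in zip(ROWS, grid):
            for col_label, cell in zip(COLS, row):
                totals[(row_label, col_label)] += cell
    return totals

responses = [
    [[1, 0, 0, 0, 0], [0, 0, 1, 0, 0]],  # the single response from slide 8
    [[1, 0, 0, 0, 0], [0, 0, 0, 0, 1]],  # a second, made-up response
]
totals = aggregate(responses)
```

With around 2000 grids per survey this loop is what the later slides hand over to solr faceting instead.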
  10. What did we do?
     • Each survey response = solr document
     • Add respondent meta information: age, profession, interests
     • Up to 3000 boolean variables per document indicating chosen answers
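One way to picture "survey response = solr document" is a flat dict with one boolean per answer choice. This is only a sketch: the field names, the `_b` suffix (borrowed from the dynamic boolean field pattern in Solr's example schema) and the helper `response_to_document` are assumptions, not the schema used in the talk.

```python
def response_to_document(response_id, meta, chosen, all_fields):
    """Flatten one survey response into a dict of Solr fields.

    meta       -- respondent meta information (age, profession, ...)
    chosen     -- set of boolean field names the respondent ticked
    all_fields -- every boolean field defined for the survey
    """
    doc = {"id": response_id}
    doc.update(meta)
    for field in all_fields:
        # True when the answer was chosen, False otherwise.
        doc[field] = field in chosen
    return doc

# Hypothetical field names: question 7, row, column, boolean suffix.
all_fields = [
    "q7_lt1y_osteoarthritis_b",
    "q7_lt1y_rheumatoid_b",
    "q7_gt1y_traumatic_b",
]
doc = response_to_document(
    "resp-1",
    {"age": 54, "profession": "teacher"},
    {"q7_lt1y_osteoarthritis_b", "q7_gt1y_traumatic_b"},
    all_fields,
)
```

A dict like this can then be handed to whichever Python solr client is in use for indexing.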
  11. What did we do?
     • Filter by age, interest, profession
     • Facet across boolean field
     • Result: what group of people chose what group of answers
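The filter-then-facet idea maps onto standard Solr request parameters: `fq` narrows the respondents and one `facet.query` per boolean field counts who chose that answer. The sketch below only builds the query string with the standard library; the field names (`age`, `profession`, the `q7_..._b` fields) and the helper `build_facet_query` are hypothetical.

```python
from urllib.parse import urlencode, parse_qs

def build_facet_query(filters, boolean_fields):
    """Build a Solr query string: filter respondents, count chosen answers."""
    params = [("q", "*:*"), ("wt", "json"), ("facet", "true")]
    for flt in filters:
        # Each fq restricts the set of documents before faceting.
        params.append(("fq", flt))
    for field in boolean_fields:
        # Counts the filtered documents where this answer was chosen.
        params.append(("facet.query", f"{field}:true"))
    return urlencode(params)

query_string = build_facet_query(
    ["age:[40 TO 60]", "profession:teacher"],
    ["q7_lt1y_osteoarthritis_b", "q7_gt1y_traumatic_b"],
)
# Round-trip through parse_qs to inspect what would be sent.
parsed = parse_qs(query_string)
```

The resulting string would be appended to the select URL of the solr core holding the survey documents.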
  12. Why solr is awesome..
     • Faceting across boolean field uses very little memory
     • Combining 3000 fields for 2000 documents takes 1 ~ 2 ms
     • Allowed us to reduce API response time from a variable 2 ~ 15 seconds (sucked!) to an almost constant ~50 ms
  13. Good to know..
     • sunburnt: Awesome python solr interface
       github.com/tow/sunburnt
     • Programmatic querying as well as raw queries
     • Supports most advanced solr options
     • If you only require facets, specify rows=0
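The rows=0 tip means solr returns only counts and no documents. A rough sketch of such a request and of reading the facet counts out of a JSON response follows; it uses raw parameters rather than sunburnt, and the field names and the sample response numbers are invented for illustration.

```python
import json
from urllib.parse import urlencode

# rows=0: ask for facet counts only, no matching documents.
params = urlencode([
    ("q", "*:*"),
    ("rows", "0"),
    ("wt", "json"),
    ("facet", "true"),
    ("facet.query", "q7_lt1y_osteoarthritis_b:true"),
    ("facet.query", "q7_gt1y_traumatic_b:true"),
])

# Hand-written sample in the shape of a Solr JSON response (made-up numbers).
sample_response = json.loads("""
{
  "response": {"numFound": 120, "start": 0, "docs": []},
  "facet_counts": {
    "facet_queries": {
      "q7_lt1y_osteoarthritis_b:true": 17,
      "q7_gt1y_traumatic_b:true": 4
    }
  }
}
""")

def facet_counts(response):
    """Map each boolean field name to the number of documents where it is true."""
    queries = response["facet_counts"]["facet_queries"]
    return {key.rsplit(":", 1)[0]: count for key, count in queries.items()}

counts = facet_counts(sample_response)
```

Because `docs` stays empty, the response size depends on the number of facets rather than on the number of matching respondents.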