better searching with elasticsearch - PHPConfPL

Elasticsearch is a distributed, schemaless, document-oriented, Lucene-based search engine with a REST API. This talk looks at what all that actually means in practice, moving from interacting with it directly with cURL to integrating it into PHP applications using Elastica.

Richard Miller

October 26, 2013

Transcript

  1. Given I am a hungry person
    When I search for somewhere to eat
    Then I should see relevant results
    And they should be ordered by relevance
    Sunday, 3 November 2013
  2. Given I am a hungry person
    When I search for somewhere to eat
    Then I should see results for relevant reviews
  3. Given I am a hungry person
    When I search for somewhere to eat
    Then I should see results for similar words
  4. Given I am a hungry person
    When I search for somewhere to eat near me
    Then I should see results close to my location
  5. Given I am a hungry person
    When I typo entering my search terms
    Then I should see “did you mean?” suggestions
  6. Given I am a hungry person
    When I start entering my search terms
    Then I should see suggestions straight away
  7. Given I am a hungry person
    When I start searching for somewhere to eat
    Then I should be able to filter my results
  8. Given I am a hungry person
    When I view details of somewhere to eat
    Then I should see suggestions of similar eateries
  9. Given I am a restaurateur
    When I upload my PDF menu
    Then hungry people should be able to search it
  10. Given I am a hungry person
    When I search for somewhere to eat
    Then I should see relevant results
    And they should be ordered by relevance
  11. Given I am a hungry person
    When I search for somewhere to eat
    Then I should see relevant results
    And they should be ordered by relevance
  12. Given I am a hungry person
    When I search for somewhere to eat
    Then I should see relevant results
    And they should be ordered by relevance
  13. Given I am a hungry person
    When I search for somewhere to eat
    Then I should see results for relevant reviews
  14. Given I am a hungry person
    When I search for somewhere to eat
    Then I should see results for similar words
  15. Given I am a hungry person
    When I search for somewhere to eat near me
    Then I should see results close to my location
  16. Response:
    { "ok" : true, "status" : 200, "name" : "Mr. Wu",
      "version" : { "number" : "0.90.5", "build_hash" : "c8714e8...dedee", "build_timestamp" : "2013-09-17T12:50:20Z", "build_snapshot" : false, "lucene_version" : "4.4" },
      "tagline" : "You Know, for Search" }
  17. [Diagram: shards 0 to 4 of an index spread across Instance 1, Instance 2 and Instance 3]
  18. [Diagram: two sets of shards 0 to 4 spread across Instance 1, Instance 2 and Instance 3]
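
The diagrams on slides 17 and 18 show an index split into numbered shards. As a rough illustration of how a document ends up on one of them, the sketch below hashes the document id and takes it modulo the number of primary shards. The `shard_for` helper and the use of `zlib.crc32` are assumptions made for this example; Elasticsearch uses its own hash function internally.

```python
import zlib


def shard_for(doc_id: str, number_of_shards: int) -> int:
    """Pick a primary shard for a document id (illustrative only)."""
    # crc32 is a stand-in hash chosen for this sketch, not the real one.
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_shards


# Ids taken from the responses on slides 19 and 20; five shards as in the diagram.
for doc_id in ["1", "cmElnobYSy6386TOUVbGZQ"]:
    print(doc_id, "-> shard", shard_for(doc_id, 5))
```

Because the shard is derived from the id and the shard count, the number of primary shards cannot be changed after the index is created without reindexing.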
  19. { "ok" : true, "_index" : "eatly", "_type" : "eateries", "_id" : "cmElnobYSy6386TOUVbGZQ", "_version" : 1 }
  20. { "ok" : true, "_index" : "eatly", "_type" : "eateries", "_id" : "1", "_version" : 1 }
  21. { "_index" : "eatly", "_type" : "eateries", "_id" : "1", "_version" : 1, "exists" : true,
      "_source" : { "name" : "Jeff's Burgers", "desc" : "Blah Blah dirty burgers..." } }
  22. { "ok" : true, "_index" : "eatly", "_type" : "eateries", "_id" : "1", "_version" : 2 }
  23. { "_index" : "eatly", "_type" : "eateries", "_id" : "1", "_version" : 2, "exists" : true,
      "_source" : { "name" : "Jeff's Burger Joint" } }
  24. { "_index" : "eatly", "_type" : "eateries", "_id" : "1", "_version" : 2, "exists" : false }
  25. { "took" : 7, "timed_out" : false,
      "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 },
      "hits" : { "total" : 1, "max_score" : 0.15342641,
        "hits" : [ { "_index" : "eatly", "_type" : "eateries", "_id" : "1", "_score" : 0.15342641, "_source" : {"name" : "Jeff's Burger Joint"} } ] } }
  26. curl -XPUT 'http://localhost:9200/eatly/' -d '{
      "settings" : { "index" : { "number_of_shards" : 4, "number_of_replicas" : 2 } }
    }'
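
Slides 19 to 24 show only responses; the requests that produced them were not captured in the transcript. The sketch below reconstructs a plausible sequence from the response fields (auto-generated id, `_version` going from 1 to 2, then `"exists" : false`). The request bodies, the `build` helper and the assumption that a DELETE preceded slide 24 are guesses, and nothing is sent over the network.

```python
import json
import urllib.request

# Index and type names as they appear on the slides.
BASE = "http://localhost:9200/eatly/eateries"


def build(method, path="", body=None):
    """Build (but do not send) an HTTP request against the eateries type."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(BASE + path, data=data, method=method)


burgers = {"name": "Jeff's Burgers", "desc": "Blah Blah dirty burgers..."}

steps = [
    build("POST", "/", burgers),                         # slide 19: id generated by the server
    build("PUT", "/1", burgers),                         # slide 20: explicit id, _version 1
    build("GET", "/1"),                                  # slide 21
    build("PUT", "/1", {"name": "Jeff's Burger Joint"}),  # slide 22: whole document replaced, _version 2
    build("GET", "/1"),                                  # slide 23
    build("DELETE", "/1"),                               # assumed step before slide 24
    build("GET", "/1"),                                  # slide 24: "exists" : false
]

for request in steps:
    print(request.get_method(), request.full_url)
```

Note that the second PUT replaces the stored document rather than merging into it, which is why `desc` is missing from the `_source` on slide 23.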
  27. Given I am a hungry person
    When I search for somewhere to eat
    Then I should see relevant results
    And they should be ordered by relevance
  28. curl -XPUT '.../eatly/eateries/_mapping' -d '{
      "eateries" : { "properties" : {
        "name" : {"type" : "string", "boost" : "1.5"},
        "desc" : {"type" : "string"}
      } }
    }'
  29. Given I am a hungry person
    When I search for somewhere to eat
    Then I should see results for relevant reviews
  30. curl -XPUT '.../eatly/eateries/_mapping' -d '{
      "eateries" : { "properties" : {
        "name" : {"type" : "string", "boost" : "1.5"},
        "desc" : {"type" : "string"},
        "reviews" : { "properties" : {
          "review" : {"type" : "string"},
          "reviewer" : {"type" : "string"}
        } }
      } }
    }'
  31. curl -XPOST 'http://localhost:9200/eatly/eateries/' -d '{
      "name" : "Jeff'\''s Burgers",
      "desc" : "Blah Blah dirty burgers...",
      "reviews" : [
        { "review" : "Yadda, yadda, yadda", "reviewer" : "John Smith" },
        { "review" : "na na na", "reviewer" : "Billy Badger" }
      ]
    }'
  32. Given I am a hungry person
    When I search for somewhere to eat
    Then I should see results for similar words
  33. standard tokenizer (highlighted: burgers)
    Input: I really enjoyed Jeff's Burger Joint, the burgers were excellent! can't fault it
    Tokens: I | really | enjoyed | Jeff's | Burger | Joint | the | burgers | were | excellent | can't | fault | it
  34. whitespace tokenizer (highlighted: excellent!, burgers)
    Input: I really enjoyed Jeff's Burger Joint, the burgers were excellent! can't fault it
    Tokens: I | really | enjoyed | Jeff's | Burger | Joint, | the | burgers | were | excellent! | can't | fault | it
  35. letter tokenizer (highlighted: burgers)
    Input: I really enjoyed Jeff's Burger Joint, the burgers were excellent! can't fault it
    Tokens: I | really | enjoyed | Jeff | s | Burger | Joint | the | burgers | were | excellent | can | t | fault | it
  36. standard tokenizer + lowercase filter (highlighted: burgers)
    Input: I really enjoyed Jeff's Burger Joint, the burgers were excellent! can't fault it
    Tokens: i | really | enjoyed | jeff's | burger | joint | the | burgers | were | excellent | can't | fault | it
  37. standard tokenizer + lowercase filter + stop filter (highlighted: burgers)
    Input: I really enjoyed Jeff's Burger Joint, the burgers were excellent! can't fault it
    Tokens: i | really | enjoyed | jeff's | burger | joint | burgers | were | excellent | can't | fault
  38. standard tokenizer + lowercase filter + stop filter + stemmer filter (highlighted: burger)
    Input: I really enjoyed Jeff's Burger Joint, the burgers were excellent! can't fault it
    Tokens: i | realli | enjoi | jeff | burger | joint | burger | were | excel | can't | fault
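
Slides 33 to 38 build up an analysis chain one step at a time. The sketch below mimics that chain in plain Python so the effect of each stage can be inspected. The regular expressions only approximate the real Lucene tokenizers, the stop-word set is a small illustrative subset, and the stemmer step is left out because it needs a Snowball implementation; all function names are made up for this example.

```python
import re

TEXT = "I really enjoyed Jeff's Burger Joint, the burgers were excellent! can't fault it"

# Small illustrative subset, not the full default English stop-word list.
STOP_WORDS = {"a", "an", "and", "is", "it", "of", "the", "to"}


def whitespace_tokenizer(text):
    """Split on whitespace only, so punctuation stays attached to tokens."""
    return text.split()


def letter_tokenizer(text):
    """Split on anything that is not an ASCII letter."""
    return re.findall(r"[A-Za-z]+", text)


def standard_tokenizer(text):
    """Rough approximation: drop punctuation but keep apostrophes inside words."""
    return re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z0-9]+)*", text)


def lowercase_filter(tokens):
    return [token.lower() for token in tokens]


def stop_filter(tokens):
    return [token for token in tokens if token not in STOP_WORDS]


def analyze(text):
    """Standard tokenizer, then lowercase filter, then stop filter (slide 37)."""
    return stop_filter(lowercase_filter(standard_tokenizer(text)))


print(whitespace_tokenizer(TEXT))
print(letter_tokenizer(TEXT))
print(analyze(TEXT))
```

The same chain is applied to the query text at search time, which is why a search for "burgers" can match a document whose stored token is "burger" once a stemmer is added.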
  39. curl -XPUT 'http://localhost:9200/eatly/' -d '{
      "settings" : { "index" : { "analysis" : {
        "analyzer" : { "snowball_analyzer" : { "type" : "snowball" } }
      } } }
    }'
  40. curl -XPUT '.../eatly/eateries/_mapping' -d '{
      "eateries" : { "properties" : {
        "name" : {"type" : "string", "boost" : "1.5"},
        "desc" : { "type" : "string", "analyzer": "snowball_analyzer" }
      } }
    }'
  41. curl -XPUT 'http://localhost:9200/eatly/' -d '{
      "settings" : { "index" : { "analysis" : {
        "analyzer" : {
          "custom_analyzer" : { "type" : "custom", "tokenizer" : "lowercase", "filter" : ["stop", "extra_stop"] }
        },
        "filter" : {
          "extra_stop" : { "type" : "stop", "stopwords" : ["food"] }
        }
      } } }
    }'
  42. curl -XPUT '.../eatly/eateries/_mapping' -d '{
      "eateries" : { "properties" : {
        "name" : {"type" : "string", "boost" : "1.5"},
        "desc" : { "type" : "string", "analyzer": "custom_analyzer" }
      } }
    }'
  43. curl -XGET '.../eatly/eateries/_search' -d '{
      "query" : { "match" : { "name" : "burger" } }
    }'
  44. curl -XGET '.../eatly/eateries/_search' -d '{
      "query" : { "match" : { "name" : "burger bar" } }
    }'
  45. curl -XGET '.../eatly/eateries/_search' -d '{
      "query" : { "match" : { "name" : { "query" : "burger bar", "operator" : "and" } } }
    }'
  46. curl -XGET '.../eatly/eateries/_search' -d '{
      "query" : { "multi_match" : { "query" : "burger bar", "fields" : ["name","desc","reviews.review"] } }
    }'
  47. curl -XGET '.../eatly/eateries/_search' -d '{
      "query" : { "multi_match" : { "query" : "burger bar", "fields" : ["name^2","desc","reviews.review"] } }
    }'
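
The difference between slides 44 and 45 is the `operator`: by default a `match` query on "burger bar" accepts a document containing either term, whereas `"operator" : "and"` requires both. The toy matcher below illustrates just that distinction, assuming a naive lowercase-and-split analysis; the `matches` function is hypothetical and ignores scoring entirely.

```python
def tokens(text):
    """Naive analysis for this sketch: lowercase and split on whitespace."""
    return text.lower().split()


def matches(field_value, query, operator="or"):
    """Return True if the field satisfies the query under the given operator."""
    field_tokens = set(tokens(field_value))
    query_tokens = tokens(query)
    if operator == "and":
        return all(token in field_tokens for token in query_tokens)
    return any(token in field_tokens for token in query_tokens)


name = "Jeff's Burger Joint"
print(matches(name, "burger bar"))                  # either term is enough
print(matches(name, "burger bar", operator="and"))  # both terms are required
```

With the default operator the eatery is found because its name contains "burger"; with `and` it is not, because "bar" is missing.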
  48. Given I am a hungry person
    When I start searching for somewhere to eat
    Then I should be able to filter my results
  49. curl -XPUT 'http://localhost:9200/eatly/eateries/1' -d '{
      "name" : "Jeff'\''s Burger Joint",
      "desc" : "Blah Blah dirty burgers...",
      "tags" : ["greasy", "retro"]
    }'
  50. curl -XGET '.../eatly/eateries/_search' -d '{
      "query" : { "multi_match" : { "query" : "burger bar", "fields" : ["name^2", "desc", "reviews.review"] } },
      "filter" : { "not" : { "terms" : { "tags" : ["greasy", "meaty"]} } }
    }'
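
The filter on slide 50 wraps a `terms` filter in `not`, so any eatery carrying at least one of the listed tags is dropped from the results. The sketch below reproduces that logic over a plain list of dictionaries. The `not_terms` helper and the second eatery are made up for illustration; only the first document comes from slide 49.

```python
def not_terms(docs, field, excluded):
    """Keep documents whose field shares no value with the excluded terms."""
    excluded = set(excluded)
    return [doc for doc in docs if not excluded.intersection(doc.get(field, []))]


eateries = [
    {"name": "Jeff's Burger Joint", "tags": ["greasy", "retro"]},  # from slide 49
    {"name": "Hypothetical Salad Bar", "tags": ["fresh"]},         # invented example
]

kept = not_terms(eateries, "tags", ["greasy", "meaty"])
print([doc["name"] for doc in kept])
```

Unlike the query part of the request, a filter only includes or excludes documents; it does not contribute to the relevance score.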
  51. Given I am a hungry person
    When I search for somewhere to eat near me
    Then I should see results close to my location
  52. curl -XPUT '.../eatly/eateries/_mapping' -d '{
      "eateries" : { "properties" : {
        "name" : {"type" : "string", "boost" : "1.5"},
        "desc" : {"type" : "string"},
        "reviews" : { "properties" : {
          "review" : {"type" : "string"},
          "reviewer" : {"type" : "string"}
        } },
        "location" : { "type" : "geo_point" }
      } }
    }'
  53. curl -XPOST 'http://localhost:9200/eatly/eateries/' -d '{
      "name" : "Jeff'\''s Burgers",
      "desc" : "Blah Blah dirty burgers...",
      "reviews" : [
        { "review" : "Yadda, yadda, yadda", "reviewer" : "John Smith" },
        { "review" : "na na na", "reviewer" : "Billy Badger" }
      ],
      "location" : { "lat" : 41.12, "lon" : -71.34 }
    }'
  54. curl -XGET '.../eatly/eateries/_search' -d '{
      "sort" : [ { "_geo_distance" : { "location" : [-40, 70], "order" : "asc", "unit" : "miles" } } ],
      "query" : { "multi_match" : { "query" : "burger bar", "fields" : ["name^2","desc","reviews.review"] } }
    }'
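
Slide 54 sorts hits by their distance from a point. Note that when a geo point is written as an array, as in `[-40, 70]`, the order is longitude then latitude. The sketch below shows the idea of the sort using the haversine formula as a stand-in for whatever distance calculation Elasticsearch applies; the function names, the Earth radius constant and the second eatery are assumptions for this example.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, assumed for this sketch


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def sort_by_distance(docs, lat, lon):
    """Order documents by ascending distance from (lat, lon)."""
    return sorted(
        docs,
        key=lambda doc: haversine_miles(lat, lon, doc["location"]["lat"], doc["location"]["lon"]),
    )


eateries = [
    {"name": "Hypothetical Far Away Diner", "location": {"lat": 51.5, "lon": -0.12}},  # invented
    {"name": "Jeff's Burgers", "location": {"lat": 41.12, "lon": -71.34}},             # slide 53
]

for doc in sort_by_distance(eateries, 41.0, -71.0):
    print(doc["name"])
```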
  55. $type->addDocument(
        new \Elastica\Document(
            $id,
            array("id" => $id, "name" => "Jeff's Burger Joint", "desc" => "Blah Blah...")
        )
    );
  56. $multiMatchQuery = new \Elastica\Query\MultiMatch();
    $multiMatchQuery
        ->setQuery("burgers bar")
        ->setFields(array("name^2", "desc", "reviews.review"));
    $query = new \Elastica\Query($multiMatchQuery);
    $query->setFilter(
        new \Elastica\Filter\BoolNot(
            new \Elastica\Filter\Terms("tags", array("greasy", "meaty"))
        )
    );
    $results = $type->search($query);
  57. $mapping = new \Elastica\Type\Mapping();
    $mapping->setType($type);
    $mapping->setProperties(
        array(
            "name" => array("type" => "string"),
            "desc" => array("type" => "string", "boost" => 1.5)
        )
    );
    $mapping->send();
  58. Given I am a hungry person
    When I search for somewhere to eat
    Then I should see relevant results
    And they should be ordered by relevance ✔
  59. Given I am a hungry person
    When I search for somewhere to eat
    Then I should see results for relevant reviews ✔
  60. Given I am a hungry person
    When I search for somewhere to eat
    Then I should see results for similar words ✔
  61. Given I am a hungry person
    When I search for somewhere to eat near me
    Then I should see results close to my location ✔
  62. Given I am a hungry person
    When I start searching for somewhere to eat
    Then I should be able to filter my results ?
  63. $facets = $resultSet->getFacets();
    foreach ($facets["tags"]["terms"] as $facet) {
        printf("%s: %s\n", $facet["term"], $facet["count"]);
    }
    Output:
    thai: 4
    italian: 8
    greek: 4
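
The loop on slide 63 prints a term and a count for each entry of a terms facet on `tags`. Conceptually the facet is a tally of field values across the matching documents, which the sketch below reproduces with `collections.Counter`. The sample documents are invented, and the `terms_facet` helper merely imitates the `term`/`count` shape read by the PHP code.

```python
from collections import Counter


def terms_facet(docs, field):
    """Count how many times each term occurs in `field`, most frequent first."""
    counts = Counter(term for doc in docs for term in doc.get(field, []))
    return [{"term": term, "count": count} for term, count in counts.most_common()]


# Invented sample documents, only here to have something to count.
eateries = [
    {"name": "A", "tags": ["italian"]},
    {"name": "B", "tags": ["thai"]},
    {"name": "C", "tags": ["italian", "retro"]},
]

for facet in terms_facet(eateries, "tags"):
    print("%s: %s" % (facet["term"], facet["count"]))
```

Counts like these are what let a search page offer "filter by cuisine" links with the number of results behind each one.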
  64. Given I am a hungry person
    When I view details of somewhere to eat
    Then I should see suggestions of similar eateries
  65. Given I am a restaurateur
    When I upload my PDF menu
    Then hungry people should be able to search it
  66. Given I am a hungry person
    When I typo entering my search terms
    Then I should see “did you mean?” suggestions
  67. $query = new \Elastica\Query(
        array(
            "query" => array(...),
            "suggest" => array(
                "check1" => array(
                    "text" => "berger",
                    "term" => array("field" => "name")
                )
            )
        )
    );
  68. Array (
      [check1] => Array (
        [0] => Array (
          [text] => berger [offset] => 0 [length] => 6
          [options] => Array (
            [0] => Array ( [text] => burger [score] => 0.8333333 [freq] => 1 )
          )
        )
      )
    )
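
The term suggester on slides 67 and 68 proposes "burger" for the misspelt "berger". Suggestions of this kind are based on edit distance between the input and indexed terms. The sketch below computes a Levenshtein distance and turns it into a similarity; for this pair it gives 5/6, which happens to agree with the score on slide 68, although the exact formula used by Lucene is an assumption here and both function names are made up.

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions or substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution (free if equal)
            ))
        previous = current
    return previous[-1]


def similarity(a, b):
    """1.0 for identical strings, lower as more edits are needed."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


print(levenshtein("berger", "burger"))
print(similarity("berger", "burger"))
```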
  69. Given I am a hungry person
    When I start entering my search terms
    Then I should see suggestions straight away
  70. $mapping->setProperties(
        array(
            "name" => array("type" => "string"),
            "desc" => array("type" => "string", "boost" => 1.5),
            "name_suggest" => array("type" => "completion")
        )
    );
  71. new \Elastica\Document(
        $id,
        array(
            "id" => $id,
            "name" => "Jeff's Burger Joint",
            "name_suggest" => "Jeff's Burger Joint",
            "desc" => "Blah Blah..."
        )
    );
  72. $query = array(
        "eateries" => array(
            "text" => 'j',
            "completion" => array("field" => "name_suggest")
        )
    );
  73. Array (
      [0] => Array (
        [text] => j [offset] => 0 [length] => 1
        [options] => Array (
          [0] => Array ( [text] => Jeffs Burger Joint [score] => 1 )
        )
      )
    )
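
Slides 70 to 73 use a `completion` field so that typing "j" immediately returns a matching name. The core idea is a prefix lookup over pre-indexed inputs. The sketch below does this with a sorted list and `bisect`, which is far simpler than the in-memory structure Elasticsearch builds for completion fields. The `PrefixSuggester` class and every name except "Jeff's Burger Joint" are invented for this example.

```python
import bisect


class PrefixSuggester:
    """Case-insensitive prefix lookup over a fixed list of names."""

    def __init__(self, names):
        # Sort on a lowercased key so that "j" finds "Jeff's ...".
        self._entries = sorted((name.lower(), name) for name in names)
        self._keys = [key for key, _ in self._entries]

    def suggest(self, prefix, size=5):
        prefix = prefix.lower()
        # First position whose key is >= prefix; matches are contiguous from here.
        start = bisect.bisect_left(self._keys, prefix)
        results = []
        for key, name in self._entries[start:]:
            if not key.startswith(prefix):
                break
            results.append(name)
            if len(results) == size:
                break
        return results


suggester = PrefixSuggester([
    "Jeff's Burger Joint",       # from slide 71
    "Hypothetical Pizza Place",  # invented
    "Joe's Hypothetical Cafe",   # invented
])
print(suggester.suggest("j"))
print(suggester.suggest("je"))
```

Because the lookup only walks entries that share the prefix, it stays fast enough to run on every keystroke.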