Slide 1

Slide 1 text

Elasticsearch for PHP Developers Shaun Farrell June 29, 2012 Saturday, June 30, 12

Slide 2

Slide 2 text

What Is Elasticsearch?
• Storage Engine
• Schema Free
• Document Oriented
• Built on top of Lucene
• Open Source
• RESTful (JSON over HTTP)
• Multi-tenancy

Slide 3

Slide 3 text

Simple, Easy, and Fast!

Slide 4

Slide 4 text

What We Will Cover
• Indexes and Types
• Mappings
• Search
• Elastica - PHP Library
• Examples
• Resources

Slide 5

Slide 5 text

Getting Started

Slide 6

Slide 6 text

Quick & Easy Installation
• Download Elasticsearch
  • http://elasticsearch.org/
• Extract
• Run
  • Service, Background, Foreground

Slide 7

Slide 7 text

Indexes, Types & Data

Slide 8

Slide 8 text

Indexes & Types
• Index: Group of Items (Types)
• Types: Relevant Data in a Group
• Amazon: Books, Movies, Clothes, etc.
• Airlines: American, Delta, KLM, etc.
• Each Type & Index can have different data elements

Slide 9

Slide 9 text

URL Structure
http://localhost:9200/dfw/beer/

Slide 10

Slide 10 text

URL Structure
http://localhost:9200/dfw/beer/
• http://localhost:9200 is the Elasticsearch location

Slide 11

Slide 11 text

URL Structure
http://localhost:9200/dfw/beer/
• http://localhost:9200 is the Elasticsearch location
• dfw is the index

Slide 12

Slide 12 text

URL Structure
http://localhost:9200/dfw/beer/
• http://localhost:9200 is the Elasticsearch location
• dfw is the index
• beer is the type
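
The three URL parts can be sketched as a small shell helper. This is only an illustration: the function name `es_url` and the `ES_HOST` variable are assumptions made for this example, not part of Elasticsearch or of the talk. The helper prints a URL and makes no request.

```shell
# Elasticsearch location (host and port as used on the slides)
ES_HOST="http://localhost:9200"

# Hypothetical helper: build a URL from an index and a type.
# $1 = index, $2 = type
es_url() {
    printf '%s/%s/%s/\n' "$ES_HOST" "$1" "$2"
}

es_url dfw beer
es_url dfw brewery
```

Keeping the host in one variable means the same helper works unchanged when Elasticsearch runs somewhere other than localhost.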

Slide 13

Slide 13 text

Create & Delete Indexes
curl -XPOST 'http://localhost:9200/dfw/'
curl -XDELETE 'http://localhost:9200/dfw/'

Slide 14

Slide 14 text

Create Type & Add Data

Slide 15

Slide 15 text

Create Type & Add Data
curl -XPOST 'http://localhost:9200/dfw/beer/1' -d '
{
  "name": "Deep Ellum IPA"
}
'

Slide 16

Slide 16 text

Create Type & Add Data
curl -XPOST 'http://localhost:9200/dfw/beer/1' -d '
{
  "name": "Deep Ellum IPA"
}
'
curl -XPOST 'http://localhost:9200/dfw/beer/2' -d '
{
  "name": "Double Brown Stout"
}
'

Slide 17

Slide 17 text

curl -XPOST 'http://localhost:9200/dfw/brewery/1' -d '
{
  "name": "Deep Ellum Brewing Company",
  "beers": [
    "Deep Ellum IPA",
    "Double Brown Stout"
  ]
}
'

Slide 18

Slide 18 text

PUT vs POST
• You define the ID: POST or PUT
• Elasticsearch defines the ID: POST
• Using PUT with no ID returns an error

Slide 19

Slide 19 text

PUT vs POST
POST
{"ok":true,"_index":"dfw","_type":"brewery","_id":"Iw9kfa3vSx2FyFen-uK26Q","_version":1}
POST OR PUT
{"ok":true,"_index":"dfw","_type":"brewery","_id":"1","_version":1}
PUT
No handler found for uri [/dfw/brewery/] and method [PUT]
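
The PUT vs POST rule can be written down as a tiny decision helper. This is a sketch under the rules stated on the slides; the function name `index_method` is hypothetical and nothing here talks to Elasticsearch.

```shell
# Hypothetical helper: choose the HTTP method for indexing a document.
# $1 = document ID, or an empty string to let Elasticsearch generate one
index_method() {
    if [ -n "$1" ]; then
        # The caller supplies the ID; per the slides POST would work here too
        echo "PUT"
    else
        # No ID: PUT has no handler for this URL, so POST is the only option
        echo "POST"
    fi
}

index_method 1
index_method ""
```

Choosing PUT whenever an ID is known is just one convention; the slides state that POST is equally valid in that case.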

Slide 20

Slide 20 text

Update Data
curl -XPUT 'http://localhost:9200/dfw/beer/1' -d '
{
  "name": "Deep Ellum IPA",
  "style": "American-Style India Pale Ale"
}
'
{"ok":true,"_index":"dfw","_type":"beer","_id":"1","_version":2}

Slide 21

Slide 21 text

Get Data
curl -XGET 'http://localhost:9200/dfw/brewery/1'

Slide 22

Slide 22 text

{
  "_index": "dfw",
  "_type": "brewery",
  "_id": "1",
  "_version": 1,
  "exists": true,
  "_source": {
    "name": "Deep Ellum Brewing Company",
    "beers": [
      "Deep Ellum IPA",
      "Double Brown Stout"
    ]
  }
}

Slide 23

Slide 23 text

Delete Data
curl -XDELETE 'http://localhost:9200/dfw/beer/1'

Slide 24

Slide 24 text

Elasticsearch APIs

Slide 25

Slide 25 text

Elasticsearch APIs
• Allow you to perform operations
• Search, Add Mappings, Status, Refresh & Optimization
• Each API is just another endpoint
• Check out elasticsearch.org for many more

Slide 26

Slide 26 text

URL Structure
http://localhost:9200/dfw/beer/
• http://localhost:9200 is the Elasticsearch location
• dfw is the index
• beer is the type

Slide 27

Slide 27 text

URL Structure
http://localhost:9200/dfw/beer/_{API}
• http://localhost:9200 is the Elasticsearch location
• dfw is the index
• beer is the type
• _{API} is the API method

Slide 28

Slide 28 text

Status
• Displays comprehensive status information for one or more indices
• Can be done at all levels (es, index, type)
• Endpoint: _status

Slide 29

Slide 29 text

Refresh & Optimize
• Refresh: refreshes data for near real-time search
  • All Levels
  • Endpoint: _refresh
• Optimize: optimizes Lucene segments for faster searching
  • All Levels
  • Endpoint: _optimize
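
As a rough illustration of "it's just another endpoint", the sketch below appends an API name to either the server URL or an index URL. The helper name `es_api_url` and the `ES_HOST` variable are assumptions for this example; only the server-wide and per-index forms are shown, and no request is sent.

```shell
ES_HOST="http://localhost:9200"

# Hypothetical helper: build the URL for an API endpoint.
# $1 = endpoint name (e.g. _status), $2 = index (optional; empty means all indexes)
es_api_url() {
    if [ -n "$2" ]; then
        printf '%s/%s/%s\n' "$ES_HOST" "$2" "$1"
    else
        printf '%s/%s\n' "$ES_HOST" "$1"
    fi
}

es_api_url _status          # status of every index
es_api_url _refresh dfw     # refresh only the dfw index
es_api_url _optimize dfw    # optimize only the dfw index
```

The printed URLs can be passed to curl, for example `curl -XGET "$(es_api_url _status dfw)"`, once an Elasticsearch server is running.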

Slide 30

Slide 30 text

Mappings

Slide 31

Slide 31 text

Mapping
• Defines how a document is mapped to the search engine
• You don't have to define a mapping; it is created dynamically. But you can.
• You can define mappings at the Index and Type level
• Endpoint: _mapping

Slide 32

Slide 32 text

What You Can Define
• Which fields are searchable
• Field data types
• How fields are stored
• How fields are tokenized (index, analyzed)
• etc.

Slide 33

Slide 33 text

Mapping Types
• Core Data Types
  • string, integer/long, float/double, boolean, and null
• Arrays
• IP Addresses
• Geo Point
• Attachment

Slide 34

Slide 34 text

curl -XPOST 'http://localhost:9200/dfw/brewery/_mapping' -d '
{
  "brewery" : {
    "properties" : {
      "name" : {
        "type" : "string",
        "store" : "yes",
        "index" : "not_analyzed"
      },
      "established" : {
        "type" : "date",
        "format" : "YYYY"
      }
    }
  }
}
'

Slide 35

Slide 35 text

Dynamic Mapping
{
  "brewery" : {
    "properties" : {
      "name" : {
        "type" : "string"
      },
      "established" : {
        "type" : "string"
      }
    }
  }
}

Slide 36

Slide 36 text

Delete Mapping
curl -XDELETE 'http://localhost:9200/dfw/brewery/_mapping'
Important: deleting a mapping (type) also deletes the data stored under that type.

Slide 37

Slide 37 text

Search

Slide 38

Slide 38 text

Two Types of Search
• URI Request
  • Limited Searching
• Request Body
  • Full functionality
  • JSON requests

Slide 39

Slide 39 text

URI Request
• Performed through a web request or curl request
• Simple & Limited
• No Filter, Facet, etc.
http://localhost:9200/dfw/brewery/_search?q=name:Deep Ellum Brewing Company
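
A URI search passes the query in the `q` parameter, and spaces in the query need to be encoded before the URL is used. The sketch below only builds the URL; `uri_search_url` and `ES_HOST` are names invented for this example, and the encoding step handles spaces only, not every reserved character.

```shell
ES_HOST="http://localhost:9200"

# Hypothetical helper: build a URI search URL.
# $1 = index, $2 = type, $3 = query in field:value form
uri_search_url() {
    # Encode spaces only; a real client should percent-encode all reserved characters
    encoded=$(printf '%s' "$3" | sed 's/ /%20/g')
    printf '%s/%s/%s/_search?q=%s\n' "$ES_HOST" "$1" "$2" "$encoded"
}

uri_search_url dfw brewery "name:Deep Ellum Brewing Company"
```

When curl is available, `curl -G --data-urlencode 'q=...'` is an alternative that performs the full encoding itself.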

Slide 40

Slide 40 text

Request Body
• Uses the Query DSL
• Allows for Filters, Facets, Boosting, More Like This, Fuzzy, etc.
• Is a JSON Request
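
For comparison with the URI request, here is a minimal sketch of a request body using a `query_string` query from the Query DSL. The helper name `query_body` is hypothetical, and the values are inserted without JSON escaping, so it is only suitable for simple text such as the sample data in this talk.

```shell
# Hypothetical helper: print a Query DSL body for a query_string search.
# $1 = field to search, $2 = text to search for (no JSON escaping is applied)
query_body() {
    printf '{ "query": { "query_string": { "default_field": "%s", "query": "%s" } } }\n' "$1" "$2"
}

query_body name "Deep Ellum"

# Sending it requires a running Elasticsearch, so the call is left commented out:
# curl -XGET 'http://localhost:9200/dfw/brewery/_search' -d "$(query_body name 'Deep Ellum')"
```

Filters, facets, and the other features listed above are added as further keys in the same JSON document.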

Slide 41

Slide 41 text

Searching Across Indexes & Types
http://localhost:9200/lse,rdu,dfw/_search...
http://localhost:9200/rdu/brewery,beer/_search...
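
The two URLs above follow one pattern: comma-separated index names, optionally followed by comma-separated type names, then `_search`. A small sketch of that pattern follows; `search_url` and `ES_HOST` are names assumed for this example, and no request is made.

```shell
ES_HOST="http://localhost:9200"

# Hypothetical helper: build a search URL across several indexes and types.
# $1 = comma-separated indexes, $2 = comma-separated types (optional)
search_url() {
    if [ -n "$2" ]; then
        printf '%s/%s/%s/_search\n' "$ES_HOST" "$1" "$2"
    else
        printf '%s/%s/_search\n' "$ES_HOST" "$1"
    fi
}

search_url lse,rdu,dfw
search_url rdu brewery,beer
```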

Slide 42

Slide 42 text

Faceted Searching
• Facets are “Logical Groupings” that allow easier search navigation
• Drill-down searching
• Think Amazon or NewEgg.com
• Types of Facets in Elasticsearch
  • Terms, Range, Histogram, Date Histogram, Statistical, & Geo
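
As one concrete case, a terms facet is requested by adding a `facets` section next to the query in the request body. The sketch below prints such a body; the helper name `terms_facet_body` is hypothetical, and the facet name `styles` and field `style` are picked for illustration from the sample beer documents.

```shell
# Hypothetical helper: print a request body with a match_all query and one terms facet.
# $1 = name to give the facet in the response, $2 = field to group on
terms_facet_body() {
    printf '{ "query": { "match_all": {} }, "facets": { "%s": { "terms": { "field": "%s" } } } }\n' "$1" "$2"
}

terms_facet_body styles style

# Sending it requires a running Elasticsearch, so the call is left commented out:
# curl -XGET 'http://localhost:9200/dfw/beer/_search' -d "$(terms_facet_body styles style)"
```

The response would list each distinct term of the field with a document count, which is what drives drill-down navigation.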

Slide 43

Slide 43 text

Faceted Searching

Slide 44

Slide 44 text

Faceted Searching
This is Faceted Searching

Slide 45

Slide 45 text

Geo Capabilities

Slide 46

Slide 46 text

Geo Bounding Box

Slide 47

Slide 47 text

Geo Bounding Box

Slide 48

Slide 48 text

Geo Bounding Box

Slide 49

Slide 49 text

Geo Bounding Box

Slide 50

Slide 50 text

Geo Distance

Slide 51

Slide 51 text

Geo Distance

Slide 52

Slide 52 text

Geo Distance

Slide 53

Slide 53 text

Geo Distance Range
Can Also Be a Facet

Slide 54

Slide 54 text

Geo Distance Range
Can Also Be a Facet

Slide 55

Slide 55 text

Geo Distance Range
Can Also Be a Facet

Slide 56

Slide 56 text

Geo Distance Range
Can Also Be a Facet

Slide 57

Slide 57 text

Geo Distance Range
Can Also Be a Facet

Slide 58

Slide 58 text

Geo Polygon

Slide 59

Slide 59 text

Geo Polygon

Slide 60

Slide 60 text

Geo Polygon

Slide 61

Slide 61 text

Other Features
• Highlighting
• TTL
• Routing - Tell Elasticsearch where to look (node/shard, etc.)
• Scripting
• Scrolling - Pagination of results
• Plugins - Rivers & Attachments

Slide 62

Slide 62 text

Elastica

Slide 63

Slide 63 text

Elastica
• PHP Library
• Open Source Project
• GitHub - https://github.com/ruflin/Elastica
• Follows ZF Standards
• Alternative to cURL
• Not a lot of documentation but there are TESTS!

Slide 64

Slide 64 text

Elastica
• Everything is an Object
• Inject objects to create Queries
• Under the hood - Array based, converted to JSON
• Debug
  • echo json_encode($query->toArray())

Slide 65

Slide 65 text

Examples
• Elasticsearch Query DSL vs. Elastica
• Preloaded Index (dfw) with two types (brewery, beer)
• Uses BreweryDB data through the API
• Examples available on GitHub

Slide 66

Slide 66 text

Examples

Slide 67

Slide 67 text

Resources

Slide 68

Slide 68 text

Resources
• Elastica - GitHub (http://ruflin.github.com/Elastica/)
• Elasticsearch - http://www.elasticsearch.org/
• Elasticsearch GitHub - https://github.com/elasticsearch/elasticsearch
• Google Groups (ES and Elastica)
• Slideshare
• Slides: http://farrelley.github.com/ElasticSearch-For-PHP/
• Examples - https://github.com/farrelley/ElasticSearch-For-PHP

Slide 69

Slide 69 text

Questions?

Slide 70

Slide 70 text

Thank You!
• farrelley - Twitter, GitHub
• Follow me on Mojo Live
• Joind.in - http://joind.in/6341