Elasticsearch for PHP - Lone Star PHP 2012

Elasticsearch for PHP - Lone Star PHP in Dallas, Texas
June 29 & 30, 2012

Shaun Farrell

June 30, 2012

Transcript

  1. Elasticsearch for PHP Developers
    Shaun Farrell
    June 29, 2012
    Saturday, June 30, 12

  2. What Is Elasticsearch?
    • Storage Engine
    • Schema Free
    • Document Oriented
    • Built on top of Lucene
    • Open source
    • RESTful (JSON over HTTP)
    • Multi-tenancy

  3. Simple, Easy, and Fast!

  4. What We Will Cover
    • Indexes and Types
    • Mappings
    • Search
    • Elastica - PHP Library
    • Examples
    • Resources

  5. Getting Started

  6. Quick & Easy Installation
    • Download Elasticsearch (http://elasticsearch.org/)
    • Extract
    • Run (as a service, in the background, or in the foreground)

  7. Indexes, Types & Data

  8. Indexes & Types
    • Index: a group of items (types)
    • Type: relevant data within a group
    • Amazon: books, movies, clothes, etc.
    • Airlines: American, Delta, KLM, etc.
    • Each type and index can have different data elements

  9. URL Structure
    http://localhost:9200/dfw/beer/

  10. URL Structure
    http://localhost:9200/dfw/beer/
    • Elasticsearch location: http://localhost:9200

  11. URL Structure
    http://localhost:9200/dfw/beer/
    • Elasticsearch location: http://localhost:9200
    • Index: dfw

  12. URL Structure
    http://localhost:9200/dfw/beer/
    • Elasticsearch location: http://localhost:9200
    • Index: dfw
    • Type: beer
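
    The URL layout above can be sketched as a small shell helper. This is an illustration only: `ES_HOST` and `es_url` are hypothetical names, not something shown in the talk, and the default host is the localhost address from the slides.

```shell
# Hypothetical helper (not from the talk): joins the URL parts named on the slide.
ES_HOST="${ES_HOST:-http://localhost:9200}"   # Elasticsearch location

es_url() {
    # $1 = index, $2 = type (optional), $3 = document ID or _api name (optional)
    url="$ES_HOST/$1/"
    if [ -n "$2" ]; then url="$url$2/"; fi
    if [ -n "$3" ]; then url="$url$3"; fi
    printf '%s\n' "$url"
}

es_url dfw beer      # prints http://localhost:9200/dfw/beer/ when ES_HOST is unset
es_url dfw beer 1    # prints http://localhost:9200/dfw/beer/1 when ES_HOST is unset
```

    The later curl examples in the deck all address URLs of this shape.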

  13. Create & Delete Indexes
    curl -XPOST 'http://localhost:9200/dfw/'
    curl -XDELETE 'http://localhost:9200/dfw/'

  14. Create Type & Add Data

  15. Create Type & Add Data
    curl -XPOST 'http://localhost:9200/dfw/beer/1' -d '
    {
    "name": "Deep Ellum IPA"
    }
    '

  16. Create Type & Add Data
    curl -XPOST 'http://localhost:9200/dfw/beer/1' -d '
    {
    "name": "Deep Ellum IPA"
    }
    '
    curl -XPOST 'http://localhost:9200/dfw/beer/2' -d '
    {
    "name": "Double Brown Stout"
    }
    '

  17. curl -XPOST 'http://localhost:9200/dfw/brewery/1' -d '
    {
    "name": "Deep Ellum Brewing Company",
    "beers": [
    "Deep Ellum IPA",
    "Double Brown Stout"
    ]
    }
    '

  18. PUT vs POST
    • You define the ID - POST or PUT
    • Elasticsearch defines the ID - POST
    • Using PUT with no ID will throw an error

  19. PUT vs POST
    POST
    {"ok":true,"_index":"dfw","_type":"brewery","_id":"Iw9kfa3vSx2FyFen-uK26Q","_version":1}
    POST or PUT
    {"ok":true,"_index":"dfw","_type":"brewery","_id":"1","_version":1}
    PUT
    No handler found for uri [/dfw/brewery/] and method [PUT]
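
    A minimal sketch of the rule above, assuming the 2012-era API shown in the slides: send PUT when you supply the ID and POST when Elasticsearch should generate one. `index_method` and `index_doc` are hypothetical helper names. Newer Elasticsearch releases also expect a `Content-Type: application/json` header, which this sketch leaves out.

```shell
ES_HOST="${ES_HOST:-http://localhost:9200}"

index_method() {
    # $1 = document ID (may be empty). PUT needs an ID; POST works without one.
    if [ -n "$1" ]; then echo PUT; else echo POST; fi
}

index_doc() {
    # $1 = index, $2 = type, $3 = JSON body, $4 = document ID (optional)
    curl -s -X"$(index_method "$4")" "$ES_HOST/$1/$2/$4" -d "$3"
}

# Usage (needs a running node):
#   index_doc dfw brewery '{"name": "Deep Ellum Brewing Company"}'      # generated ID
#   index_doc dfw brewery '{"name": "Deep Ellum Brewing Company"}' 1    # ID 1
```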

  20. Update Data
    curl -XPUT 'http://localhost:9200/dfw/beer/1' -d '
    {
    "name": "Deep Ellum IPA",
    "style": "American-Style India Pale Ale"
    }
    '
    {"ok":true,"_index":"dfw","_type":"beer","_id":"1","_version":2}

  21. Get Data
    curl -XGET 'http://localhost:9200/dfw/brewery/1'

  22. {
        "_index": "dfw",
        "_type": "brewery",
        "_id": "1",
        "_version": 1,
        "exists": true,
        "_source": {
          "name": "Deep Ellum Brewing Company",
          "beers": [
            "Deep Ellum IPA",
            "Double Brown Stout"
          ]
        }
      }

  23. Delete Data
    curl -XDELETE 'http://localhost:9200/dfw/beer/1'

  24. Elasticsearch APIs

  25. Elasticsearch APIs
    • Allow you to perform operations
    • Search, Add Mappings, Status, Refresh & Optimization
    • It’s just another endpoint
    • Check out elasticsearch.org for lots more

  26. URL Structure
    http://localhost:9200/dfw/beer/
    • Elasticsearch location: http://localhost:9200
    • Index: dfw
    • Type: beer

  27. URL Structure
    http://localhost:9200/dfw/beer/_{API}
    • Elasticsearch location: http://localhost:9200
    • Index: dfw
    • Type: beer
    • API method: _{API}

  28. Status
    • Displays comprehensive status information on one or more indices.
    • Can be done at all levels (Elasticsearch, index, type)
    • Endpoint: _status

  29. Refresh & Optimize
    • Refresh: refreshes data for near real-time search.
      • All levels
      • Endpoint: _refresh
    • Optimize: optimizes Lucene segments for faster searching.
      • All levels
      • Endpoint: _optimize
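
    As a rough illustration of "it's just another endpoint", the sketch below builds the URL for an API call at the whole-cluster level or at the index level. `api_url` is a hypothetical helper, and the endpoint names are the ones on the slides. Later Elasticsearch releases renamed or removed some of them, so treat this as specific to the version the talk covers.

```shell
ES_HOST="${ES_HOST:-http://localhost:9200}"

api_url() {
    # $1 = API name without the underscore (status, refresh, optimize)
    # $2 = index (optional); omit it to address every index
    if [ -n "$2" ]; then
        printf '%s/%s/_%s\n' "$ES_HOST" "$2" "$1"
    else
        printf '%s/_%s\n' "$ES_HOST" "$1"
    fi
}

# Usage (needs a running node of that era):
#   curl -XGET  "$(api_url status dfw)"
#   curl -XPOST "$(api_url refresh dfw)"
#   curl -XPOST "$(api_url optimize)"
```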

  30. Mappings

  31. Mapping
    • Defines how the document is mapped to the search engine
    • You don’t have to define this. It’s dynamic. But you can.
    • You can define them at the index and type level
    • Endpoint: _mapping

  32. What You Can Define
    • Which fields are searchable
    • Field data types
    • How they are stored
    • How they are tokenized (index, analyzed)
    • etc.

  33. Mapping Types
    • Core data types
      • string, integer/long, float/double, boolean, and null
    • Arrays
    • IP addresses
    • Geo point
    • Attachment

  34. curl -XPOST 'http://localhost:9200/dfw/brewery/_mapping' -d '
    {
      "brewery" : {
        "properties" : {
          "name" : {
            "type" : "string",
            "store" : "yes",
            "index" : "not_analyzed"
          },
          "established" : {
            "type" : "date",
            "format" : "YYYY"
          }
        }
      }
    }
    '

  35. Dynamic Mapping
    {
      "brewery" : {
        "properties" : {
          "name" : {
            "type" : "string"
          },
          "established" : {
            "type" : "string"
          }
        }
      }
    }
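
    To see which mapping Elasticsearch generated dynamically, you can read it back from the same _mapping endpoint. The sketch below assumes the dfw/brewery example from the slides. `mapping_url` and `get_mapping` are hypothetical helper names.

```shell
ES_HOST="${ES_HOST:-http://localhost:9200}"

mapping_url() {
    # $1 = index, $2 = type
    printf '%s/%s/%s/_mapping\n' "$ES_HOST" "$1" "$2"
}

get_mapping() {
    # Fetches the current mapping; pretty=true asks for indented JSON.
    curl -s -XGET "$(mapping_url "$1" "$2")?pretty=true"
}

# Usage (needs a running node): get_mapping dfw brewery
```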

  36. Delete Mapping
    curl -XDELETE 'http://localhost:9200/dfw/brewery/_mapping'
    Important: the Elasticsearch documentation of this period describes this call as deleting the mapping (type) along with its data, so do not expect the documents to survive.

  37. Search

  38. Two Types of Search
    • URI Request
      • Limited searching
    • Request Body
      • Full functionality
      • JSON requests

  39. URI Request
    • Performed through a web request or curl request.
    • Simple & limited
    • No filters, facets, etc.
    http://localhost:9200/dfw/brewery/_search?q=name:Deep Ellum Brewing Company
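
    A URI search passes the query text in the `q` parameter, and spaces in it need URL encoding when sent from the command line. One way to sketch that with curl is `-G` together with `--data-urlencode`. `search_url` and `uri_search` are hypothetical helper names.

```shell
ES_HOST="${ES_HOST:-http://localhost:9200}"

search_url() {
    # $1 = index, $2 = type
    printf '%s/%s/%s/_search\n' "$ES_HOST" "$1" "$2"
}

uri_search() {
    # $1 = index, $2 = type, $3 = query text, e.g. 'name:Deep'
    # -G moves the --data-urlencode payload into the URL query string,
    # so curl takes care of escaping spaces and other special characters.
    curl -s -G "$(search_url "$1" "$2")" --data-urlencode "q=$3"
}

# Usage (needs a running node): uri_search dfw brewery 'name:Deep'
```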

  40. Request Body
    • Uses the Query DSL
    • Allows for Filters, Facets, Boosting, More Like This, Fuzzy, etc.
    • Is a JSON request
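
    A request-body search sends the query as JSON to the same _search endpoint. The sketch below builds a small `query_string` query against the `name` field used in the earlier slides. `name_query` and `body_search` are hypothetical helpers, and the search text is inserted without any JSON escaping, so it must not contain double quotes or backslashes.

```shell
ES_HOST="${ES_HOST:-http://localhost:9200}"

name_query() {
    # Prints a Query DSL body; $1 = search text (no quotes or backslashes).
    cat <<EOF
{
  "query": {
    "query_string": {
      "default_field": "name",
      "query": "$1"
    }
  }
}
EOF
}

body_search() {
    # $1 = index, $2 = type, $3 = search text
    curl -s -XPOST "$ES_HOST/$1/$2/_search" -d "$(name_query "$3")"
}

# Usage (needs a running node): body_search dfw brewery 'Deep Ellum'
```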

  41. Searching Across Indexes & Types
    http://localhost:9200/lse,rdu,dfw/_search...
    http://localhost:9200/rdu/brewery,beer/_search...

  42. Faceted Searching
    • Facets are “logical groupings” that allow easier search navigation.
    • Drill-down searching
    • Think Amazon or NewEgg.com
    • Types of facets in Elasticsearch
      • Terms, Range, Histogram, Date Histogram, Statistical, & Geo
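
    As a hedged sketch of a terms facet in the request-body syntax of that era, the body below asks for the ten most common values of one field alongside a match_all query. `terms_facet_body` and `facet_search` are hypothetical helpers, and the `style` field in the usage line comes from the Update Data slide. Later Elasticsearch releases replaced facets with aggregations, so this applies to the version the talk covers.

```shell
ES_HOST="${ES_HOST:-http://localhost:9200}"

terms_facet_body() {
    # $1 = field to group on; the facet is named by_<field>.
    cat <<EOF
{
  "query": { "match_all": {} },
  "facets": {
    "by_$1": {
      "terms": { "field": "$1", "size": 10 }
    }
  }
}
EOF
}

facet_search() {
    # $1 = index, $2 = type, $3 = facet field
    curl -s -XPOST "$ES_HOST/$1/$2/_search" -d "$(terms_facet_body "$3")"
}

# Usage (needs a running node of that era): facet_search dfw beer style
```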

  43. Faceted Searching

  44. Faceted Searching
    This is Faceted Searching

  45. Geo Capabilities

  46. Geo Bounding Box

  50. Geo Distance

  53. Geo Distance Range
    Can Also be a Facet

  58. Geo Polygon

  61. Other Features
    • Highlighting
    • TTL
    • Routing - Tell Elasticsearch where to look (node/shard, etc.)
    • Scripting
    • Scrolling - Pagination of results
    • Plugins - Rivers & Attachments

  62. Elastica

  63. Elastica
    • PHP library
    • Open source project
    • GitHub - https://github.com/ruflin/Elastica
    • Follows Zend Framework (ZF) standards
    • Alternative to cURL
    • Not a lot of documentation but there are TESTS!

  64. Elastica
    • Everything is an object
    • Inject objects to create queries.
    • Under the hood - array based, converted to JSON.
    • Debug
      • echo json_encode($query->toArray());

  65. Examples
    • Elasticsearch Query DSL vs. Elastica
    • Preloaded index (dfw) with two types (brewery, beer).
    • Uses BreweryDB data through the API
    • Examples available on GitHub

  66. Examples

  67. Resources

  68. Resources
    • Elastica - GitHub (http://ruflin.github.com/Elastica/)
    • Elasticsearch - http://www.elasticsearch.org/
    • Elasticsearch GitHub - https://github.com/elasticsearch/elasticsearch
    • Google Groups (ES and Elastica)
    • Slideshare
    • Slides: http://farrelley.github.com/ElasticSearch-For-PHP/
    • Examples - https://github.com/farrelley/ElasticSearch-For-PHP

  69. Questions?

  70. Thank You!
    • farrelley - Twitter, GitHub
    • Follow me on Mojo Live
    • Joind.in - http://joind.in/6341