
Taming Your Data With Elasticsearch - PHP[world]

derek-b
November 15, 2018

Are you searching unstructured data or text fields? Do you need to aggregate and summarize your geo, financial, or other numeric data? Do you want to query your structured data in new and exciting ways? If so, Elasticsearch may be right for you. Let's explore the many ways you can ask questions of your data and have it make sense to you and your users. We'll sort through millions of rows in milliseconds and give you the tools to take your data analysis to the next level. You will learn how to use basic RESTful API calls to store, search, and aggregate your data.


Transcript

  1. @DerekB_WI [email protected]
    Taming Your Data
    with Elasticsearch


  2. Hello! I Am Derek Binkley
    Senior Engineer with TurnTo Networks
    Volunteer with Community Justice
    @DerekB_WI [email protected]


  3. Customer Generated Content


  4. Fast Searching
    Scalability
    Finding Value within a Sea of Data

  5. What is it?
    Elasticsearch: open-source, RESTful, distributed search and analytics engine built on Apache Lucene
    Kibana: tool for querying and exploring data

  6. How is it stored?
    Index: a grouping of JSON documents with similar structure
    Mapping: defines what is contained in a document
    Document: a JSON document stores each data element

  7. Storing Data

  8. POST
    Store new document

  9. PUT
    Specify ID to update or insert
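
The POST and PUT slides above can be sketched as a small helper that picks the HTTP method and path for storing one document. This is a minimal Python sketch, not the code shown in the talk: the `reviews` index, the document fields, and the `index_request` helper are hypothetical, and the `_doc` path assumes the typeless endpoints of Elasticsearch 7 and later (6.x releases put a type name in that position).

```python
import json


def index_request(index, document, doc_id=None):
    """Return (method, path, body) for storing one document."""
    body = json.dumps(document)
    if doc_id is None:
        # No ID supplied: POST lets Elasticsearch generate one.
        return "POST", f"/{index}/_doc", body
    # ID supplied: PUT creates the document or replaces the existing one.
    return "PUT", f"/{index}/_doc/{doc_id}", body


# Hypothetical document for illustration only.
review = {"product": "kettle", "rating": 4, "text": "Boils quickly"}
auto_id = index_request("reviews", review)
fixed_id = index_request("reviews", review, doc_id=42)
print(auto_id[:2], fixed_id[:2])
```

The tuple can be handed to any HTTP client; nothing here contacts a cluster.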

  10. Mapping
    Created automatically or manually
    Updated automatically

  11. Mapping

  12. Put Mapping
    Define empty index
    Set up document structure
    https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html
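
As an illustration of defining an empty index with an explicit mapping, the sketch below builds a request body that could be sent with `PUT /reviews`. The field names and the `create_index_body` helper are hypothetical, and the typeless `mappings.properties` layout assumes Elasticsearch 7 or later; 6.x nests the properties under a type name.

```python
import json


def create_index_body(properties):
    """Wrap field definitions in the body expected when creating an index."""
    return {"mappings": {"properties": properties}}


# Hypothetical fields; each entry names the field and its Elasticsearch type.
reviews_mapping = create_index_body({
    "product": {"type": "keyword"},     # exact values, usable in aggregations
    "text": {"type": "text"},           # analyzed for full-text search
    "rating": {"type": "integer"},
    "created": {"type": "date"},
    "location": {"type": "geo_point"},  # needed for the geo queries later on
})
print(json.dumps(reviews_mapping, indent=2))
```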

  13. Storing Data with PHP

  14. Put Mapping
    Guzzle converts array to JSON body

  15. Post
    Guzzle converts array to JSON body

  16. Update Data

  17. ID
    Automatically assigned - POST
    Manually assigned - PUT

  18. PUT DOC
    Replaces entire document if exists
    Adds new if not exists

  19. Update Fields
    Only updates named fields
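
A partial update wraps only the changed fields in a `doc` object, so the rest of the stored document is left alone. The sketch below is an assumption-laden illustration: the `partial_update` helper, the index name, and the field are hypothetical, and the `/_update/{id}` path follows Elasticsearch 7 and later.

```python
def partial_update(index, doc_id, changed_fields):
    """Return (method, path, body) for updating only the named fields."""
    # Fields missing from `changed_fields` keep their stored values.
    return "POST", f"/{index}/_update/{doc_id}", {"doc": changed_fields}


method, path, body = partial_update("reviews", 42, {"rating": 5})
print(method, path, body)
```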

  20. Script Update
    Painless scripting language
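
A scripted update sends a short Painless script to the same update endpoint instead of a `doc` object. The sketch below only builds the body; the `helpful_votes` field and the `increment_votes` helper are hypothetical names chosen for illustration.

```python
def increment_votes(count):
    """Build an update body whose Painless script adds `count` to a counter."""
    return {
        "script": {
            "lang": "painless",
            # ctx._source is the stored document; params carries the inputs.
            "source": "ctx._source.helpful_votes += params.count",
            "params": {"count": count},
        }
    }


script_body = increment_votes(1)
print(script_body)
```

Passing the value through `params` rather than pasting it into the script text keeps the script source identical across calls.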

  21. Searching Data

  22. Query Keyword
    Define query in JSON body
    match_all finds everything

  23. Find a Match
    Looking for best results

  24. Find a Match
    Results are scored
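
To make "results are scored" concrete, the sketch below builds a `match` query body and reads `_score` out of a response. The helper names are hypothetical, and `sample_response` is a hand-written, abridged stand-in with invented scores, not output captured from a real cluster.

```python
def match_query(field, text):
    """Build a search body that matches `text` against one analyzed field."""
    return {"query": {"match": {field: text}}}


def scored_hits(response):
    """Return (score, source) pairs in the order the response lists them."""
    return [(hit["_score"], hit["_source"]) for hit in response["hits"]["hits"]]


search_body = match_query("text", "boils quickly")

# Invented example shaped like a search response, for illustration only.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "1", "_score": 1.7, "_source": {"text": "Boils quickly"}},
            {"_id": "2", "_score": 0.4, "_source": {"text": "Quickly broke"}},
        ]
    }
}
for score, source in scored_hits(sample_response):
    print(score, source["text"])
```

By default Elasticsearch returns hits ordered by descending score, so the first pair is the best match.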

  25. Search Within Text
    Results are scored

  26. Search Within Text
    Results are scored

  27. Fuzziness
    Damerau-Levenshtein Distance
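
The distance named on the fuzziness slide counts the insertions, deletions, substitutions, and adjacent transpositions needed to turn one string into another. The sketch below implements the restricted "optimal string alignment" variant of Damerau-Levenshtein as a worked illustration; it is not Lucene's implementation. It also shows a `match` body with `fuzziness`, using a hypothetical field name.

```python
def osa_distance(a, b):
    """Edit distance with adjacent transpositions (optimal string alignment)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every character of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert every character of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


def fuzzy_match(field, text, fuzziness="AUTO"):
    """Build a match query that tolerates small edit distances."""
    return {"query": {"match": {field: {"query": text, "fuzziness": fuzziness}}}}


print(osa_distance("elastic", "elsatic"))
fuzzy_body = fuzzy_match("text", "elsatic")
```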

  28. Similar Documents
    more_like_this query
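
A `more_like_this` query takes the fields to compare and some text (or documents) to resemble. The body below is a sketch with a hypothetical field name, and the tuning values are arbitrary choices for illustration rather than recommendations from the talk.

```python
def more_like_this(fields, like_text, min_term_freq=1, max_query_terms=12):
    """Build a query for documents whose `fields` resemble `like_text`."""
    return {
        "query": {
            "more_like_this": {
                "fields": fields,
                "like": like_text,
                # Terms rarer than this in the input text are ignored.
                "min_term_freq": min_term_freq,
                # Upper bound on how many terms are selected for the query.
                "max_query_terms": max_query_terms,
            }
        }
    }


mlt_body = more_like_this(["text"], "Boils quickly and pours cleanly")
print(mlt_body)
```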

  29. Paginating Data

  30. Skip Results
    Skip 100 and limit results to 100.
    Only for first 10,000 hits
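
Skipping and limiting map onto the `from` and `size` keys of the search body, and the 10,000-hit ceiling corresponds to the default value of the `index.max_result_window` setting. The `page_body` helper below is a hypothetical sketch that enforces that default on the client side.

```python
MAX_RESULT_WINDOW = 10_000  # default of index.max_result_window


def page_body(query, skip, limit):
    """Build a search body that skips `skip` hits and returns up to `limit`."""
    if skip + limit > MAX_RESULT_WINDOW:
        raise ValueError(
            f"from + size = {skip + limit} exceeds {MAX_RESULT_WINDOW}"
        )
    return {"from": skip, "size": limit, "query": query}


second_page = page_body({"match_all": {}}, skip=100, limit=100)
print(second_page)
```

For deeper paging, Elasticsearch offers other mechanisms such as `search_after`, which the slide does not cover.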

  31. Aggregating

  32. What’s In a Field
    Query unique results or keywords

  33. What’s In a Field
    Query unique results or keywords that get sorted into “buckets”
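
Listing the unique values of a field, each with a document count, is what a `terms` aggregation does. The sketch below assumes a hypothetical `product` field mapped as `keyword`; the aggregation name and helper are also made up for illustration.

```python
def terms_aggregation(name, field, buckets=10):
    """Build a body that returns one bucket per distinct value of `field`."""
    return {
        "size": 0,  # skip the hits themselves; only the buckets are wanted
        "aggs": {name: {"terms": {"field": field, "size": buckets}}},
    }


products_body = terms_aggregation("by_product", "product")
print(products_body)
```

In the response, each bucket under `aggregations.by_product.buckets` carries a `key` and a `doc_count`.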

  34. Metrics
    Calculate summary values such as max, min, average

  35. Metrics
    Calculate summary values such as max, min, average
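
Summary values come from metric aggregations such as `max`, `min`, and `avg`, and several can share one request. The sketch below uses a hypothetical numeric `rating` field and made-up aggregation names.

```python
def summary_metrics(field):
    """Build a body that computes max, min, and average of one numeric field."""
    return {
        "size": 0,  # no hits needed, only the computed values
        "aggs": {
            f"max_{field}": {"max": {"field": field}},
            f"min_{field}": {"min": {"field": field}},
            f"avg_{field}": {"avg": {"field": field}},
        },
    }


metrics_body = summary_metrics("rating")
print(metrics_body)
```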

  36. Buckets with Metrics
    Group documents into buckets

  37. Buckets with Metrics
    Group documents into buckets
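
Combining the two ideas means nesting a metric aggregation inside a bucket aggregation, so the metric is computed once per bucket. The sketch below averages a hypothetical `rating` field within each `product` bucket; all names are illustrative assumptions.

```python
def average_per_bucket(bucket_field, metric_field):
    """Group by `bucket_field`, then average `metric_field` in each group."""
    return {
        "size": 0,
        "aggs": {
            "groups": {
                "terms": {"field": bucket_field},
                # A nested "aggs" block runs inside every bucket of "groups".
                "aggs": {"average": {"avg": {"field": metric_field}}},
            }
        },
    }


nested_body = average_per_bucket("product", "rating")
print(nested_body)
```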

  38. Geo Points

  39. Distance Search
    Find results within a distance of a point
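
A distance search is usually expressed as a `geo_distance` filter on a `geo_point` field. In the sketch below the `location` field, the helper name, and the coordinates are assumptions picked for illustration.

```python
def within_distance(field, lat, lon, distance):
    """Build a body that keeps documents within `distance` of a point."""
    return {
        "query": {
            "bool": {
                # A filter clause includes or excludes; it does not affect score.
                "filter": {
                    "geo_distance": {
                        "distance": distance,  # e.g. "10km" or "5mi"
                        field: {"lat": lat, "lon": lon},
                    }
                }
            }
        }
    }


nearby_body = within_distance("location", 43.07, -89.40, "10km")
print(nearby_body)
```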

  40. Distance Aggregation
    Filter by geo, aggregate by term

  41. Distance Aggregation
    Filter by geo, aggregate by term

  42. Distance Sort
    Sort by distance

  43. Distance Sort
    Sort by distance
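
Sorting by distance uses the special `_geo_distance` sort key, which orders hits by how far a `geo_point` field lies from a reference point. The field name, coordinates, and helper in this sketch are hypothetical.

```python
def nearest_first(field, lat, lon, unit="km"):
    """Build a body that returns all documents, closest to the point first."""
    return {
        "query": {"match_all": {}},
        "sort": [
            {
                "_geo_distance": {
                    field: {"lat": lat, "lon": lon},
                    "order": "asc",  # ascending distance: nearest first
                    "unit": unit,    # unit used for the returned sort values
                }
            }
        ],
    }


sorted_body = nearest_first("location", 43.07, -89.40)
print(sorted_body)
```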

  44. Geo searches
    Complex mapping applications can be created by using four types of queries
    GeoShape: Uses GeoJSON to define shape
    Geo Bounding Box: Define top_left and bottom_right
    Geo Distance: Previous example
    Geo Polygon: Define points to create a polygon
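
Of the four query types listed, the bounding box is the simplest to sketch: it needs only a `top_left` and a `bottom_right` corner. The helper, field name, and coordinates below are hypothetical, and the corner check is a client-side convenience added for this example rather than something the slide describes.

```python
def in_bounding_box(field, top_left, bottom_right):
    """Build a body that keeps documents inside a lat/lon rectangle."""
    if top_left["lat"] < bottom_right["lat"]:
        raise ValueError("top_left must lie north of bottom_right")
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_bounding_box": {
                        field: {"top_left": top_left, "bottom_right": bottom_right}
                    }
                }
            }
        }
    }


box_body = in_bounding_box(
    "location",
    top_left={"lat": 43.2, "lon": -89.6},
    bottom_right={"lat": 42.9, "lon": -89.2},
)
print(box_body)
```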

  45. ANY QUESTIONS?
    You can find me at
    @DerekB_WI
    [email protected]
    derekb-wi.com
    Thanks!

  46. THANKS!
    https://joind.in/talk/171c3

  47. Thanks to Our
    Sponsors
    2018


  48. Resources
    https://www.elastic.co/blog/found-elasticsearch-from-the-bottom-up
    https://en.wikipedia.org/wiki/Damerau-Levenshtein_distance
    https://lucene.apache.org/
    https://www.elastic.co/guide/en/elasticsearch/painless/current/painless-getting-started.html
    http://geojson.org/