Slide 1

Slide 1 text

Taming Your Data with Elasticsearch
@DerekB_WI [email protected]

Slide 2

Slide 2 text

Hello! I am Derek Binkley, Senior Engineer with TurnTo Networks and volunteer with Community Justice.
@DerekB_WI [email protected]

Slide 3

Slide 3 text

Customer Generated Content

Slide 4

Slide 4 text

Fast Searching
Scalability
Finding Value within a Sea of Data

Slide 5

Slide 5 text

What is it?
Elasticsearch: open-source, RESTful, distributed search and analytics engine built on Apache Lucene
Kibana: tool for querying and exploring data

Slide 6

Slide 6 text

How is it stored?
Index: a grouping of JSON documents with similar structure
Mapping: defines what is contained in a document
Document: a JSON document stores each data element

Slide 7

Slide 7 text

Storing Data

Slide 8

Slide 8 text

POST: Store new document
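
The slide's code did not survive in this transcript, so here is a minimal sketch of the kind of request it describes, in Python rather than the talk's PHP. The index name `reviews`, the document fields, and the `_doc` path segment are assumptions; older Elasticsearch versions put a mapping type name in that position.

```python
import json

def index_request(index, document):
    """Build (method, path, body) for storing a new document with an auto-generated ID."""
    # POST without an ID asks Elasticsearch to generate one.
    return "POST", f"/{index}/_doc", json.dumps(document)

# Hypothetical index name and fields, for illustration only.
method, path, body = index_request("reviews", {"product": "kettle", "rating": 4})
print(method, path, body)
```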

Slide 9

Slide 9 text

PUT: Specify ID to update or insert
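
A comparable sketch for PUT, again in Python and under the same assumptions (hypothetical index `reviews`, `_doc` path segment as in recent Elasticsearch versions). The only difference from the POST case is that the caller supplies the ID in the path.

```python
import json

def put_request(index, doc_id, document):
    """Build (method, path, body) for inserting or replacing the document with a known ID."""
    # PUT with an explicit ID creates the document or overwrites an existing one.
    return "PUT", f"/{index}/_doc/{doc_id}", json.dumps(document)

# Hypothetical ID and fields, for illustration only.
put_method, put_path, put_body = put_request("reviews", 42, {"product": "kettle", "rating": 5})
print(put_method, put_path, put_body)
```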

Slide 10

Slide 10 text

Mapping
Created automatically or manually
Updated automatically

Slide 11

Slide 11 text

Mapping

Slide 12

Slide 12 text

Put Mapping
Define empty index
Set up document structure
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html
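
One way to express "define an empty index and set up the document structure" is to create the index with a mapping in the request body. This Python sketch assumes the typeless mapping layout used by recent Elasticsearch versions (older versions nest `properties` under a type name), and the index and field names are hypothetical.

```python
import json

def create_index_request(index, properties):
    """Build (method, path, body) for creating an index with an explicit mapping."""
    body = {"mappings": {"properties": properties}}
    return "PUT", f"/{index}", json.dumps(body)

# Hypothetical fields: text is analyzed for search, geo_point holds a lat/lon pair.
fields = {
    "title": {"type": "text"},
    "category": {"type": "keyword"},
    "rating": {"type": "integer"},
    "location": {"type": "geo_point"},
}
map_method, map_path, map_body = create_index_request("reviews", fields)
print(map_body)
```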

Slide 13

Slide 13 text

Storing Data with PHP

Slide 14

Slide 14 text

Put Mapping: Guzzle converts array to JSON body

Slide 15

Slide 15 text

Post: Guzzle converts array to JSON body

Slide 16

Slide 16 text

Update Data

Slide 17

Slide 17 text

ID
Automatically assigned: POST
Manually assigned: PUT

Slide 18

Slide 18 text

PUT DOC
Replaces the entire document if it exists
Adds a new document if it does not exist

Slide 19

Slide 19 text

Update Fields: Only updates named fields
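
A partial update sends only the fields to change, wrapped in a `doc` object. The sketch below is in Python; the `/{index}/_update/{id}` path is the form used by recent Elasticsearch versions (an assumption here, since older versions use `/{index}/{type}/{id}/_update`), and the index and field names are hypothetical.

```python
import json

def partial_update_request(index, doc_id, changed_fields):
    """Build (method, path, body) for updating only the named fields of a document."""
    # Fields not listed under "doc" keep their current values.
    return "POST", f"/{index}/_update/{doc_id}", json.dumps({"doc": changed_fields})

upd_method, upd_path, upd_body = partial_update_request("reviews", 42, {"rating": 3})
print(upd_method, upd_path, upd_body)
```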

Slide 20

Slide 20 text

Script Update: Painless scripting language
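
A scripted update sends a small Painless script instead of field values. This Python sketch builds such a body; the field name `helpful_votes` is hypothetical, and the script is only one possible example of what the slide may have shown.

```python
import json

def script_update_body(field, amount):
    """Build an update body that increments a numeric field with a Painless script."""
    return {
        "script": {
            "lang": "painless",
            # ctx._source is the stored document; params keeps the value out of the script text.
            "source": f"ctx._source.{field} += params.amount",
            "params": {"amount": amount},
        }
    }

script_body = script_update_body("helpful_votes", 1)
print(json.dumps(script_body, indent=2))
```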

Slide 21

Slide 21 text

Searching Data

Slide 22

Slide 22 text

Query Keyword
Define query in JSON body
match_all finds everything
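
As a minimal illustration in Python, the search body is a JSON object whose `query` key holds the query clause. The `_search` path and the index name `reviews` are assumptions about how the request would be sent.

```python
import json

def match_all_body():
    """Build a search body that matches every document in the index."""
    return {"query": {"match_all": {}}}

search_path = "/reviews/_search"  # hypothetical index name
all_body = match_all_body()
print(search_path, json.dumps(all_body))
```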

Slide 23

Slide 23 text

Find a Match: Looking for best results

Slide 24

Slide 24 text

Find a Match: Results are scored
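
To make the scoring concrete, the Python sketch below builds a `match` query and reads the `_score` of each hit from a response. The sample response is hand-written for illustration, not real output, and the field and index contents are hypothetical.

```python
def match_body(field, text):
    """Build a full-text match query on a single field."""
    return {"query": {"match": {field: text}}}

def scored_hits(response):
    """Return (id, score) pairs from a search response, in the order returned."""
    return [(hit["_id"], hit["_score"]) for hit in response["hits"]["hits"]]

query = match_body("title", "electric kettle")

# Hand-written sample in the shape of a search response (illustrative values).
sample_response = {
    "hits": {
        "hits": [
            {"_id": "7", "_score": 1.9, "_source": {"title": "Electric kettle"}},
            {"_id": "3", "_score": 0.6, "_source": {"title": "Stovetop kettle"}},
        ]
    }
}
print(query)
print(scored_hits(sample_response))
```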

Slide 25

Slide 25 text

Search Within Text: Results are scored

Slide 26

Slide 26 text

Search Within Text: Results are scored

Slide 27

Slide 27 text

Fuzziness: Damerau-Levenshtein Distance
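
To show what the edit distance measures, here is a small Python sketch of the optimal string alignment variant of Damerau-Levenshtein distance (insert, delete, substitute, or swap two adjacent characters), followed by a `match` query that enables fuzziness. The function is an illustration of the idea, not Lucene's implementation, and the field name is hypothetical.

```python
def osa_distance(a, b):
    """Edit distance counting insertions, deletions, substitutions and adjacent swaps."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent transposition
    return d[len(a)][len(b)]

def fuzzy_match_body(field, text):
    """Build a match query that tolerates small misspellings."""
    return {"query": {"match": {field: {"query": text, "fuzziness": "AUTO"}}}}

fuzzy = fuzzy_match_body("title", "elsatic")
print(osa_distance("elastic", "elsatic"), fuzzy)
```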

Slide 28

Slide 28 text

Similar Documents: more_like_this query
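
A minimal Python sketch of a `more_like_this` body follows. The field names and sample text are hypothetical, and the two frequency thresholds are lowered to 1 only so that a small test index would return something; they are not values taken from the talk.

```python
def more_like_this_body(fields, like_text):
    """Build a query for documents whose text resembles like_text."""
    return {
        "query": {
            "more_like_this": {
                "fields": fields,
                "like": like_text,
                "min_term_freq": 1,  # assumption: relaxed for small data sets
                "min_doc_freq": 1,   # assumption: relaxed for small data sets
            }
        }
    }

mlt = more_like_this_body(["title", "review_text"], "boils water quickly and quietly")
print(mlt)
```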

Slide 29

Slide 29 text

Paginating Data

Slide 30

Slide 30 text

Skip Results
Skip 100 and limit results to 100.
Only for first 10,000 hits
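
The skip and limit values map to the `from` and `size` keys of the search body. This Python sketch also enforces the 10,000-hit window mentioned on the slide; the helper name and the choice to raise an error are illustrative.

```python
MAX_RESULT_WINDOW = 10_000  # default limit on from + size

def page_body(skip, limit):
    """Build a match_all search body that skips `skip` hits and returns up to `limit`."""
    if skip + limit > MAX_RESULT_WINDOW:
        raise ValueError("from + size exceeds the result window; use another paging method")
    return {"from": skip, "size": limit, "query": {"match_all": {}}}

second_page = page_body(100, 100)
print(second_page)
```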

Slide 31

Slide 31 text

Aggregating

Slide 32

Slide 32 text

What’s In a Field: Query unique results or keywords

Slide 33

Slide 33 text

What’s In a Field: Query unique results or keywords that get sorted into “buckets”
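
A `terms` aggregation is one way to get the unique values of a field as buckets. In this Python sketch the field `category` and the aggregation name are hypothetical, and the sample response is hand-written to show the bucket shape rather than real output.

```python
def terms_agg_body(name, field):
    """Build a body that buckets documents by the distinct values of a field."""
    # size 0 skips the hits themselves and returns only the aggregation.
    return {"size": 0, "aggs": {name: {"terms": {"field": field}}}}

def bucket_counts(response, name):
    """Map each bucket key to its document count."""
    return {b["key"]: b["doc_count"] for b in response["aggregations"][name]["buckets"]}

terms_body = terms_agg_body("by_category", "category")

# Hand-written sample in the shape of an aggregation response (illustrative values).
sample_aggs = {"aggregations": {"by_category": {"buckets": [
    {"key": "kitchen", "doc_count": 12},
    {"key": "garden", "doc_count": 5},
]}}}
print(terms_body)
print(bucket_counts(sample_aggs, "by_category"))
```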

Slide 34

Slide 34 text

Metrics: Calculate summary values such as max, min, average

Slide 35

Slide 35 text

Metrics: Calculate summary values such as max, min, average
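
Metric aggregations can be requested side by side in one body. This Python sketch asks for the max, min, and average of a hypothetical numeric field `rating`; the aggregation names are made up for the example.

```python
def metrics_body(field):
    """Build a body that computes max, min and average of a numeric field."""
    return {
        "size": 0,  # only the summary values are wanted, not the hits
        "aggs": {
            f"max_{field}": {"max": {"field": field}},
            f"min_{field}": {"min": {"field": field}},
            f"avg_{field}": {"avg": {"field": field}},
        },
    }

rating_metrics = metrics_body("rating")
print(rating_metrics)
```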

Slide 36

Slide 36 text

Buckets with Metrics: Group documents into buckets

Slide 37

Slide 37 text

Buckets with Metrics: Group documents into buckets
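
Combining the two ideas means nesting a metric aggregation inside a bucket aggregation, so the metric is computed once per bucket. The Python sketch below assumes hypothetical fields `category` and `rating`.

```python
def bucketed_average_body(bucket_field, metric_field):
    """Build a body that groups by one field and averages another within each group."""
    return {
        "size": 0,
        "aggs": {
            "groups": {
                "terms": {"field": bucket_field},
                # The nested aggs block runs inside every bucket.
                "aggs": {"average": {"avg": {"field": metric_field}}},
            }
        },
    }

avg_by_category = bucketed_average_body("category", "rating")
print(avg_by_category)
```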

Slide 38

Slide 38 text

Geo Points

Slide 39

Slide 39 text

Distance Search: Find results within a distance of a point
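
A distance search can be written as a `geo_distance` filter on a `geo_point` field. In this Python sketch the field name `location`, the coordinates, and the radius are illustrative assumptions.

```python
def geo_distance_body(field, lat, lon, distance):
    """Build a body that keeps only documents within `distance` of a point."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_distance": {
                        "distance": distance,  # e.g. "10km" or "5mi"
                        field: {"lat": lat, "lon": lon},
                    }
                }
            }
        }
    }

nearby = geo_distance_body("location", 43.07, -89.40, "10km")
print(nearby)
```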

Slide 40

Slide 40 text

Distance Aggregation: Filter by geo, aggregate by term

Slide 41

Slide 41 text

Distance Aggregation: Filter by geo, aggregate by term

Slide 42

Slide 42 text

Distance Sort: Sort by distance

Slide 43

Slide 43 text

Distance Sort: Sort by distance
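
Sorting by distance uses the special `_geo_distance` sort key. This Python sketch assumes a `geo_point` field named `location` and an illustrative origin point; nearest results come first.

```python
def distance_sort_body(field, lat, lon, unit="km"):
    """Build a body that returns all documents ordered by distance from a point."""
    return {
        "query": {"match_all": {}},
        "sort": [
            {
                "_geo_distance": {
                    field: {"lat": lat, "lon": lon},
                    "order": "asc",  # nearest first
                    "unit": unit,    # unit used for the reported sort values
                }
            }
        ],
    }

by_distance = distance_sort_body("location", 43.07, -89.40)
print(by_distance)
```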

Slide 44

Slide 44 text

Geo searches
Complex mapping applications can be created by using four types of queries
GeoShape: Uses GeoJSON to define shape
Geo Bounding Box: Define top_left and bottom_right
Geo Distance: Previous example
Geo Polygon: Define points to create a polygon
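
Of the four query types, the bounding box is the simplest to sketch: it needs only the two corners named on the slide. The Python example below assumes a `geo_point` field called `location` and uses illustrative coordinates.

```python
def bounding_box_body(field, top_left, bottom_right):
    """Build a body that keeps documents inside a lat/lon rectangle."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_bounding_box": {
                        field: {
                            "top_left": {"lat": top_left[0], "lon": top_left[1]},
                            "bottom_right": {"lat": bottom_right[0], "lon": bottom_right[1]},
                        }
                    }
                }
            }
        }
    }

# Corners given as (lat, lon); the top-left corner lies north and west of the other.
box = bounding_box_body("location", (43.2, -89.6), (42.9, -89.2))
print(box)
```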

Slide 45

Slide 45 text

Thanks! Any questions?
You can find me at @DerekB_WI, [email protected], derekb-wi.com

Slide 46

Slide 46 text

THANKS!
https://joind.in/talk/171c3

Slide 47

Slide 47 text

Thanks to Our Sponsors 2018

Slide 48

Slide 48 text

Resources
https://www.elastic.co/blog/found-elasticsearch-from-the-bottom-up
https://en.wikipedia.org/wiki/Damerau-Levenshtein_distance
https://lucene.apache.org/
https://www.elastic.co/guide/en/elasticsearch/painless/current/painless-getting-started.html
http://geojson.org/