Slide 1

Slide 1 text

Taming Your Data with Elasticsearch

Slide 2

Slide 2 text

Hello!
I am Derek Binkley
Senior Engineer with TurnTo Networks
Volunteer with Community Justice
@DerekB_WI
[email protected]

Slide 3

Slide 3 text

Customer Generated Content

Slide 4

Slide 4 text

Fast Searching
Scalability
Finding Value within a Sea of Data

Slide 5

Slide 5 text

What is it?
Elasticsearch: open-source, RESTful, distributed search and analytics engine built on Apache Lucene
Kibana: tool for querying and exploring data

Slide 6

Slide 6 text

How is it stored?
Index: a grouping of JSON documents with similar structure.
Mapping: defines what is contained in a document.
Document: a JSON document stores each data element.

Slide 7

Slide 7 text

Storing Data

Slide 8

Slide 8 text

POST
Store a new document
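
A minimal sketch of what such a request could look like. Python is used here only to build the method, path, and JSON body; nothing is sent. The `reviews` index and the document fields are hypothetical, and the `/_doc` path assumes a recent, typeless Elasticsearch version.

```python
import json


def build_index_request(index, document):
    """Return (method, path, body) for storing a document with an auto-generated ID."""
    return "POST", f"/{index}/_doc", json.dumps(document)


# Hypothetical customer review document.
method, path, body = build_index_request(
    "reviews",
    {"product": "running shoes", "rating": 5, "text": "Great fit"},
)
print(method, path)
print(body)
```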

Slide 9

Slide 9 text

PUT
Specify an ID to update or insert
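
As a rough illustration, the ID becomes part of the request path. The index name, ID, and fields below are made up for the example, and the path format assumes a typeless Elasticsearch version; the code only builds the request and does not send it.

```python
import json


def build_put_request(index, doc_id, document):
    """Return (method, path, body) for storing a document under a caller-chosen ID."""
    return "PUT", f"/{index}/_doc/{doc_id}", json.dumps(document)


# Hypothetical index, ID, and fields.
method, path, body = build_put_request(
    "reviews", "42", {"product": "running shoes", "rating": 4}
)
print(method, path)
print(body)
```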

Slide 10

Slide 10 text

Mapping
Created automatically or manually
Updated automatically

Slide 11

Slide 11 text

Mapping

Slide 12

Slide 12 text

Put Mapping
Define an empty index
Set up the document structure
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html
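
One way the two steps could look, sketched in Python as plain request descriptions (nothing is sent). The `reviews` index and every field name are assumptions for illustration, and the body shape assumes a typeless Elasticsearch version.

```python
import json

INDEX = "reviews"  # hypothetical index name

# Step 1: create an empty index (no body needed).
create_index = ("PUT", f"/{INDEX}", None)

# Step 2: set up the document structure with the put mapping API.
mapping_body = {
    "properties": {
        "product": {"type": "keyword"},     # exact-match field, usable in aggregations
        "rating": {"type": "integer"},
        "text": {"type": "text"},           # analyzed for full-text search
        "location": {"type": "geo_point"},  # latitude/longitude pair
    }
}
put_mapping = ("PUT", f"/{INDEX}/_mapping", json.dumps(mapping_body))

for method, path, body in (create_index, put_mapping):
    print(method, path, body or "")
```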

Slide 13

Slide 13 text

Storing Data with PHP

Slide 14

Slide 14 text

Put Mapping
Guzzle converts the array to a JSON body

Slide 15

Slide 15 text

Post
Guzzle converts the array to a JSON body

Slide 16

Slide 16 text

Update Data

Slide 17

Slide 17 text

ID
Automatically assigned - POST
Manually assigned - PUT

Slide 18

Slide 18 text

PUT DOC
Replaces the document if it exists
Adds a new document if it does not exist
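
The replace-or-add behaviour can be mimicked with an in-memory dictionary. This is only a toy model of the semantics, not Elasticsearch itself; the `put_document` helper and the sample fields are invented for the example.

```python
def put_document(store, doc_id, document):
    """Mimic PUT semantics: the whole document is replaced, or added if absent."""
    result = "updated" if doc_id in store else "created"
    store[doc_id] = dict(document)  # full replacement, no merging with the old version
    return result


store = {}
first = put_document(store, "1", {"product": "shoes", "rating": 4})
second = put_document(store, "1", {"product": "shoes"})

# The second PUT replaced the document, so the old "rating" field is gone.
print(first, second, store["1"])
```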

Slide 19

Slide 19 text

Update Fields
Only updates the named fields
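
A hedged sketch of a partial update request: the changed fields are wrapped in a `doc` object and sent to the `_update` endpoint. The index, ID, and fields are hypothetical, the path assumes a typeless Elasticsearch version, and the local merge at the end only illustrates the effect on top-level fields.

```python
import json


def build_partial_update(index, doc_id, changed_fields):
    """Return (method, path, body) for updating only the given fields."""
    return "POST", f"/{index}/_update/{doc_id}", json.dumps({"doc": changed_fields})


method, path, body = build_partial_update("reviews", "42", {"rating": 5})
print(method, path, body)

# Local illustration of the effect: untouched fields survive the update.
existing = {"product": "running shoes", "rating": 4}
merged = {**existing, **json.loads(body)["doc"]}
print(merged)
```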

Slide 20

Slide 20 text

Script Update
Replaces the document if it exists
Adds a new document if it does not exist
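
A possible shape for a scripted update body, assuming the script is written in Painless and that the `upsert` section is how the "adds new if not exists" case is handled. The `helpful_votes` field, the index, and the ID are invented for the example; nothing is sent.

```python
import json

# Hypothetical counter field incremented by a Painless script.
update_body = {
    "script": {
        "lang": "painless",
        "source": "ctx._source.helpful_votes += params.count",
        "params": {"count": 1},
    },
    # Used as the new document when the ID does not exist yet.
    "upsert": {"helpful_votes": 1},
}

request = ("POST", "/reviews/_update/42", json.dumps(update_body))
print(request[0], request[1])
print(json.dumps(update_body, indent=2))
```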

Slide 21

Slide 21 text

Searching Data

Slide 22

Slide 22 text

Query Keyword
Define the query in the JSON body
match_all finds everything
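
For instance, a `match_all` search body could be built like this. The `reviews` index name is an assumption, and the request is only constructed, not sent.

```python
import json

# The query lives under the top-level "query" keyword of the JSON body.
search_body = {"query": {"match_all": {}}}

request = ("GET", "/reviews/_search", json.dumps(search_body))
print(*request)
```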

Slide 23

Slide 23 text

Find a Match
Looking for the best results

Slide 24

Slide 24 text

Find a Match
Results are scored
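
A small sketch of a `match` query and of reading scores back from a response. The `text` field and the response content are invented sample data (only the `hits.hits[]._score` layout follows the search API), so the numbers carry no meaning.

```python
# Hypothetical full-text query against a "text" field.
search_body = {"query": {"match": {"text": "comfortable running shoes"}}}

# Made-up response fragment; hits normally arrive sorted by _score, best first.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "7", "_score": 2.1, "_source": {"text": "Comfortable running shoes"}},
            {"_id": "3", "_score": 0.6, "_source": {"text": "Shoes arrived late"}},
        ]
    }
}

ranked = [(hit["_id"], hit["_score"]) for hit in sample_response["hits"]["hits"]]
for doc_id, score in ranked:
    print(doc_id, score)
```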

Slide 25

Slide 25 text

Search Within Text
Results are scored

Slide 26

Slide 26 text

Search Within Text
Results are scored

Slide 27

Slide 27 text

Fuzziness
Damerau-Levenshtein Distance
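
To make the distance concrete, here is a sketch of the optimal string alignment variant of Damerau-Levenshtein distance, which counts insertions, deletions, substitutions, and swaps of adjacent characters. It illustrates the idea rather than Lucene's actual implementation. The query at the end assumes a hypothetical `text` field.

```python
def osa_distance(a, b):
    """Edit distance where an adjacent transposition counts as one edit."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


print(osa_distance("elastic", "elsatic"))  # swapped "a" and "s"

# A match query that tolerates such typos ("text" is a hypothetical field).
fuzzy_body = {"query": {"match": {"text": {"query": "elsatic", "fuzziness": "AUTO"}}}}
```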

Slide 28

Slide 28 text

Similar Documents
more_like_this query
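
A minimal example of what a `more_like_this` body might contain, assuming a hypothetical `text` field and made-up sample text. The low frequency thresholds are chosen only so that a tiny test index would return something.

```python
import json

search_body = {
    "query": {
        "more_like_this": {
            "fields": ["text"],                         # fields to compare on
            "like": "Comfortable shoes for long runs",  # free text to find similar documents for
            "min_term_freq": 1,
            "min_doc_freq": 1,
        }
    }
}
print(json.dumps(search_body, indent=2))
```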

Slide 29

Slide 29 text

Paginating Data
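
Pagination is commonly done with the `from` and `size` keys of the search body. The helper below is a sketch under that assumption; its name and the 1-based page numbering are choices made for this example.

```python
def paginated_body(query, page, page_size=10):
    """Build a search body for a 1-based page number using from/size."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {
        "query": query,
        "from": (page - 1) * page_size,  # number of hits to skip
        "size": page_size,               # number of hits to return
    }


body = paginated_body({"match_all": {}}, page=3, page_size=10)
print(body)
```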

Slide 30

Slide 30 text

Aggregating

Slide 31

Slide 31 text

What’s In a Field
Query unique results or keywords

Slide 32

Slide 32 text

What’s In a Field
Query unique results or keywords that get sorted into “buckets”
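
A `terms` aggregation is one way to get such buckets. In this sketch the aggregation name `by_value` and the `product.keyword` field are assumptions (the `.keyword` sub-field presumes default dynamic mapping), and `"size": 0` at the top level suppresses the hits so that only buckets come back.

```python
def terms_aggregation(field, max_buckets=10):
    """Build a search body that buckets documents by the unique values of a field."""
    return {
        "size": 0,  # skip the hits, return aggregations only
        "aggs": {
            "by_value": {"terms": {"field": field, "size": max_buckets}}
        },
    }


search_body = terms_aggregation("product.keyword", max_buckets=5)
print(search_body)
```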

Slide 33

Slide 33 text

Metrics
Calculate summary values such as avg

Slide 34

Slide 34 text

Metrics
Calculate summary values such as avg
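
For example, an `avg` metric over a numeric field might be requested like this. The `rating` field and the aggregation name `avg_rating` are hypothetical.

```python
import json

search_body = {
    "size": 0,  # only the summary value is wanted, not the documents
    "aggs": {"avg_rating": {"avg": {"field": "rating"}}},
}
print(json.dumps(search_body, indent=2))
```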

Slide 35

Slide 35 text

Buckets with Metrics
Group documents into buckets

Slide 36

Slide 36 text

Buckets with Metrics
Group documents into buckets
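
Nesting a metric inside a bucket aggregation yields one summary value per bucket. The sketch below assumes hypothetical `product.keyword` and `rating` fields. The response fragment is invented sample data that only mirrors the general `aggregations.<name>.buckets` layout.

```python
search_body = {
    "size": 0,
    "aggs": {
        "by_product": {
            "terms": {"field": "product.keyword"},
            # Sub-aggregation: computed separately for each bucket.
            "aggs": {"avg_rating": {"avg": {"field": "rating"}}},
        }
    },
}

# Made-up response fragment for illustration only.
sample_response = {
    "aggregations": {
        "by_product": {
            "buckets": [
                {"key": "running shoes", "doc_count": 2, "avg_rating": {"value": 4.5}},
                {"key": "rain jacket", "doc_count": 1, "avg_rating": {"value": 3.0}},
            ]
        }
    }
}

summary = {
    bucket["key"]: bucket["avg_rating"]["value"]
    for bucket in sample_response["aggregations"]["by_product"]["buckets"]
}
print(summary)
```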

Slide 37

Slide 37 text

Geo Points

Slide 38

Slide 38 text

Distance Search
Find results within a distance of a point
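
A sketch of a `geo_distance` filter, assuming a hypothetical `location` field mapped as `geo_point`. The coordinates and radius are arbitrary example values.

```python
def within_distance(field, lat, lon, distance):
    """Build a search body that keeps documents within `distance` of a point."""
    return {
        "query": {
            "bool": {
                "must": {"match_all": {}},
                "filter": {
                    "geo_distance": {
                        "distance": distance,  # e.g. "10km"
                        field: {"lat": lat, "lon": lon},
                    }
                },
            }
        }
    }


search_body = within_distance("location", 43.07, -89.40, "10km")
print(search_body)
```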

Slide 39

Slide 39 text

Distance Aggregation
Filter by geo, aggregate by term

Slide 40

Slide 40 text

Distance Aggregation
Filter by geo, aggregate by term
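
Combining the two ideas, a request could filter by distance in the query and bucket the remaining documents by a term. The `location` and `product.keyword` fields, the coordinates, and the aggregation name are all assumptions for illustration.

```python
import json

search_body = {
    "size": 0,  # only the buckets are of interest here
    "query": {
        "bool": {
            "filter": {
                "geo_distance": {
                    "distance": "25km",
                    "location": {"lat": 43.07, "lon": -89.40},
                }
            }
        }
    },
    # Aggregations run only over documents that passed the geo filter.
    "aggs": {"by_product": {"terms": {"field": "product.keyword"}}},
}
print(json.dumps(search_body, indent=2))
```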

Slide 41

Slide 41 text

Geo searches
Complex mapping applications can be created by using four types of queries:
GeoShape: uses GeoJSON to define a shape
Geo Bounding Box: define top_left and bottom_right
Geo Distance: previous example
Geo Polygon: define points to create a polygon
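
As one illustration of these query types, a bounding box search could be sketched as follows. The `location` field and the corner coordinates are invented; the helper also checks that the corners are given in a sensible order (ignoring boxes that cross the antimeridian).

```python
def bounding_box_query(field, top_left, bottom_right):
    """Build a geo_bounding_box search body from two (lat, lon) corners."""
    (top, left), (bottom, right) = top_left, bottom_right
    if top < bottom or left > right:
        raise ValueError("top_left must be north-west of bottom_right")
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_bounding_box": {
                        field: {
                            "top_left": {"lat": top, "lon": left},
                            "bottom_right": {"lat": bottom, "lon": right},
                        }
                    }
                }
            }
        }
    }


search_body = bounding_box_query("location", (43.2, -89.6), (42.9, -89.2))
print(search_body)
```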

Slide 42

Slide 42 text

Thanks!
ANY QUESTIONS?
You can find me at
@DerekB_WI
[email protected]

Slide 43

Slide 43 text

Resources
https://www.elastic.co/blog/found-elasticsearch-from-the-bottom-up
https://en.wikipedia.org/wiki/Damerau-Levenshtein_distance
https://lucene.apache.org/
https://www.elastic.co/guide/en/elasticsearch/painless/current/painless-getting-started.html
http://geojson.org/