Hello! I Am Derek Binkley
Senior Engineer with TurnTo Networks
Volunteer with Community Justice
@DerekB_WI [email protected]
Customer Generated Content
Fast Searching
Scalability
Finding Value within
a Sea of Data
What is it?
Elasticsearch: open-source, RESTful, distributed search and analytics engine built on Apache Lucene
Kibana: tool for querying and exploring data
Beats and Logstash: tools for ingesting data from specific sources
How is it stored?
Index: a grouping of JSON documents with similar structure
Mapping: defines what is contained in a document
Document: a JSON object that stores each data element
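To make the three terms concrete, here is a minimal sketch in Python that only builds the JSON bodies and sends nothing to a cluster. The index name `reviews` and all field names are hypothetical, and the mapping layout assumes the typeless `mappings.properties` form used by Elasticsearch 7 and later.

```python
import json

# Hypothetical index name, chosen only for illustration.
INDEX_NAME = "reviews"

# Mapping: declares the fields a document may contain and their types.
mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "rating": {"type": "integer"},
            "created": {"type": "date"},
            "location": {"type": "geo_point"},
        }
    }
}

# Document: one JSON object whose fields match the mapping.
document = {
    "title": "Great hiking boots",
    "rating": 5,
    "created": "2019-05-01",
    "location": {"lat": 43.07, "lon": -89.40},
}


def undeclared_fields(doc, index_mapping):
    """Return the document fields that the mapping does not declare."""
    declared = index_mapping["mappings"]["properties"]
    return sorted(set(doc) - set(declared))


if __name__ == "__main__":
    # Body one might send when creating the index (PUT /reviews).
    print(json.dumps(mapping, indent=2))
    # Body one might send when indexing a document (POST /reviews/_doc).
    print(json.dumps(document, indent=2))
    print("undeclared fields:", undeclared_fields(document, mapping))
```

The helper is only a local sanity check; Elasticsearch itself can add undeclared fields dynamically unless the mapping says otherwise.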
Geo searches
Complex mapping applications can be created by using four types of queries
GeoShape: uses GeoJSON to define a shape
Geo Bounding Box: define top_left and bottom_right
Geo Distance: previous example
Geo Polygon: define points to create a polygon
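As a rough illustration of three of these query types, the sketch below builds search bodies as plain Python dictionaries. The field names `location` and `area` and all coordinates are assumptions made for the example. Note that GeoJSON lists coordinates as longitude then latitude, and that recent Elasticsearch releases deprecate `geo_polygon` in favor of `geo_shape`.

```python
import json


def bounding_box_query(field, top_left, bottom_right):
    """Match points inside the rectangle defined by two corners."""
    return {"query": {"geo_bounding_box": {
        field: {"top_left": top_left, "bottom_right": bottom_right}}}}


def polygon_query(field, points):
    """Match points inside a polygon given as a list of lat/lon points."""
    return {"query": {"geo_polygon": {field: {"points": points}}}}


def shape_query(field, geojson_shape, relation="within"):
    """Match documents whose geometry relates to a GeoJSON shape."""
    return {"query": {"geo_shape": {
        field: {"shape": geojson_shape, "relation": relation}}}}


# Hypothetical coordinates. GeoJSON order is [lon, lat] and the ring is closed.
sample_shape = {
    "type": "Polygon",
    "coordinates": [[
        [-89.6, 43.0], [-89.2, 43.0], [-89.2, 43.2],
        [-89.6, 43.2], [-89.6, 43.0],
    ]],
}

box = bounding_box_query(
    "location",
    top_left={"lat": 43.2, "lon": -89.6},
    bottom_right={"lat": 43.0, "lon": -89.2},
)
polygon = polygon_query("location", [
    {"lat": 43.0, "lon": -89.6},
    {"lat": 43.0, "lon": -89.2},
    {"lat": 43.2, "lon": -89.4},
])
shape = shape_query("area", sample_shape)

if __name__ == "__main__":
    for body in (box, polygon, shape):
        print(json.dumps(body, indent=2))
```

Each dictionary is the kind of body one would post to an index's `_search` endpoint.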
Find results within a distance of a point
Distance Search
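A distance search can be sketched as a `geo_distance` filter. The original slide's example is not available, so the field name `location`, the center point, and the radius below are assumptions.

```python
import json


def distance_search(field, lat, lon, distance):
    """Build a search body that keeps documents within `distance` of a point.

    `distance` is a string with a unit, for example "10km" or "5mi".
    """
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        field: {"lat": lat, "lon": lon},
                    }
                }
            }
        }
    }


# Hypothetical center point and radius.
nearby = distance_search("location", lat=43.07, lon=-89.40, distance="10km")

if __name__ == "__main__":
    print(json.dumps(nearby, indent=2))
```

Placing the clause under `bool.filter` keeps it out of relevance scoring, which is usually what a pure "is it nearby" check wants.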
Filter by geo, aggregate by term
Distance Aggregation
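One way to read "filter by geo, aggregate by term" is a `geo_distance` filter combined with a `terms` aggregation. The sketch below assumes a hypothetical `location` geo_point field and a hypothetical `category` keyword field; the slide's actual fields are not shown.

```python
import json


def distance_aggregation(geo_field, term_field, lat, lon, distance, buckets=10):
    """Filter documents by distance, then count them per term value."""
    return {
        "size": 0,  # return only the aggregation buckets, not the hits
        "query": {
            "bool": {
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        geo_field: {"lat": lat, "lon": lon},
                    }
                }
            }
        },
        "aggs": {
            "by_term": {
                "terms": {"field": term_field, "size": buckets}
            }
        },
    }


# Hypothetical fields and values.
agg_body = distance_aggregation(
    "location", "category", lat=43.07, lon=-89.40, distance="25km"
)

if __name__ == "__main__":
    print(json.dumps(agg_body, indent=2))
```

The aggregation name `by_term` is arbitrary; the response would list one bucket per `category` value among the documents that passed the geo filter.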
Elasticsearch is optimized for
reads and searches at the
expense of costly writes
Use the bulk API to insert many records
Batches
Strategy for queuing up data for batching
Message Queues
Sync with database
Batch by range
Ranges of data
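The batching ideas above can be sketched in Python without touching a cluster: split a database id range into chunks, then turn each chunk of rows into the newline-delimited body that the `_bulk` endpoint expects. The index name, row shape, and batch size are assumptions for illustration.

```python
import json


def id_ranges(min_id, max_id, batch_size):
    """Yield inclusive (start, end) id ranges that cover min_id..max_id."""
    start = min_id
    while start <= max_id:
        end = min(start + batch_size - 1, max_id)
        yield start, end
        start = end + 1


def bulk_payload(index, rows):
    """Build a bulk request body: an action line, then the document line."""
    lines = []
    for row in rows:
        lines.append(json.dumps({"index": {"_index": index, "_id": row["id"]}}))
        lines.append(json.dumps(row))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical rows standing in for a database query per id range.
    for start, end in id_ranges(1, 10, batch_size=4):
        rows = [{"id": i, "title": f"row {i}"} for i in range(start, end + 1)]
        print(f"ids {start}-{end}:")
        print(bulk_payload("reviews", rows))
```

With a message queue, the same `bulk_payload` helper could be fed from messages drained off the queue instead of an id range.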
Cannot change an existing field mapping in place
Must set up a destination index with the new mapping
Reindex mapping
Can use an alias to help with cutover
Reindex mapping
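A reindex with an alias cutover might look like the following sketch, which builds the bodies for the `_reindex` and `_aliases` endpoints. The index names `reviews_v1` and `reviews_v2` and the alias `reviews` are hypothetical, and the destination index is assumed to already exist with the new mapping.

```python
import json


def reindex_body(source_index, dest_index):
    """Body for POST /_reindex: copy documents from source to destination."""
    return {"source": {"index": source_index}, "dest": {"index": dest_index}}


def alias_cutover(alias, old_index, new_index):
    """Body for POST /_aliases: move the alias in one atomic request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names: applications query the alias, never the versioned index.
copy_step = reindex_body("reviews_v1", "reviews_v2")
swap_step = alias_cutover("reviews", "reviews_v1", "reviews_v2")

if __name__ == "__main__":
    print(json.dumps(copy_step, indent=2))
    print(json.dumps(swap_step, indent=2))
```

Because both alias actions travel in one request, readers of the alias should not see a moment where it points at neither index.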