Manage Your Content with Elasticsearch

Slide 1

Slide 1 text

Managing Your Content With Elasticsearch Samantha Quiñones / @ieatkillerbees

Slide 2

Slide 2 text

About Me • Software Engineer & Data Nerd since 1997 • Doing “media stuff” since 2012 • Principal @ AOL since 2014 • @ieatkillerbees • http://samanthaquinones.com

Slide 3

Slide 3 text

What We’ll Cover • Intro to Elasticsearch • CRUD • Creating Mappings • Analyzers • Basic Querying & Searching • Scoring & Relevance • Aggregations Basics

Slide 4

Slide 4 text

But First… • Download - https://www.elastic.co/downloads/elasticsearch • Clone - https://github.com/squinones/elasticsearch-tutorial.git

Slide 5

Slide 5 text

What is Elasticsearch? • Near real-time (documents are available for search quickly after being indexed) search engine powered by Lucene • Clustered for H/A and performance via federation with shards and replicas

Slide 6

Slide 6 text

What’s it Used For? • Logging (we use Elasticsearch to centralize traffic logs, exception logs, and audit logs) • Content management and search • Statistical analysis

Slide 7

Slide 7 text

Installing Elasticsearch
$ curl -L -O https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/2.1.1/elasticsearch-2.1.1.tar.gz
$ tar -zxvf elasticsearch*
$ cd elasticsearch-2.1.1/bin
$ ./elasticsearch

Slide 8

Slide 8 text

Connecting to Elasticsearch • Via Java, there are two native clients which connect to an ES cluster on port 9300 • Most commonly, we access Elasticsearch via HTTP API

Slide 9

Slide 9 text

HTTP API curl -X GET "http://localhost:9200/?pretty"

Slide 10

Slide 10 text

Data Format • Elasticsearch is a document-oriented database • All operations are performed against documents (object graphs expressed as JSON)

Slide 11

Slide 11 text

Analogues
Elasticsearch   MySQL      MongoDB
Index           Database   Database
Type            Table      Collection
Document        Row        Document
Field           Column     Field

Slide 12

Slide 12 text

Index Madness • Index is an overloaded term. • As a verb, to index a document is to store a document in an index. This is analogous to an SQL INSERT operation. • As a noun, an index is a collection of documents. • Fields within a document have inverted indexes, similar to how a column in an SQL table may have an index.
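To make the inverted-index idea concrete, here is a minimal Python sketch. It is a toy, not how Lucene actually stores its index: it maps each lowercased term to the set of document IDs that contain it. The sample documents and the `build_inverted_index` helper are made up for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of document IDs whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


# Hypothetical documents keyed by ID.
docs = {
    1: "PHP for loop",
    2: "Python for beginners",
    3: "PHP arrays",
}

index = build_inverted_index(docs)
print(sorted(index["php"]))  # document IDs containing "php"
print(sorted(index["for"]))  # document IDs containing "for"
```

Looking up a term is then a single dictionary access, which is why searching by term is cheap compared with scanning every document.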

Slide 13

Slide 13 text

Indexing Our First Document curl -X PUT "http://localhost:9200/test_document/test/1" -d '{ "name": "test_name" }'

Slide 14

Slide 14 text

Retrieving Our First Document curl -X GET "http://localhost:9200/test_document/test/1"

Slide 15

Slide 15 text

Let’s Look at Some Stack Overflow Posts! $ vi queries/bulk_insert_so_data.json

Slide 16

Slide 16 text

Bulk Insert curl -X PUT "http://localhost:9200/_bulk" --data-binary "@queries/bulk_insert_so_data.json"

Slide 17

Slide 17 text

First Search curl -X GET "http://localhost:9200/stack_overflow/_search"

Slide 18

Slide 18 text

Query String Searches curl -X GET "http://localhost:9200/stack_overflow/_search?q=title:php"

Slide 19

Slide 19 text

Query DSL curl -X POST "http://localhost:9200/stack_overflow/_search" -d '{ "query" : { "match" : { "title" : "php" } } }'

Slide 20

Slide 20 text

Compound Queries curl -X POST "http://localhost:9200/stack_overflow/_search" -d '{ "query" : { "filtered": { "query" : { "match" : { "title" : "(php OR python) AND (flask OR laravel)" } }, "filter": { "range": { "score": { "gt": 3 } } } } } }'

Slide 21

Slide 21 text

Full-Text Searching curl -X POST "http://localhost:9200/stack_overflow/_search" -d '{ "query" : { "match" : { "title" : "php loop" } } }'

Slide 22

Slide 22 text

Relevancy • When searching (in query context), results are scored by a relevancy algorithm • Results are presented in order from highest to lowest score

Slide 23

Slide 23 text

Phrase Searching curl -X POST "http://localhost:9200/stack_overflow/_search" -d '{ "query" : { "match" : { "title": { "query": "for loop", "type": "phrase" } } } }'

Slide 24

Slide 24 text

Highlighting Searches curl -X POST "http://localhost:9200/stack_overflow/_search" -d '{ "query" : { "match" : { "title": { "query": "for loop", "type": "phrase" } } }, "highlight": { "fields" : { "title" : {} } } }'

Slide 25

Slide 25 text

Aggregations • Run statistical operations over your data • Also near real-time! • Complex aggregations are abstracted away behind simple interfaces, so you don’t need to be a statistician

Slide 26

Slide 26 text

Analyzing Tags curl -X POST "http://localhost:9200/stack_overflow/_search" -d '{ "size": 0, "aggs": { "all_tags": { "terms": { "field": "tags", "size": 0 } } } }'

Slide 27

Slide 27 text

Nesting Aggregations curl -X POST "http://localhost:9200/stack_overflow/_search" -d '{ "size": 0, "aggs": { "all_tags": { "terms": { "field": "tags", "size": 0 }, "aggs": { "avg_score": { "avg": { "field": "score"} } } } } }'
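As a rough mental model of what this nested aggregation computes, the Python sketch below buckets posts by tag (the `terms` step) and then averages `score` inside each bucket (the nested `avg` step). The sample posts and the `average_score_by_tag` helper are hypothetical; the real data set lives in the tutorial repository.

```python
from collections import defaultdict


def average_score_by_tag(posts):
    """Bucket posts by tag, then average the score within each bucket."""
    buckets = defaultdict(list)
    for post in posts:
        for tag in post["tags"]:
            buckets[tag].append(post["score"])
    return {
        tag: {"doc_count": len(scores), "avg_score": sum(scores) / len(scores)}
        for tag, scores in buckets.items()
    }


# Hypothetical posts, shaped like the fields used on the slide.
posts = [
    {"tags": ["php", "arrays"], "score": 4},
    {"tags": ["php"], "score": 2},
    {"tags": ["python"], "score": 5},
]

result = average_score_by_tag(posts)
print(result)
```

A post with several tags lands in several buckets, which is also how a `terms` aggregation treats a multi-valued field.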

Slide 28

Slide 28 text

Break Time!

Slide 29

Slide 29 text

Under the Hood • Elasticsearch is designed from the ground up to run in a distributed fashion. • Indices (collections of documents) are partitioned into shards. • Shards can be stored on a single node or on multiple nodes. • Shards are balanced across the cluster to improve performance • Shards are replicated for redundancy and high availability

Slide 30

Slide 30 text

What is a Cluster? • One or more nodes (servers) that work together to… • serve a dataset that exceeds the capacity of a single server… • provide federated indexing (writes) and searching (reads)… • provide H/A through sharing and replication of data

Slide 31

Slide 31 text

What are Nodes? • Individual servers within a cluster • Can provide indexing and searching capabilities

Slide 32

Slide 32 text

What is an Index? • An index is logically a collection of documents, roughly analogous to a database in MySQL • An index is in reality a namespace that points to one or more physical shards which contain data • When indexing a document, if the specified index does not exist, it will be created automatically

Slide 33

Slide 33 text

What are Shards? • Low-level units that hold a slice of available data • A shard represents a single instance of Lucene and is a fully functional, self-contained search engine • Shards are either primaries or replicas and are assigned to nodes

Slide 34

Slide 34 text

What is Replication? • Shards can have replicas • Replicas primarily provide redundancy for when shards/nodes fail • A replica should not be allocated on the same node as the shard it replicates

Slide 35

Slide 35 text

Default Topology • 5 primary shards per index • 1 replica per shard

Slide 36

Slide 36 text

Clustering & Replication [Diagram: two nodes holding five primary shards (P1-P5) and five replica shards (R1-R5)]

Slide 37

Slide 37 text

Cluster Health
curl -X GET "http://localhost:9200/_cluster/health"
curl -X GET "http://localhost:9200/_cat/health?v"

Slide 38

Slide 38 text

_cat API • Display human-readable information about parts of the ES system • Provides some limited documentation of functions

Slide 39

Slide 39 text

aliases
> $ http GET ':9200/_cat/aliases?v'
alias         index                filter  routing.index  routing.search
posts         posts_561729df8ce4e  *       -              -
posts.public  posts_561729df8ce4e  *       -              -
posts.write   posts_561729df8ce4e  -       -              -
Display all configured aliases

Slide 40

Slide 40 text

allocation
> $ http GET ':9200/_cat/allocation?v'
shards  disk.used  disk.avail  disk.total  disk.percent  host
33      2.6gb      21.8gb      24.4gb      10            host1
33      3gb        21.4gb      24.4gb      12            host2
34      2.6gb      21.8gb      24.4gb      10            host3
Show how many shards are allocated per node, with disk utilization info

Slide 41

Slide 41 text

count
> $ http GET ':9200/_cat/count?v'
epoch       timestamp  count
1453790185  06:36:25   182763
> $ http GET ':9200/_cat/count/posts?v'
epoch       timestamp  count
1453790467  06:41:07   164169
> $ http GET ':9200/_cat/count/posts.public?v'
epoch       timestamp  count
1453790472  06:41:12   164169
Display a count of documents in the cluster, or a specific index

Slide 42

Slide 42 text

fielddata
> $ http -b GET ':9200/_cat/fielddata?v'
id                      host   ip             node   total  site_id  published
7tjeJNY3TMajqRkmYsJyrA  host1  10.97.183.146  node1  1.1mb  170.1kb  996.5kb
__xrpsKAQW6yyCY8luLQdQ  host2  10.97.180.138  node2  1.6mb  329.3kb  1.3mb
bdoNNXHXRryj22YqjnqECw  host3  10.97.181.190  node3  1.1mb  154.7kb  991.7kb
Shows how much memory is allocated to fielddata (metadata used for sorts)

Slide 43

Slide 43 text

health
> $ http -b GET ':9200/_cat/health?v'
epoch       timestamp  cluster               status  node.total  node.data  shards  pri  relo  init  unassign  pending_tasks
1453829723  17:35:23   ampehes_prod_cluster  green   3           3          100     50   0     0     0         0

Slide 44

Slide 44 text

indices
> $ http -b GET 'eventhandler-prod.elasticsearch.amppublish.aws.aol.com:9200/_cat/indices?v'
health  status  index                pri  rep  docs.count  docs.deleted  store.size  pri.store.size
green   open    posts_561729df8ce4e  5    1    468629      20905         4gb         2gb
green   open    slideshows           5    1    3893        6             86mb        43mb

Slide 45

Slide 45 text

master
> $ http -b GET ':9200/_cat/master?v'
id                      host   ip             node
7tjeJNY3TMajqRkmYsJyrA  host1  10.97.183.146  node1

Slide 46

Slide 46 text

nodes
> $ http -b GET ':9200/_cat/nodes?v'
host       ip         heap.percent  ram.percent  load  node.role  master  name
127.0.0.1  127.0.0.1  50            100          2.47  d          *       Mentus

Slide 47

Slide 47 text

pending tasks
% curl 'localhost:9200/_cat/pending_tasks?v'
insertOrder  timeInQueue  priority  source
1685         855ms        HIGH      update-mapping [foo][t]
1686         843ms        HIGH      update-mapping [foo][t]
1693         753ms        HIGH      refresh-mapping [foo][[t]]
1688         816ms        HIGH      update-mapping [foo][t]
1689         802ms        HIGH      update-mapping [foo][t]
1690         787ms        HIGH      update-mapping [foo][t]
1691         773ms        HIGH      update-mapping [foo][t]

Slide 48

Slide 48 text

shards
> $ http -b GET ':9200/_cat/shards?v'
index                shard  prirep  state    docs   store    ip             node
posts_561729df8ce4e  2      r       STARTED  94019  410.5mb  10.97.180.138  host1
posts_561729df8ce4e  2      p       STARTED  94019  412.7mb  10.97.181.190  host2
posts_561729df8ce4e  0      p       STARTED  93307  413.6mb  10.97.183.146  host3
posts_561729df8ce4e  0      r       STARTED  93307  415mb    10.97.180.138  host1
posts_561729df8ce4e  3      p       STARTED  94182  407.1mb  10.97.183.146  host2
posts_561729df8ce4e  3      r       STARTED  94182  403.4mb  10.97.180.138  host1
posts_561729df8ce4e  1      r       STARTED  94130  447.1mb  10.97.180.138  host1
posts_561729df8ce4e  1      p       STARTED  94130  447mb    10.97.181.190  host2
posts_561729df8ce4e  4      r       STARTED  93299  421.5mb  10.97.183.146  host3
posts_561729df8ce4e  4      p       STARTED  93299  398.8mb  10.97.181.190  host2

Slide 49

Slide 49 text

segments
> $ http -b GET ':9200/_cat/segments?v'
index                shard  prirep  ip             segment  generation  docs.count  docs.deleted  size     size.memory  committed  searchable  version  compound
posts_561726fecd9c6  0      p       10.97.183.146  _a       10          24          0             227.7kb  69554        true       true        4.10.4   true
posts_561726fecd9c6  0      p       10.97.183.146  _b       11          108         0             659.1kb  103242       true       true        4.10.4   false
posts_561726fecd9c6  0      p       10.97.183.146  _c       12          7           0             90.7kb   54706        true       true        4.10.4   true
posts_561726fecd9c6  0      p       10.97.183.146  _d       13          6           0             82.2kb   49706        true       true        4.10.4   true
posts_561726fecd9c6  0      p       10.97.183.146  _e       14          8           0             119kb    67162        true       true        4.10.4   true
posts_561726fecd9c6  0      p       10.97.183.146  _f       15          1           0             35.9kb   32122        true       true        4.10.4   true
posts_561726fecd9c6  0      r       10.97.180.138  _a       10          24          0             227.7kb  69554        true       true        4.10.4   true
posts_561726fecd9c6  0      r       10.97.180.138  _b       11          108         0             659.1kb  103242       true       true        4.10.4   false

Slide 50

Slide 50 text

CRUD Operations

Slide 51

Slide 51 text

Document Model • Documents represent objects • By default, all fields in all documents are analyzed and indexed

Slide 52

Slide 52 text

Metadata • _index - The index in which a document resides • _type - The class of object that a document represents • _id - The document’s unique identifier. Auto-generated when not provided

Slide 53

Slide 53 text

Retrieving Documents
curl -X GET "http://localhost:9200/test_document/test/1"
curl -X HEAD "http://localhost:9200/test_document/test/1"
curl -X HEAD "http://localhost:9200/test_document/test/2"

Slide 54

Slide 54 text

Updating Documents curl -X PUT "http://localhost:9200/test_document/test/1" -d '{ "name": "test_name", "conference": "php benelux" }' curl -X GET "http://localhost:9200/test_document/test/1"

Slide 55

Slide 55 text

Explicit Creates curl -X PUT "http://localhost:9200/test_document/test/1/_create" -d '{ "name": "test_name", "conference": "php benelux" }'

Slide 56

Slide 56 text

Auto-Generated IDs curl -X POST "http://localhost:9200/test_document/test" -d '{ "name": "test_name", "conference": "php benelux" }'

Slide 57

Slide 57 text

Deleting Documents curl -X DELETE "http://localhost:9200/test_document/test/1"

Slide 58

Slide 58 text

Bulk API • Perform many operations in a single request • Efficient batching of actions • Bulk queries take the form of a stream of single-line JSON objects that define actions and document bodies

Slide 59

Slide 59 text

Bulk Actions • create - Index a document IFF it doesn’t exist already • index - Index a document, replacing it if it exists • update - Apply a partial update to a document • delete - Delete a document

Slide 60

Slide 60 text

Bulk API Format
{ action: { metadata }}\n
{ request body }\n
{ action: { metadata }}\n
{ request body }\n
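The format above can be produced programmatically. The following Python sketch serializes a list of documents into newline-delimited `index` actions; the `build_bulk_body` helper and the sample documents are illustrative assumptions, and sending the result to `/_bulk` is left to whichever HTTP client you use.

```python
import json


def build_bulk_body(index, doc_type, docs):
    """Serialize (doc_id, source) pairs as newline-delimited bulk 'index' actions."""
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))  # action and metadata line
        lines.append(json.dumps(source))  # request body line
    # Every line, including the last one, must end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body(
    "test_document",
    "test",
    [(1, {"name": "test_name"}), (2, {"name": "other_name"})],
)
print(body, end="")
```

Because `json.dumps` never emits raw newlines by default, each action and each body stays on a single line, which is what the bulk parser expects.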

Slide 61

Slide 61 text

Sizing Bulk Requests • Balance quantity of documents with size of documents • The docs put the sweet spot at 5-15 MB per request • AOL Analytics Cluster indexes 5000 documents per batch (approx 7MB)

Slide 62

Slide 62 text

Searching Documents • Structured queries - queries against concrete fields like “title” or “score” which return specific documents. • Full-text queries - queries that find documents which match a search query and return them sorted by relevance

Slide 63

Slide 63 text

Search Elements • Mappings - Define how data in fields is interpreted • Analysis - How text is parsed and processed to make it searchable • Query DSL - Elasticsearch’s query language

Slide 64

Slide 64 text

About Queries • Leaf Queries - Search for a value in a given field. These queries are standalone. Examples: match, range, term • Compound Queries - Combinations of leaf queries and other compound queries which either combine operations logically (e.g. bool queries) or alter their behavior (e.g. score queries)

Slide 65

Slide 65 text

Empty Search curl -X GET "http://localhost:9200/stack_overflow/_search" curl -X POST "http://localhost:9200/stack_overflow/_search" -d '{ "query": { "match_all": {} } }'

Slide 66

Slide 66 text

Timing Out Searches curl -X GET "http://localhost:9200/stack_overflow/_search?timeout=1s" curl -X POST "http://localhost:9200/stack_overflow/_search" -d '{ "timeout": "1s", "query": { "match_all": {} } }'

Slide 67

Slide 67 text

Multi-Index/Type Searches curl -X GET "http://localhost:9200/test_document,stack_overflow/_search"

Slide 68

Slide 68 text

Multi-Index Use Cases • Dated indices for logging • Roll-off indices for content-aging • Analytic roll-ups
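For dated indices, the list of index names to search is usually generated rather than typed. The Python sketch below builds one name per day and joins them into a multi-index search URL; the `logs-YYYY.MM.DD` naming scheme and the `dated_indices` helper are assumptions chosen for illustration, not something the slides prescribe.

```python
from datetime import date, timedelta


def dated_indices(prefix, start, end):
    """Return one index name per day from start to end, inclusive."""
    days = (end - start).days
    return [
        f"{prefix}-{(start + timedelta(days=offset)).strftime('%Y.%m.%d')}"
        for offset in range(days + 1)
    ]


names = dated_indices("logs", date(2016, 1, 25), date(2016, 1, 27))
# Comma-separated index names slot into the multi-index search URL form.
print("http://localhost:9200/" + ",".join(names) + "/_search")
```

Aging out old content then becomes a matter of deleting whole indices instead of deleting documents one by one.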

Slide 69

Slide 69 text

Pagination curl -X GET "http://localhost:9200/stack_overflow/_search?size=5&from=5" curl -X POST "http://localhost:9200/stack_overflow/_search" -d '{ "size": 5, "from": 5, "query": { "match_all": {} } }'

Slide 70

Slide 70 text

Pagination Concerns • Since searches are distributed across multiple shards, paged queries must be sorted at each shard, combined, and resorted • The cost of paging in distributed data sets rises steeply the deeper you page • It is a wise practice to set limits to how many pages of results can be returned
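A rough way to see why deep pages get expensive: every shard has to produce its own top `from + size` hits, and the coordinating node then has to merge all of them. The Python sketch below is a simplified model of that work, assuming the default of 5 primary shards; the `docs_sorted_by_coordinator` helper is hypothetical.

```python
def docs_sorted_by_coordinator(page_from, size, shards=5):
    """Each shard returns its top (from + size) hits; the coordinator merges them all."""
    return shards * (page_from + size)


# Work needed for page 1, page 100, and page 1000 at 10 results per page.
for page in (1, 100, 1000):
    page_from = (page - 1) * 10
    print(page, docs_sorted_by_coordinator(page_from, 10))
```

In this model a request for ten results on page 1000 makes the cluster sort tens of thousands of hits and throw almost all of them away, which is the motivation for capping page depth.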

Slide 71

Slide 71 text

Full Text Queries • match - Basic term matching query • multi_match - Match which spans multiple fields • common_terms - Match query which gives preference to uncommon words • query_string - Match documents using a search “mini-dsl” • simple_query_string - A simpler version of query_string that never throws exceptions, suitable for exposing to users

Slide 72

Slide 72 text

Term Queries • term - Search for an exact value • terms - Search for any of several exact values in a field • range - Find documents where a value is in a certain range • exists - Find documents that have any non-null value in a field • missing - Inversion of `exists` • prefix - Match terms that begin with a string • wildcard - Match terms with a wildcard • regexp - Match terms against a regular expression • fuzzy - Match terms with configurable fuzziness

Slide 73

Slide 73 text

Compound Queries • constant_score - Wraps a query in filter context, giving all results a constant score • bool - Combines multiple leaf queries with `must`, `should`, `must_not` and `filter` clauses • dis_max - Similar to bool, but creates a union of subquery results, scoring each document with the maximum score of the query that produced it • function_score - Modifies the scores of documents returned by a query. Useful for altering the distribution of results based on recency, popularity, etc. • boosting - Takes a `positive` and `negative` query, returning the results of `positive` while reducing the scores of documents that also match `negative` • filtered - Combines a query clause in query context with one in filter context • limit - Perform the query over a limited number of documents in each shard
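Since `bool` accepts a `filter` clause, a query-plus-filter search can be written with `bool` instead of `filtered` (to my knowledge `filtered` is deprecated as of Elasticsearch 2.0, but check the documentation for your version). The Python sketch below builds such a request body; the `title` and `score` fields come from the earlier Stack Overflow examples, and the body is only printed, not sent.

```python
import json

# Scored full-text clause in "must", unscored range clause in "filter".
query = {
    "query": {
        "bool": {
            "must": {"match": {"title": "php"}},
            "filter": {"range": {"score": {"gt": 3}}},
        }
    }
}

print(json.dumps(query, indent=2))
```

The printed JSON can be passed to curl with `-d` against `stack_overflow/_search` in the same way as the other request bodies in this deck.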

Slide 74

Slide 74 text

What are Mappings? • Similar to schemas, they define the types of data found in fields • Determines how individual fields are analyzed & stored • Sets the format of date fields • Sets rules for mapping dynamic fields

Slide 75

Slide 75 text

Mapping Types • Indices have one or more mapping types which group documents logically. • Types contain meta fields, which can be used to customize metadata like _index, _id, _type, and _source • Types can also list fields that have consistent structure across types.

Slide 76

Slide 76 text

Data Types • Scalar Values - string, long, double, boolean • Special Scalars - date, ip • Structural Types - object, nested • Special Types - geo_shape, geo_point, completion • Compound Types - string arrays, nested objects

Slide 77

Slide 77 text

Dynamic vs Explicit Mapping • Dynamic fields are not defined prior to indexing • Elasticsearch selects the most likely type for dynamic fields, based on configurable rules • Explicit fields are defined exactly prior to indexing • Types cannot accept data that is the wrong type for an explicit mapping

Slide 78

Slide 78 text

Shared Fields • Fields that are defined in multiple mapping types must be identical if: • They have the same name • They live in the same index • They map to the same field internally

Slide 79

Slide 79 text

Examining Mappings curl -X GET "http://localhost:9200/stack_overflow/post/_mapping"

Slide 80

Slide 80 text

Dynamic Mappings • Mappings are generated when a type is created, if no mapping was previously specified. • Elasticsearch is good at identifying fields much of the time, but it’s far from perfect! • Fields can contain basic data-types, but importantly, mappings optimize a field for either structured (exact) or full-text searching

Slide 81

Slide 81 text

Structured Data vs Full Text • Exact values contain exact strings which are not subject to natural language interpretation. • Full-text values must be interpreted in the context of natural language

Slide 82

Slide 82 text

Exact Value • “[email protected]” is an email address in all contexts

Slide 83

Slide 83 text

Natural Language • “us” can be interpreted differently in natural language • Abbreviation for “United States” • The English dative personal pronoun • An alternative symbol for µs • The French word us

Slide 84

Slide 84 text

Analyzing Text • Elasticsearch is optimized for full text search • Text is analyzed in a two-step process • First, text is tokenized into individual terms • Second, terms are normalized through a filter

Slide 85

Slide 85 text

Analyzers • Analyzers perform the analysis process • Character filters clean up text, removing or modifying the text • Tokenizers break the text down into terms • Token filters modify, remove, or add terms
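The three stages can be sketched as a tiny Python pipeline. This is a stand-in for illustration only, not the implementation of any built-in analyzer: the regular expressions below are much cruder than the real character filters and tokenizers, and all function names are made up.

```python
import re


def strip_html(text):
    """Character filter: drop anything that looks like an HTML tag."""
    return re.sub(r"<[^>]+>", "", text)


def tokenize(text):
    """Tokenizer: split on runs of non-word characters."""
    return [token for token in re.split(r"\W+", text) if token]


def lowercase(tokens):
    """Token filter: normalize every term to lowercase."""
    return [token.lower() for token in tokens]


def analyze(text):
    """Run the stages in order: character filter, tokenizer, token filter."""
    return lowercase(tokenize(strip_html(text)))


terms = analyze("<p>Reverse text with strrev($text)!</p>")
print(terms)
```

Swapping any one stage for another (a different tokenizer, an extra token filter) changes the terms that end up in the index, which is what distinguishes one analyzer from the next.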

Slide 86

Slide 86 text

Standard Analyzer • General purpose analyzer that works for most natural language. • Splits text on word boundaries, removes punctuation, and lowercases all tokens.

Slide 87

Slide 87 text

Standard Analyzer curl -X GET 'http://localhost:9200/_analyze?analyzer=standard&text=Reverse+text+with+strrev($text)!'

Slide 88

Slide 88 text

Whitespace Analyzer • Analyzer that splits on whitespace only; unlike the standard analyzer, it does not lowercase tokens

Slide 89

Slide 89 text

Whitespace Analyzer curl -X GET 'http://localhost:9200/_analyze?analyzer=whitespace&text=Reverse+text+with+strrev($text)!'

Slide 90

Slide 90 text

Keyword Analyzer • Tokenizes the entire text as a single string. • Used for things that should be kept whole, like ID numbers, postal codes, etc

Slide 91

Slide 91 text

Keyword Analyzer curl -X GET 'http://localhost:9200/_analyze?analyzer=keyword&text=Reverse+text+with+strrev($text)!'

Slide 92

Slide 92 text

Language Analyzers • Analyzers optimized for specific natural languages. • Reduce tokens to stems (jumper, jumped → jump)

Slide 93

Slide 93 text

Language Analyzers curl -X GET 'http://localhost:9200/_analyze?analyzer=english&text=Reverse+text+with+strrev($text)!'

Slide 94

Slide 94 text

Analyzers • Analyzers are applied when documents are indexed • Analyzers are applied when a full-text search is performed against a field, in order to produce the correct set of terms to search for

Slide 95

Slide 95 text

Character Filters • html_strip - Removes HTML from text • mapping - Filter based on a map of original → new ({ "ph": "f" }) • pattern_replace - Similar to mapping, using regular expressions

Slide 96

Slide 96 text

Index Templates • Template mappings that are applied to newly created indices • Templates also contain index configuration information • Powerful when combined with dated indices

Slide 97

Slide 97 text

Scoring • Scoring is based on a boolean model and a scoring function • The boolean model applies AND/OR logic to an inverted index to produce a list of matching documents

Slide 98

Slide 98 text

Term Frequency • Terms that appear frequently in a document increase the document’s relevancy score. • term_frequency(term in document) = √number_of_appearances

Slide 99

Slide 99 text

Inverse Document Frequency • Terms that appear in many documents reduce a document’s relevancy score • inverse_doc_frequency(term) = 1 + log(number_of_docs / (frequency + 1))

Slide 100

Slide 100 text

Field Length Normalization • Terms that appear in shorter fields increase the relevancy of a document. • norm(document) = 1 / √number_of_terms

Slide 101

Slide 101 text

Example from the Docs • Given the text “quick brown fox” the term “fox” scores… • Term Frequency: 1.0 • Inverse Doc Frequency: 0.30685282 • Field Norm: 0.5 • Score: 0.15342641
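The numbers in this example can be reproduced from the three formulas on the preceding slides. The Python sketch below assumes an index holding a single document that contains "fox" once, which is what yields the IDF shown. The field norm is taken from the slide as 0.5 rather than computed: 1/√3 is about 0.577, and the gap is most likely down to Lucene storing norms at reduced precision.

```python
import math


def term_frequency(appearances):
    """Square root of the number of times the term appears in the field."""
    return math.sqrt(appearances)


def inverse_doc_frequency(number_of_docs, docs_with_term):
    """1 + log(number_of_docs / (docs_with_term + 1)), using the natural log."""
    return 1 + math.log(number_of_docs / (docs_with_term + 1))


tf = term_frequency(1)             # "fox" appears once in the field
idf = inverse_doc_frequency(1, 1)  # assumed: one document, and it contains "fox"
field_norm = 0.5                   # value taken from the slide, not computed here
score = tf * idf * field_norm

print(tf, round(idf, 8), round(score, 8))
```

Multiplying the three factors gives the score for a single term in a single field; a multi-term query combines several such per-term scores.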

Slide 102

Slide 102 text

Basic Relevancy { "size": 100, "query": { "filtered": { "query": { "match": { "contents": "miley cyrus" } }, "filter": { "and": [ { "terms": { "site_id": [ 698 ] } } ] } } } }

Slide 103

Slide 103 text

Non-Preferenced Result Recency

Slide 104

Slide 104 text

Recency-Adjusted Query { "query": { "function_score": { "functions": [ { "gauss": { "published": { "origin": "now", "scale": "10d", "offset": "1d", "decay": 0.3 } } } ], "query": { "filtered": { "query": { "match": { "contents": "miley cyrus" } }, "filter": { "and": [ { "terms": { "site_id": [ 698 ] } } ] } } } } } }
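To get a feel for what the `gauss` parameters do, the Python sketch below evaluates the Gaussian decay curve as I understand it from the function_score documentation: documents within `offset` of the origin keep a multiplier of 1.0, and the multiplier falls to `decay` at a distance of `offset + scale`. Distances are expressed in days, and the `gauss_decay` helper is illustrative rather than Elasticsearch's own code.

```python
import math


def gauss_decay(distance, scale, offset, decay):
    """Gaussian decay: 1.0 inside the offset, `decay` at offset + scale."""
    sigma_squared = -(scale ** 2) / (2 * math.log(decay))
    effective = max(0.0, abs(distance) - offset)
    return math.exp(-(effective ** 2) / (2 * sigma_squared))


# Parameters from the slide: scale 10d, offset 1d, decay 0.3.
for age_in_days in (0, 1, 11, 30):
    multiplier = gauss_decay(age_in_days, scale=10, offset=1, decay=0.3)
    print(age_in_days, round(multiplier, 4))
```

With these settings a post published 11 days ago keeps 30% of its relevance score, and much older posts drop away quickly without being excluded outright.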

Slide 105

Slide 105 text

Preferenced Result Recency

Slide 106

Slide 106 text

Aggregations & Analytics

Slide 107

Slide 107 text

Importing Energy Data
curl -X PUT "http://localhost:9200/energy_use" --data-binary "@queries/mapping_energy.json"
curl -X PUT "http://localhost:9200/_bulk" --data-binary "@queries/bulk_insert_energy_data.json"
curl -X GET "http://localhost:9200/energy_use/_search"

Slide 108

Slide 108 text

Average Energy Use curl -X POST "http://localhost:9200/energy_use/_search" -d '{ "size": 0, "aggs": { "average_laundry_use": { "avg": { "field": "laundry" } }, "average_kitchen_use": { "avg": { "field": "kitchen" } }, "average_heater_use": { "avg": { "field": "heater" } }, "average_other_use": { "avg": { "field": "other" } } } }'

Slide 109

Slide 109 text

Multiple Aggregations curl -X POST "http://localhost:9200/energy_use/_search" -d '{ "size": 0, "aggs": { "average_laundry_use": { "avg": { "field": "laundry" } }, "min_laundry_use": { "min": { "field": "laundry"} }, "max_laundry_use": { "max": { "field": "laundry"} } } }'

Slide 110

Slide 110 text

Nesting Aggregations curl -X POST "http://localhost:9200/energy_use/_search" -d '{ "size": 0, "aggs": { "by_date": { "terms": { "field": "date" }, "aggs": { "average_laundry_use": { "avg": { "field": "laundry" } }, "min_laundry_use": { "min": { "field": "laundry"} }, "max_laundry_use": { "max": { "field": "laundry"} } } } } }'

Slide 111

Slide 111 text

Stats/Extended Stats curl -X POST "http://localhost:9200/energy_use/_search" -d '{ "size": 0, "aggs": { "by_date": { "terms": { "field": "date" }, "aggs": { "laundry_stats": { "extended_stats": { "field": "laundry" } } } } } }'
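For readers unsure what `extended_stats` adds over the individual `avg`, `min`, and `max` aggregations, the Python sketch below computes the same kind of summary for one bucket. The readings are invented, the helper is hypothetical, and the use of population variance is an assumption about how the aggregation defines it.

```python
import math


def extended_stats(values):
    """Compute the kind of summary an extended_stats aggregation reports."""
    count = len(values)
    total = sum(values)
    avg = total / count
    sum_of_squares = sum(v * v for v in values)
    variance = sum_of_squares / count - avg ** 2  # population variance (assumed)
    return {
        "count": count,
        "min": min(values),
        "max": max(values),
        "avg": avg,
        "sum": total,
        "sum_of_squares": sum_of_squares,
        "variance": variance,
        "std_deviation": math.sqrt(variance),
    }


# Hypothetical laundry readings for a single date bucket.
stats = extended_stats([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(stats)
```

Nesting this under a `terms` aggregation on `date`, as the slide does, produces one such summary per date.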

Slide 112

Slide 112 text

Bucket Aggregations • Date Histogram • Term/Terms • Geo* • Significant Terms

Slide 113

Slide 113 text

Questions? Use Cases? Exploration Ideas? https://joind.in/talk/e2e4b