
ILM Meetup Presentation

AleBroff
March 10, 2020


This presentation describes why Index Lifecycle Management (ILM) is needed for time series data and what advantages it brings.



Transcript

  1. © Elasticsearch BV 2015-2017. All rights reserved. Time Series Data

    • Logs, social media streams, time-based events
    • Timestamp + data
    • Documents do not change once written
    • Searches typically target recent events
    • Older documents become less important
    • Data size is hard to predict
    • Time-based indices are the best option
      ‒ create a new index each day, week, month, year, ...
      ‒ search the indices you need in the same request
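The time-based index idea can be sketched in a few lines of Python. This is only an illustration: the `logs-YYYY.MM.DD` naming scheme and the helper names `daily_indices` and `search_path` are assumptions, not something the slides prescribe. The sketch builds one index name per day and then a single search path that targets all of them in one request.

```python
from datetime import date, timedelta
from typing import List


def daily_indices(prefix: str, start: date, end: date) -> List[str]:
    """Return one index name per day from start to end, inclusive."""
    days = (end - start).days
    return [
        f"{prefix}-{(start + timedelta(days=i)).strftime('%Y.%m.%d')}"
        for i in range(days + 1)
    ]


def search_path(indices: List[str]) -> str:
    """Build a search path that targets several indices in one request."""
    return "/" + ",".join(indices) + "/_search"


# Hypothetical example: the three daily indices covering 8-10 March 2020.
names = daily_indices("logs", date(2020, 3, 8), date(2020, 3, 10))
print(search_path(names))
```

Listing indices explicitly keeps the search limited to the days that matter; a wildcard such as `logs-*` would search every daily index instead.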
  2. How do we handle Time Series Data?

    [Diagram: my_cluster holding index logs-day1, with read-alias and write-alias; the clients are users and Application]
  3. How do we handle Time Series Data?

    [Diagram: my_cluster holding indices logs-day1 and logs-day2, with read-alias and write-alias; the clients are users and Application]
  4. How do we handle Time Series Data?

    [Diagram: my_cluster holding indices logs-day1, logs-day2 and logs-day3, with read-alias and write-alias; the clients are users and Application]
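The alias handling shown in these diagrams can be sketched as a request body for the Elasticsearch `_aliases` API, which applies all listed actions atomically. The sketch below only builds the body and does not send it; the function name `move_write_alias` is hypothetical, and the assumption that the read alias spans every daily index while the write alias points only at the newest one is an interpretation of the diagrams.

```python
import json
from typing import Dict, List


def move_write_alias(old_index: str, new_index: str,
                     write_alias: str = "write-alias",
                     read_alias: str = "read-alias") -> Dict[str, List[dict]]:
    """Body for POST /_aliases: repoint the write alias, extend the read alias."""
    return {
        "actions": [
            # Stop writing to the previous day's index.
            {"remove": {"index": old_index, "alias": write_alias}},
            # Direct new documents to the new index.
            {"add": {"index": new_index, "alias": write_alias}},
            # Make the new index searchable through the read alias.
            {"add": {"index": new_index, "alias": read_alias}},
        ]
    }


body = move_write_alias("logs-day1", "logs-day2")
print(json.dumps(body, indent=2))
```

Doing this by hand every day is exactly the chore that the rollover mechanism described later in the deck automates.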
  5. Challenges with Time Series Data

    Data Validity
    • Data loses importance over time. Older indices are queried less frequently.
    • Can we optimise the storage for this data?

    Controlling Index Size
    • Index size is hard to predict.
    • Can we control the size and time validity of an index?
  6. Data Nodes

    Copyright Elasticsearch BV 2015-2020. Copying, publishing and/or distributing without written permission is strictly prohibited.

    • Data nodes have two main features:
      ‒ they hold the shards that contain the documents you have indexed
      ‒ they execute data-related operations like CRUD, search, and aggregations
    • All nodes are data nodes by default
      ‒ configured using the node.data property
  7. Hot/Warm/Cold Architecture

    • You can configure the data nodes in your cluster to use a hot/warm/cold architecture
      ‒ useful for scenarios where you want to control which nodes perform indexing vs. query handling
    • Fine-grained control over data allocation
    • Data nodes can be used as:
      ‒ hot nodes: hold the indices that new documents are being written to
      ‒ warm nodes: hold read-only indices that are less likely to be queried frequently
      ‒ cold nodes: hold frozen indices, which no longer occupy memory and are much slower to search; data is deleted after a retention period
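One common way to get this kind of allocation control is to tag each node with a custom attribute in `elasticsearch.yml` (for example `node.attr.box_type: warm`) and then require that attribute in the index settings. The attribute name `box_type` and the helper `allocation_settings` are assumptions made for this sketch; the slides do not say which attribute the cluster uses. The function only builds the body for `PUT /<index>/_settings`.

```python
from typing import Dict

TIERS = ("hot", "warm", "cold")


def allocation_settings(tier: str, attribute: str = "box_type") -> Dict[str, str]:
    """Settings body that pins an index to nodes tagged with the given tier."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    # index.routing.allocation.require.<attr> restricts shard allocation
    # to nodes whose node.attr.<attr> value matches.
    return {f"index.routing.allocation.require.{attribute}": tier}


# Hypothetical example: move yesterday's index to the warm nodes.
print(allocation_settings("warm"))
```

Changing this setting on an existing index is enough to make Elasticsearch relocate its shards to matching nodes.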
  8. Hot Nodes

    • Use hot nodes for indexing
      ‒ indexing is a CPU- and IO-intensive operation, so hot nodes should be powerful servers
      ‒ hot nodes use faster storage than the warm nodes

    [Diagram: my_cluster with hot_node1, hot_node2 and hot_node3 indexing incoming documents; the examples shown are stock trade records (volume, high, low, close, stock_symbol, trade_date) and a tweet (username, tweet, tweet_time)]
  9. Warm Nodes

    • Use warm nodes for older, read-only indices
      ‒ indices are shrunk with the _shrink API to reduce their shard count and overhead
      ‒ warm nodes tend to use large attached disks (usually spinning disks)
      ‒ larger amounts of data may require additional nodes to meet performance requirements

    [Diagram: my_cluster with hot_node1 to hot_node3 and warm_node1 to warm_node6, with this search request sent to the cluster:]
    GET tweets*/_search { "query": { "match": { "tweet": "elastic" } } }
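The shrink step mentioned above has two prerequisites in Elasticsearch: the source index must block writes, and a copy of every shard must sit on a single node. The sketch below builds the two requests without sending them. The function name `shrink_requests`, the node name, and the shard counts are hypothetical; the `shrink-` prefix follows the index names used later in the demo slides.

```python
from typing import List, Optional, Tuple

Request = Tuple[str, str, Optional[dict]]


def shrink_requests(index: str, node_name: str,
                    source_shards: int, target_shards: int = 1) -> List[Request]:
    """Return the (method, path, body) requests needed to shrink an index."""
    # The target shard count must be a factor of the source shard count.
    if target_shards < 1 or source_shards % target_shards != 0:
        raise ValueError("target_shards must be a factor of source_shards")
    prepare = ("PUT", f"/{index}/_settings", {
        "settings": {
            # Make the index read-only for writes.
            "index.blocks.write": True,
            # Collect a copy of every shard on one node.
            "index.routing.allocation.require._name": node_name,
        }
    })
    shrink = ("POST", f"/{index}/_shrink/shrink-{index}", {
        "settings": {"index.number_of_shards": target_shards}
    })
    return [prepare, shrink]


for method, path, _body in shrink_requests("logs-day1", "warm_node1", 4):
    print(method, path)
```

With ILM, the shrink action in the warm phase performs these steps automatically, which is why the demo later shows `shrink-` indices appearing without manual requests.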
  10. Cold Nodes

    • Use cold nodes for frozen indices
      ‒ a frozen index has almost no overhead on the cluster; its shard memory is moved to persistent storage
      ‒ you can still search a frozen index, but searches take longer

    [Diagram: my_cluster with hot_node1 to hot_node5, warm_node1 to warm_node4, and cold_node1 and cold_node2]

    Hot nodes: indexing and heavy searching
    Warm nodes: no indexing and moderate searching
    Cold nodes: no indexing and rare searching
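As a rough sketch of what freezing looks like at the API level, the helpers below build the freeze request and a search path for a frozen index. This assumes the Elasticsearch 7.x releases that were current when the talk was given; the `_freeze` endpoint was deprecated and then removed in later major versions, so treat it as historical. The helper names are hypothetical.

```python
from typing import Tuple


def freeze_request(index: str) -> Tuple[str, str]:
    """Request that freezes an index (Elasticsearch 7.x era API)."""
    return ("POST", f"/{index}/_freeze")


def frozen_search_path(index: str) -> str:
    """Search path that explicitly includes frozen (throttled) indices."""
    # In 7.x, searches skipped frozen indices unless ignore_throttled=false was set.
    return f"/{index}/_search?ignore_throttled=false"


print(freeze_request("logs-day1"))
print(frozen_search_path("logs-day1"))
```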
  11. ILM = Index Lifecycle Management

    • ILM simplifies managing indices in hot-warm-cold architectures by letting you define a lifecycle policy that controls how an index moves between phases.
    • Available in the Kibana UI or via the ILM REST APIs
  12. Lifecycle Policies

    • The “what to do” and “when to do it” are defined by lifecycle policies
      ‒ defined either using the API or the Kibana UI
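To make "what to do" and "when to do it" concrete, here is a minimal sketch of a policy body for `PUT /_ilm/policy/<name>`, following the 7.x policy format. All thresholds, the `box_type` node attribute, and the function name `lifecycle_policy` are assumptions chosen for illustration, and the cold phase from the earlier slides is left out to keep the example short.

```python
import json


def lifecycle_policy(max_size: str = "50gb", max_age: str = "1d",
                     warm_after: str = "7d", delete_after: str = "30d") -> dict:
    """Build an ILM policy body with hot, warm and delete phases."""
    return {
        "policy": {
            "phases": {
                # Hot: roll over to a new index when either limit is reached.
                "hot": {
                    "actions": {
                        "rollover": {"max_size": max_size, "max_age": max_age}
                    }
                },
                # Warm: after rollover plus min_age, shrink and move to warm nodes.
                "warm": {
                    "min_age": warm_after,
                    "actions": {
                        "shrink": {"number_of_shards": 1},
                        "allocate": {"require": {"box_type": "warm"}},
                    },
                },
                # Delete: remove the index once the retention period has passed.
                "delete": {
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


print(json.dumps(lifecycle_policy(), indent=2))
```

Each phase states its actions ("what") and, through `min_age` or the rollover conditions, its trigger ("when").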
  13. The process in a nutshell

    • To automate rollover and management of time-series indices with ILM, you:
      ‒ create a lifecycle policy
      ‒ create an index template that applies the policy to every new index and specifies the rollover alias
      ‒ bootstrap an index as the initial write index (set up an alias)
      ‒ verify that indices are moving through the lifecycle phases
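The steps after policy creation can be sketched as three requests. This is an illustration under assumptions: it uses the legacy `_template` endpoint from the 7.x releases of the time, and the policy name, alias, template name and helper `ilm_setup_requests` are hypothetical. The first index name is taken from the demo slides. Nothing is sent; the function only returns (method, path, body) tuples.

```python
from typing import List, Optional, Tuple

Request = Tuple[str, str, Optional[dict]]


def ilm_setup_requests(policy: str, alias: str, first_index: str) -> List[Request]:
    """Template, bootstrap and verification requests for an ILM-managed alias."""
    template = ("PUT", f"/_template/{alias}-template", {
        "index_patterns": [f"{alias}-*"],
        "settings": {
            # Attach the lifecycle policy to every matching new index.
            "index.lifecycle.name": policy,
            # Tell the rollover action which alias to move.
            "index.lifecycle.rollover_alias": alias,
        },
    })
    bootstrap = ("PUT", f"/{first_index}", {
        # The first index carries the alias and is marked as the write index.
        "aliases": {alias: {"is_write_index": True}}
    })
    # The explain API reports the current phase, action and step per index.
    verify = ("GET", f"/{alias}-*/_ilm/explain", None)
    return [template, bootstrap, verify]


for method, path, _body in ilm_setup_requests(
        "logs-policy", "logs-day-test", "logs-day-test-00001"):
    print(method, path)
```

The application then writes only to the alias, and ILM creates and switches to each new index when the rollover conditions in the policy are met.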
  14. Demo Environment

    [Diagram: four nodes: Node1 (HOT), Node2 (HOT), Node3 (WARM), Node4 (WARM)]
  15. Demo Environment

    [Diagram: Node1 (HOT), Node2 (HOT), Node3 (WARM), Node4 (WARM); index shown: logs-day-test-00001]
  16. Demo Environment

    [Diagram: Node1 (HOT), Node2 (HOT), Node3 (WARM), Node4 (WARM); indices shown: logs-day-test-00002 and shrink-logs-day-test-00001]
  17. Demo Environment

    [Diagram: Node1 (HOT), Node2 (HOT), Node3 (WARM), Node4 (WARM); indices shown: logs-day-test-00003, shrink-logs-day-test-00001 and shrink-logs-day-test-00002]