5 years of running Elasticsearch in production

At today's "Search Usergroup Berlin" ([1]) I gave this talk about how we operate Elasticsearch in production here at Infopark ([2]).

In this presentation I show our Elasticsearch cluster setup and the lessons we learned over the years.

This presentation was prepared by Anne Schulz ([3]) and me ([4]). All cat pictures without a source reference are by Anne.

[1] https://www.meetup.com/de-DE/Search-UG-Berlin/events/239101829/
[2] https://infopark.com/
[3] https://twitter.com/AnneMoneSchulz
[4] https://twitter.com/_apepper

Alexander Pepper

May 30, 2017

Transcript

  1. 5 years of running Elasticsearch in production

  2. Alexander Pepper alexander.pepper@infopark.de +49 30 747993-0 @apepper @_apepper

  3. 5 years of running Elasticsearch in production

  4. None
  5. Index Size • ~8 million documents • ~45 GB data • ~300 search requests/min • ~120 index requests/min
  6. Our history with Elasticsearch • 2011: started with version 0.17 • 2014: migrated to 1.x (with new setup, regular maintenance and backups) • 2016: migrated to 2.x
  7. Index Size • 100 shards • 2 replicas
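
To make the shard numbers on this slide concrete, the following minimal sketch works out how many shard copies the three-node cluster has to hold. The function name and the assumption that copies spread evenly across nodes are illustrative and not taken from the talk; `number_of_shards` and `number_of_replicas` are the standard Elasticsearch index settings.

```python
# Sketch only: the shard and replica counts come from the slide; the
# function name and the even-spread assumption are illustrative.

def shard_layout(primaries: int, replicas: int, nodes: int) -> dict:
    """Return the total number of shard copies and the average per node."""
    total_copies = primaries * (1 + replicas)  # each primary plus its replicas
    return {
        "total_shard_copies": total_copies,
        "avg_copies_per_node": total_copies / nodes,
        # Elasticsearch never places two copies of the same shard on one
        # node, so all copies can only be assigned if nodes >= 1 + replicas.
        "all_copies_assignable": nodes >= 1 + replicas,
    }


# Settings body for the create-index API, using the values from the slide.
INDEX_SETTINGS = {"settings": {"number_of_shards": 100, "number_of_replicas": 2}}

layout = shard_layout(primaries=100, replicas=2, nodes=3)
```

With 2 replicas on 3 nodes, every node ends up holding one copy of every shard, so any single node carries the full data set.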

  8. Overview • Infrastructure • Installation • Backup • Monitoring • Maintenance • Pitfalls
  9. Infrastructure

  10. Cluster Location • Amazon Web Services (AWS) • Region: eu-west-1 (Ireland) • Using AWS Elastic Compute Cloud (EC2) • Management by AWS OpsWorks • Not accessible via the internet
  11. None
  12. 3x EC2 Instances • r3.xlarge instance type • CPU: Intel Xeon 2.5 GHz • RAM: 30 GB • Hard drive: 80 GB SSD • OS: Amazon Linux (based on Red Hat)
  13. Cluster Discovery • Internal • cloud-aws plugin • Based on EC2 tags
  14. Cluster Discovery • External • Private instances inside a Virtual Private Cloud (VPC) • AWS Elastic Load Balancer (ELB), only accessible from the VPC • API instances have access to the ELB
  15. None
  16. VPC Pitfalls • Network Address Translation (NAT) instance needed • Disable OpsWorks auto healing (for private instances)
  17. Installation

  18. Installation • OpsWorks uses Chef Cookbooks • Comparable to Ansible and Puppet • Standard Cookbooks from https://supermarket.chef.io • Custom Cookbooks
  19. None
  20. None
  21. Packaging • On AWS Simple Storage Service (S3): • Cookbooks • Java • Elasticsearch • Elasticsearch plugins
  22. Cookbooks • disable swappiness • mount data volume • install Java • install Elasticsearch (with Monit) • install Elasticsearch plugins (Kibana, Marvel, Sense, etc.) • install backups • install monitoring
  23. Backup

  24. Backup • Snapshots since Elasticsearch version 1.0 • Point-in-time copy of all Elasticsearch data
  25. Backup Cronjob • Ruby script • only backup on master node • daily snapshot repository on AWS S3 • 30 days data retention • 1st of month 365 days data retention • data retention via S3 lifecycle rules • hourly incremental backup • current size per day: 50 GB
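
The backup rules on this slide can be sketched as a few small functions. This is a minimal Python sketch, not the Ruby script used in production: the naming schemes, the function names, and the reading "one repository per day, one snapshot per hour" are assumptions for illustration. The request path follows the Elasticsearch snapshot API (`PUT /_snapshot/<repository>/<snapshot>`).

```python
from datetime import datetime


def repository_name(ts: datetime) -> str:
    """One snapshot repository per day (hypothetical naming scheme)."""
    return f"backup-{ts:%Y-%m-%d}"


def snapshot_name(ts: datetime) -> str:
    """One snapshot per hour; snapshots within a repository are incremental."""
    return f"snapshot-{ts:%Y%m%d-%H}"


def retention_days(ts: datetime) -> int:
    """Repositories from the 1st of a month are kept for 365 days, others 30."""
    return 365 if ts.day == 1 else 30


def should_run(local_node: str, master_node: str) -> bool:
    """Only the elected master triggers the backup, so it runs once per cluster."""
    return local_node == master_node


def snapshot_request(ts: datetime) -> tuple:
    """HTTP method and path of the snapshot API call for this timestamp."""
    path = f"/_snapshot/{repository_name(ts)}/{snapshot_name(ts)}"
    return ("PUT", path + "?wait_for_completion=true")
```

In the setup described on the slide, the retention itself is enforced by S3 lifecycle rules rather than by the script, so `retention_days` only documents the rule.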
  26. Restore • Ruby script • clones OpsWorks stack • starts instances • restores requested backup • Current runtime: • instance boot ~7 min • restore snapshot ~22 min
  27. Monitoring

  28. Monitoring • Pingdom Server Monitoring (formerly known as Scout) • CPU • Disk space/Open files • Memory/Swap • Cluster status • Number of nodes • Backup ("Say cheese") • AWS ELB
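
The "Cluster status" and "Number of nodes" checks can be illustrated with a small evaluation of the `GET /_cluster/health` response. This is a hedged sketch: the function name and the expected node count of 3 (taken from the instance slide) are assumptions, and the actual checks ran as monitoring plugins, not as this code. `status` and `number_of_nodes` are real fields of the cluster health response.

```python
def cluster_problems(health: dict, expected_nodes: int = 3) -> list:
    """Return human-readable problems found in a _cluster/health response."""
    problems = []
    status = health.get("status")
    if status != "green":
        problems.append(f"cluster status is {status}")
    nodes = health.get("number_of_nodes")
    if nodes != expected_nodes:
        problems.append(f"expected {expected_nodes} nodes, found {nodes}")
    return problems


# Made-up sample response for illustration: one node has dropped out.
sample = {"cluster_name": "example", "status": "yellow", "number_of_nodes": 2}
alerts = cluster_problems(sample)
```

An empty list means both checks pass; anything else would be forwarded to the alerting system.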
  29. None
  30. None
  31. Monitoring • API Monitoring • via Honeybadger • Warn about slow requests (slower than 2 seconds)
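
The slow-request warning boils down to timing each search call and comparing it against the 2-second threshold. The sketch below uses Python's standard `logging` module as a stand-in for the Honeybadger notification; the wrapper name `timed` and the injectable clock are assumptions made for illustration and testability.

```python
import logging
import time

SLOW_THRESHOLD_SECONDS = 2.0  # threshold from the slide
logger = logging.getLogger("search")


def timed(call, threshold=SLOW_THRESHOLD_SECONDS, clock=time.monotonic):
    """Run call(), measure its duration, and warn if it exceeds the threshold.

    Returns a tuple (result, elapsed_seconds, was_slow).
    """
    start = clock()
    result = call()
    elapsed = clock() - start
    slow = elapsed > threshold
    if slow:
        # Stand-in for reporting to the error monitoring service.
        logger.warning("slow search request: %.2f s", elapsed)
    return result, elapsed, slow
```

Usage: wrap the function that performs the search, for example `timed(lambda: run_search(query))`, where `run_search` is a hypothetical client call.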
  32. Maintenance

  33. Maintenance • Quarterly • Check for new versions • OS • Cookbooks • Java • Elasticsearch • Plugins (Kibana, Marvel, etc.)
  34. Maintenance • Check restore • Full reindex • For other product: snapshot restore + partial reindex
  35. Pitfalls

  36. Pitfalls • Minimum Master Nodes • 50% RAM for Elasticsearch • VPC: Network Address Translation (NAT) instance needed • Private VPC instance: Disable OpsWorks auto healing • OpsWorks: start Elasticsearch via Monit
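
The first two pitfalls are simple arithmetic, sketched below. The quorum formula matches the usual guidance for the `discovery.zen.minimum_master_nodes` setting in Elasticsearch 1.x/2.x; the function names and the 31 GB cap (a common rule of thumb for keeping compressed object pointers) are assumptions that are not stated on the slide.

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Quorum of master-eligible nodes: more than half, to avoid split brain."""
    return master_eligible_nodes // 2 + 1


def heap_size_gb(ram_gb: int, cap_gb: int = 31) -> int:
    """Give Elasticsearch about half the RAM and leave the rest for the OS
    file system cache. The cap is an assumed rule of thumb, not from the talk."""
    return min(ram_gb // 2, cap_gb)


# Values for the cluster described in the talk: 3 nodes with 30 GB RAM each.
quorum = minimum_master_nodes(3)
heap = heap_size_gb(30)
```

For the three r3.xlarge instances this gives a quorum of 2 and a heap of 15 GB per node.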
  37. Thank you for your attention! Alexander Pepper alexander.pepper@infopark.de +49 30 747993-0 @apepper @_apepper
  38. Picture Sources • https://www.flickr.com/photos/sigalrm/31560595165/ • https://www.flickr.com/photos/selda_eigler/8686009651/ • https://www.flickr.com/photos/aon/7817771968/ • https://www.flickr.com/photos/nathanf/2314676429/ • https://www.flickr.com/photos/renarl/3400468165 • https://www.flickr.com/photos/aon/6272938468/ • https://www.flickr.com/photos/muratlivaneli/6104145120 • https://www.flickr.com/photos/30884177@N08/4107269864/ • https://www.flickr.com/photos/aon/7817811212/ • https://www.flickr.com/photos/29278394@N00/4689679306/ • https://www.flickr.com/photos/pustovit/15867520885/