🔪 How we cut our AWS costs in half

E-commerce search engines are resource-eating monsters. Fast response times are key, so they need a lot of both heap and CPU. For one customer, we were challenged with reducing the overall AWS budget spent on running their Solr cluster. In this project report we navigate from a classic cloud architecture all the way down to the foundations of garbage collection in the JVM, and we explain the benefits of using ARM over Intel processors. We achieved a breakthrough by analysing the Solr internals using the Java Flight Recorder, upgrading the JDK to version 19 and experimenting with the Z garbage collector. In the end it worked: we cut the customer's AWS bill in half.

Torsten Bøgh Köster

April 03, 2023

Transcript

  1. @debe: software engineer, architect and performance enthusiast #jvm #high performance #rust
     @tboeghk: Freelance Search & Operations Engineer #solr #observability #ops #java 🤝
  2. 📚 A holistic approach
     • Project profile
     • Architectural challenges
     • Infrastructure challenges
     • Software challenges
     • Lessons learned
     Photo by Agnieszka Kowalczyk on Unsplash
  3. 🐳 Solr cluster topology
     • Split API & shop traffic
     • API cluster absorbs request peaks
     • Medium cluster utilization
     • 💰 1k per instance/month
  4. 🐳 Solr cluster topology
     • Goal: get rid of redundant infrastructure
     • How to enable Solr to handle request spikes?
  5. 🦾 ARM our lord and savior
     • Docker buildx to build multi-arch builds
     • Custom arm64 AMI
     • Multi-Arch ASGs via MixedInstances in LaunchTemplate
     • Graviton, AMD, Intel
  6. 💰 ARM vs AMD
     • Graviton2 vs AMD: Linux load1 (less is better)
     • AMD power management ⚡
  7. 🙌 Why software challenges?
     • Response time is tied to CPU utilization 😱
     Photo by Jeremy Lapak on Unsplash
  8. 👨‍✈️ Java Flight Recorder
     • Event-based tracing framework built into the JVM
     • Very low overhead (< 1%)
     • Designed for production use
     • Free to use
     Photo by Richard Cartmell on Unsplash
  9. 🐾 Lessons learned
     • Have strong observability in place
     • Involve yourself in the OSS software you use: yes, you may be the first one having this exact problem
     • Test in prod or live a lie
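
Slide 8 describes the Java Flight Recorder as an event-based tracing framework built into the JVM. The deck does not show how the recordings were taken, so the following is only a minimal sketch of the programmatic `jdk.jfr` API, not the setup used in the project. The class `FlightRecorderSketch`, the custom event `sketch.SearchRequest` and its `query` field are hypothetical names chosen for illustration; `jdk.GarbageCollection` is one of the JDK's built-in event types.

```java
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordingFile;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class FlightRecorderSketch {

    // Hypothetical custom event: name and field are illustrative only.
    @Name("sketch.SearchRequest")
    @Label("Search Request")
    static class SearchRequestEvent extends Event {
        @Label("Query")
        String query;
    }

    // Starts an in-process recording, emits one custom event per simulated
    // request, dumps the recording to a temp file and counts the custom
    // events found in that file.
    static long recordAndCount(int requests) {
        try (Recording recording = new Recording()) {
            // Also capture the JVM's built-in garbage collection events.
            recording.enable("jdk.GarbageCollection");
            recording.start();

            for (int i = 0; i < requests; i++) {
                SearchRequestEvent event = new SearchRequestEvent();
                event.query = "q" + i;
                event.begin();
                // The work to be traced would run here.
                event.commit();
            }

            recording.stop();
            Path file = Files.createTempFile("jfr-sketch", ".jfr");
            try {
                recording.dump(file);
                return RecordingFile.readAllEvents(file).stream()
                        .filter(e -> e.getEventType().getName().equals("sketch.SearchRequest"))
                        .count();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("custom events recorded: " + recordAndCount(3));
    }
}
```

On a running production JVM a recording is more commonly started from the outside, for example with `jcmd <pid> JFR.start`, and the resulting file is then inspected in JDK Mission Control; the in-process variant above is simply the easiest one to run standalone.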