🔪 How we cut our AWS costs in half

E-commerce search engines are resource-eating monsters. Fast response times are key, so they need plenty of both heap and CPU. A customer challenged us to reduce the overall AWS budget spent on running their Solr cluster. In this project report we'll navigate from a classic cloud architecture all the way down to the foundations of garbage collection in the JVM. We'll explain the benefits of using ARM over Intel processors. We achieved a breakthrough by analysing the Solr internals with the Java Flight Recorder, upgrading the JDK to version 19 and experimenting with the Z garbage collector. In the end it worked: we cut the customer's AWS bill in half.

Torsten Bøgh Köster

April 03, 2023

Transcript

  1. @debe: software engineer, architect and performance enthusiast #jvm #high performance #rust; @tboeghk: Freelance Search & Operations Engineer #solr #observability #ops #java 🤝
  2. 📚 A holistic approach: project profile; architectural challenges; infrastructure challenges; software challenges; lessons learned. Photo by Agnieszka Kowalczyk on Unsplash
  3. 🐳 Solr cluster topology: split API & shop traffic; API cluster absorbs request peaks; medium cluster utilization; 💰 1k per instance/month
  4. 🐳 Solr cluster topology. Goal: get rid of redundant infrastructure. How to enable Solr to handle request spikes?
  5. 🦾 ARM our lord and savior: Docker buildx to build multi-arch builds; custom arm64 AMI; multi-arch ASGs via MixedInstances in LaunchTemplate; Graviton, AMD, Intel
  6. 💰 ARM vs AMD: Graviton2 vs AMD; Linux load1 (less is better); AMD power management ⚡
  7. 🙌 Why software challenges? Response time is tied to CPU utilization 😱 Photo by Jeremy Lapak on Unsplash
  8. 👨‍✈️ Java Flight Recorder: event-based tracing framework built into the JVM; very low overhead (< 1%); designed for production use; free to use. Photo by Richard Cartmell on Unsplash
  9. 🐾 Lessons learned: • Have strong observability in place • Involve yourself in the OSS software you use: yes, maybe you are the first one having this exact problem • Test in prod or live a lie
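
Slide 8 describes the Java Flight Recorder as an event-based tracing framework built into the JVM. As a rough illustration of what "event based" means, here is a minimal sketch using the standard `jdk.jfr` API. It is not taken from the talk: the class `JfrSketch`, the event name `sketch.SearchRequest` and its `query` field are hypothetical, and the sketch assumes a JDK with JFR available (11 or later).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

public class JfrSketch {

    // Hypothetical custom event; name and field are illustrative only.
    @Name("sketch.SearchRequest")
    @Label("Search Request")
    static class SearchRequestEvent extends Event {
        @Label("Query")
        String query;
    }

    // Emits n custom events into an in-process recording, dumps the
    // recording to a temporary file and counts the custom events in it.
    static long recordAndCount(int n) throws Exception {
        Path dump = Files.createTempFile("sketch", ".jfr");
        try (Recording recording = new Recording()) {
            recording.enable(SearchRequestEvent.class);
            recording.start();
            for (int i = 0; i < n; i++) {
                SearchRequestEvent event = new SearchRequestEvent();
                event.begin();            // start timing the unit of work
                event.query = "q" + i;    // stand-in for real request handling
                event.commit();           // end timing and write the event
            }
            recording.stop();
            recording.dump(dump);
        }
        List<RecordedEvent> events = RecordingFile.readAllEvents(dump);
        long count = events.stream()
                .filter(e -> e.getEventType().getName().equals("sketch.SearchRequest"))
                .count();
        Files.deleteIfExists(dump);
        return count;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("custom events recorded: " + recordAndCount(3));
    }
}
```

On a running production JVM a recording would more typically be started from outside the process, for example with `jcmd <pid> JFR.start`, and the resulting file inspected in JDK Mission Control; the in-process variant above only keeps the example self-contained.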