ElastiCache Antipatterns - AWS Community Day UAE

Aram
October 25, 2023

Transcript

  1. $whoami
    • Kingdom of Himalayas
    • Built on AWS since 2010
    • Scale technology and teams
    • Head Engineering @ Consolidate Health
    • Mobility, Ed-Tech, E-Commerce etc.
    • @phoenixwizard in most places
  2. What to Expect?
    • A primer
    • I have a hammer anti-pattern
    • Missing my car’s battery anti-pattern
    • Duct-taped life boat anti-pattern
    • Santa stuck in chimney anti-pattern
  3. • Make Redis a turtle anti-pattern
    • The annoying neighbour anti-pattern
    • Flying blind into the hurricane
    • Kayaking in the Pacific Ocean
  4. ElastiCache: The Mythical Silver Bullet
    • Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker.
    • Memcached is a free and open source, high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.
  5. Memcached vs. Redis

    Feature                                            Memcached  Redis
    Sub-millisecond latency                            Yes        Yes
    Developer ease of use                              Yes        Yes
    Data partitioning                                  Yes        Yes
    Support for a broad set of programming languages   Yes        Yes
    Advanced data structures                           -          Yes
    Multithreaded architecture                         Yes        -
    Snapshots                                          -          Yes
    Replication                                        -          Yes
    Transactions                                       -          Yes
    Pub/Sub                                            -          Yes
    Lua scripting                                      -          Yes
    Geospatial support                                 -          Yes

  6. Expand the world view
    • Binary-safe strings
    • Lists (basically linked lists)
    • Sets (unique, unsorted string elements)
    • Sorted sets (sets in which every string has a score)
    • Hashes (similar to Ruby hashes or Python dicts)
  7. Look Beyond !!
    • Bitmaps (or bit arrays; handle a string like an array of bits)
    • HyperLogLogs (estimate the cardinality of a set [Probabilistic!!])
    • Streams (append-only collection of map-like entries)
    • Geospatial indexes (whereami. Longitude comes before latitude)
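
The "longitude comes before latitude" gotcha can be guarded against in application code. The sketch below is only an illustration: `geoadd_args` is a hypothetical helper (not part of any Redis client) that takes latitude and longitude as keyword-only arguments and emits them in the order GEOADD expects. The coordinates and key names are made up.

```python
# Redis geospatial indexes accept longitudes in [-180, 180] and
# latitudes in [-85.05112878, 85.05112878].
MAX_LAT = 85.05112878


def geoadd_args(key, member, *, lat, lon):
    """Build GEOADD arguments; keyword-only lat/lon prevent a silent swap."""
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    if not -MAX_LAT <= lat <= MAX_LAT:
        raise ValueError(f"latitude out of range: {lat}")
    # GEOADD key longitude latitude member (longitude first!)
    return ("GEOADD", key, lon, lat, member)


# Illustrative coordinates, roughly Dubai.
args = geoadd_args("cities", "dubai", lat=25.2, lon=55.27)
```

With redis-py, a tuple like this could be passed on as `client.execute_command(*args)`.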
  8. • Product READY
    • Mail blast DONE
    • Press Release DONE
    • 100x Traffic DONE
    • Server Crash DONE
  9. • You now take data from the DB
    • Transform the data and put it on ElastiCache Redis
    • Ssshh: The parameters used to transform the data are in your MIND
  10. “When I wrote the code, only God and I knew why I wrote the code. But now …” – Famous meme from Stack Overflow
  11. The instance goes DOWN. Node FAILURE
    • What do you think has happened?
    • And just like that, all your data evaporates !!
    • NO RECOVERY !!
  12. Gotchas?
    • You have set the snapshot interval to 5 minutes
    • What if the server goes down 3 minutes after the last snapshot? Those 3 minutes of writes are gone
    • Snapshotting needs to fork, which is time-consuming for big datasets
  13. Append Only File
    • Popularly known as AOF
    • Changelog-style persistence format
    • appendfsync can be “no”, “everysec” or “always”
  14. Gotchas?
    • Not enabled by default on ElastiCache
    • AOF can be bigger and slower, depending on the sync policy
    • There are cases where AOF might not reproduce exactly the same dataset on reloading
  15. What do we do?
    • DO BOTH
    • Have your cake and eat it !!
    • Do both RDB and AOF (set sync to everysec)
  16. “Don’t just turn on AOF and restart your server” – somewhere in Redis documentation
  17. How do you do it in ElastiCache?
    • Create a parameter group with the appendonly parameter set to yes.
    • You then assign that parameter group to your cluster
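
The two steps above could be scripted with boto3's ElastiCache client. This is a rough sketch, not the speaker's setup: the group name `redis-aof`, the family `redis2.8` and the cluster id `my-cluster` are placeholders, and adding `appendfsync` is an assumption based on the "set sync to everysec" advice. AWS documents restrictions on which engine versions and node types support AOF, so check those before relying on it.

```python
def appendonly_requests(group_name, family, cluster_id):
    """Return the ElastiCache API calls as (operation, kwargs) pairs."""
    return [
        ("create_cache_parameter_group", {
            "CacheParameterGroupName": group_name,
            "CacheParameterGroupFamily": family,
            "Description": "Redis with AOF enabled",
        }),
        ("modify_cache_parameter_group", {
            "CacheParameterGroupName": group_name,
            "ParameterNameValues": [
                {"ParameterName": "appendonly", "ParameterValue": "yes"},
                {"ParameterName": "appendfsync", "ParameterValue": "everysec"},
            ],
        }),
        ("modify_cache_cluster", {
            "CacheClusterId": cluster_id,
            "CacheParameterGroupName": group_name,
            # False defers the change to the next maintenance window.
            "ApplyImmediately": False,
        }),
    ]


def apply(client, requests):
    """Send each request through a boto3-style client object."""
    for operation, kwargs in requests:
        getattr(client, operation)(**kwargs)


# Placeholder names; substitute your own.
requests = appendonly_requests("redis-aof", "redis2.8", "my-cluster")
# With AWS credentials configured:
#   import boto3
#   apply(boto3.client("elasticache"), requests)
```

Building the requests as plain data first makes them easy to review (or diff in a pull request) before anything touches a live cluster.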
  18. One More Thing ???
    • Depending on your use case, evaluate:
    • Multi-AZ with replication groups, turning AOF off
    • Cluster mode enabled or disabled
    • Need at least 2 replicas in different AZs
  19. “Oh God!! We have no idea where the ElastiCache server gets its data from” – Scary stuff DevOps Engineers hear from Backend Engineers
  20. A hidden bash script, sitting in a hidden location and triggered by a cron job, is laughing at you !!
  21. • You are an amazing Engineer
    • Set up persistence perfectly
    • Now you have millions of keys in your Redis instance
    • BUT YOUR REDIS IS RANDOMLY GOING SLOW
  22. So WHAT HAPPENED?
    • There is a cron (the golden bullet) which reads & writes a MILLION records in Redis (the silver bullet)
  23. “I have a single thread, nobody loves me” – Redis getting bombarded by your cron
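
One common way to soften a job like this, offered here as an assumption rather than something the slides prescribe, is to split the million records into small batches and pause between them so other clients get a turn. `batches` and `run_in_batches` are hypothetical helper names.

```python
import time
from itertools import islice


def batches(items, size):
    """Yield lists of at most `size` items from any iterable."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def run_in_batches(items, handle_batch, size=1000, pause=0.0):
    """Call handle_batch once per chunk, sleeping `pause` seconds between."""
    done = 0
    for batch in batches(items, size):
        handle_batch(batch)
        done += len(batch)
        time.sleep(pause)
    return done


# Stand-in workload: record how large each chunk was.
sizes = []
total = run_in_batches(range(2500), lambda b: sizes.append(len(b)), size=1000)
```

With redis-py, `handle_batch` might send each chunk through a single `client.pipeline(transaction=False)` to cut round trips; the right batch size and pause depend on your workload.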
  24. But I read Redis is not Single Threaded
    • Background threads perform housekeeping work
    • Since v4.0, threads also handle deleting objects in the background and blocking commands implemented by Redis modules
    • Regular commands are still executed one at a time
  25. • You need to process all the keys, what are you going to do? Use the KEYS command?
    • But did you know this is a BLOCKING CALL?
  26. But What can I do now?
    • Use NON-BLOCKING calls wherever possible !!
    • Instead of KEYS use SCAN
    • SCAN is a cursor based call
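
A cursor-based loop looks roughly like the sketch below. `FakeRedis` is an in-memory stand-in written for this example so the code runs without a server; its `scan()` mimics the `(next_cursor, keys)` return shape of redis-py's `scan()`, but uses a plain list offset as the cursor, whereas real cursors are opaque.

```python
class FakeRedis:
    """In-memory stand-in with a scan() shaped like redis-py's."""

    def __init__(self, keys):
        self._keys = sorted(keys)

    def scan(self, cursor=0, count=10):
        # Here the cursor is a list offset; real Redis cursors are opaque.
        page = self._keys[cursor:cursor + count]
        nxt = cursor + count
        return (nxt if nxt < len(self._keys) else 0), page


def scan_all(client, count=10):
    """Yield every key, one SCAN page at a time, until the cursor is 0."""
    cursor = 0
    while True:
        cursor, page = client.scan(cursor=cursor, count=count)
        yield from page
        if cursor == 0:
            break


client = FakeRedis(f"user:{i}" for i in range(25))
keys = list(scan_all(client, count=10))
```

Each call does a small amount of work and returns, so other clients are served between pages. Against a real server, SCAN may return the same key more than once, so callers should tolerate duplicates; redis-py also ships a `scan_iter()` helper that wraps this loop.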
  27. What if I want a pattern?
    • Use MATCH along with your SCAN
    • BEWARE: MATCH is applied to the subset returned by each SCAN call, so a call can return few or no keys even though matches remain
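
The MATCH caveat is easiest to see in a toy model. The `scan` function below is a simulation written for this example, not a Redis API: it picks a page of keys first and filters them afterwards, which is the behaviour the warning describes. `fnmatchcase` only approximates Redis glob patterns, and the key names are made up.

```python
from fnmatch import fnmatchcase

# 18 session keys followed by 2 user keys.
KEYS = [f"session:{i}" for i in range(18)] + ["user:1", "user:2"]


def scan(cursor, match="*", count=5):
    """Toy SCAN: take `count` keys first, then filter them with MATCH."""
    page = KEYS[cursor:cursor + count]
    nxt = cursor + count
    hits = [k for k in page if fnmatchcase(k, match)]
    return (nxt if nxt < len(KEYS) else 0), hits


def scan_match(match, count=5):
    """Collect all matches and count how many pages came back empty."""
    cursor, found, empty_pages = 0, [], 0
    while True:
        cursor, hits = scan(cursor, match, count)
        if not hits:
            empty_pages += 1
        found.extend(hits)
        if cursor == 0:  # only a zero cursor ends the scan
            return found, empty_pages


found, empty_pages = scan_match("user:*")
```

Here the first three pages contain no matches at all, so a loop that stopped at the first empty page would miss both `user:` keys. The termination condition has to be the cursor, never an empty result.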
  28. REDIS CLUSTERS
    • They are not known to be as mature as other DB clusters
    • Not all commands & features available on a single Redis instance work across partitions. E.g. MSET with keys that live on different partitions
    • Scaling up and down might become tricky
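
Why multi-key commands such as MSET can fail becomes clearer once you compute where keys land. Redis Cluster assigns each key to one of 16384 hash slots using CRC16 (XMODEM variant) of the key, or of the `{hash tag}` inside it when one is present. The sketch below reproduces that mapping with the standard library; `key_slot` and `same_slot` are hypothetical helper names and the keys are invented.

```python
import binascii


def key_slot(key: str) -> int:
    """Redis Cluster slot: CRC16 (XMODEM) of the key, modulo 16384."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end > start + 1:  # hash only a non-empty {tag}
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % 16384


def same_slot(*keys: str) -> bool:
    """True when all keys hash to one slot, as multi-key commands require."""
    return len({key_slot(k) for k in keys}) == 1


tagged_ok = same_slot("{user:1}:name", "{user:1}:email")
```

Keys that share a hash tag always map to the same slot, which is the usual way to keep related keys together for multi-key operations. Keys without a common tag generally end up in different slots.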
  29. • For high availability, there is the master-slave model
    • Data is partitioned among various Redis instances and remains available even when a small subset of nodes is unavailable
  30. Observability?
    • Load
    • CPU usage
    • Memory usage
    • Swap usage
    • Network Bandwidth
    • Disk usage
  31. Redis Availability & Execution
    • connected_clients
    • keyspace
    • instantaneous_ops_per_sec
    • rdb_last_save_time
    • connected_slaves
    • master_last_io_seconds_ago
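
Most of these fields come from the INFO command, which returns plain `field:value` lines grouped under `# Section` headers. A minimal parser might look like the sketch below; the sample text and its values are invented for illustration.

```python
def parse_info(text):
    """Parse INFO output ("field:value" lines) into a dict of strings."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # blank line or "# Section"
            continue
        field, _, value = line.partition(":")
        info[field] = value
    return info


# Invented sample in the shape INFO uses.
SAMPLE = """\
# Clients
connected_clients:42
# Stats
instantaneous_ops_per_sec:1234
# Persistence
rdb_last_save_time:1698220800
# Replication
connected_slaves:2
"""
info = parse_info(SAMPLE)
```

If you already use redis-py, `client.info()` returns a parsed dict directly, so a hand-written parser is mainly useful for output captured from `redis-cli`.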
  32. Typical Failure Points
    • latency
    • used memory
    • memory fragmentation ratio
    • evicted keys
    • blocked clients
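
A simple check over these metrics could look like the sketch below. The limits are hypothetical placeholders, not recommendations from the talk, and the field names follow what INFO reports (`mem_fragmentation_ratio`, `evicted_keys`, `blocked_clients`).

```python
# Hypothetical limits; tune them for your own workload.
LIMITS = {
    "mem_fragmentation_ratio": 1.5,
    "evicted_keys": 0,
    "blocked_clients": 10,
}


def failing_metrics(info, limits=LIMITS):
    """Return the names of metrics whose value exceeds its limit, sorted."""
    return sorted(
        name for name, limit in limits.items()
        if float(info.get(name, 0)) > limit
    )


# Invented sample values, as strings like INFO returns them.
sample = {
    "mem_fragmentation_ratio": "1.82",
    "evicted_keys": "0",
    "blocked_clients": "3",
    "used_memory": "1048576",
}
alerts = failing_metrics(sample)
```

Note that `evicted_keys` is a cumulative counter, so in practice you would more likely alert on how fast it grows between two samples than on its absolute value.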
  33. • Does your App need Cluster mode on or not?
    • Which class of machines should you use for which features?
    • AMD vs Intel vs Graviton (spoiler: AMD will not be happy)
    • Have you considered latency due to swapping?
    • You do realize that your persistence strategy can increase latency. Right?
  34. Quick Recap?
    • ElastiCache is not a silver bullet
    • Redis defaults are not the best for everyone, but ElastiCache has made things easier
    • Spend time and figure out what works best for you
    • Make your own mistakes !!