Badge Poser v3.0 - A DevOps Journey

This talk shares the whole journey. It starts with the handover of the keys to Pandora's box, then wanders through the deep, dark forest of uncertainty and instability that comes with hastily deployed systems. Along the way we try to declutter and reach a stable state where order reigns over chaos, where the poor guy can finally sleep at night and the pager goes silent for a while. By the end we reach the long-desired level of confidence needed to experiment, change things and upgrade infrastructure without worry.

Fabio Cicerchia

November 25, 2020

Transcript

  1. Hello! I AM FABIO CICERCHIA SW & Cloud Engineer @

    You can find me at: @fabiocicerchia
  2. Describe VM Config

    • RAM: 2GB
    • CPU: 2
    • HDD: 50GB
    • Software: Apache 2.4.10, PHP 5.6.19, Redis 2.8.17, MySQL 5.5.47
  3. Describe VM Config - Notes

    • Apache v2.4.10
      ◦ Released on 2014-07-19: Age 6 years
      ◦ Available v2.4.43
    • PHP v5.6.19
      ◦ Released on 2016-03-03: Age 4 years
      ◦ Available v7.4.5
      ◦ EOL: 2018-12-31
    • Redis v2.8.17
      ◦ Released on 2014-09-19: Age 6 years
      ◦ Available v6.0.1
    • MySQL v5.5.47
      ◦ Released on 2015-12-07: Age 5 years
      ◦ Available v8.0.20

    http://archive.apache.org/dist/httpd/
    https://www.php.net/releases/index.php
    https://www.php.net/supported-versions.php
    https://github.com/redis/redis
    https://docs.redislabs.com/latest/rs/administering/product-lifecycle/
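
The age figures on this slide can be re-derived from the release dates. The following is a minimal sketch, not the author's method: the reference date (the deck's publication date) and the helper names are assumptions, and the slide's own figures may have been rounded or computed on a different day, so the results need not match it exactly.

```python
from datetime import date

# Release dates copied from the slide. The reference date used below is an
# assumption (the deck's publication date), not something the slide states.
RELEASES = {
    "Apache 2.4.10": date(2014, 7, 19),
    "PHP 5.6.19": date(2016, 3, 3),
    "Redis 2.8.17": date(2014, 9, 19),
    "MySQL 5.5.47": date(2015, 12, 7),
}
PHP_56_EOL = date(2018, 12, 31)  # EOL date listed on the slide


def full_years_between(start: date, end: date) -> int:
    """Return the number of complete years elapsed from start to end."""
    years = end.year - start.year
    # Subtract one if the anniversary has not been reached yet in the end year.
    if (end.month, end.day) < (start.month, start.day):
        years -= 1
    return years


def is_past_eol(eol: date, today: date) -> bool:
    """True when the end-of-life date lies strictly before `today`."""
    return today > eol


if __name__ == "__main__":
    reference = date(2020, 11, 25)  # assumed reference date
    for name, released in RELEASES.items():
        age = full_years_between(released, reference)
        print(f"{name}: {age} full years old")
    print("PHP 5.6 past EOL:", is_past_eol(PHP_56_EOL, reference))
```

Counting only complete years is one possible convention; rounding to the nearest year would give different numbers for some entries.
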
  4. Just Start!

    • Nginx v1.18.0
    • PHP v7.4.7
    • Redis v4.0.10

    https://www.nginx.com/
    https://www.php.net/
    https://redis.io/
  5. ...Then Refine

    • Ansible → Provisioning
    • Ansible Galaxy → Ansible’s Recipes Repo
    • AWS CloudFormation → Infrastructure as Code*
    • Let’s Encrypt → SSL/TLS Certificate**

    * Terraform is way cooler
    ** Yes, SSL is deprecated

    https://www.ansible.com/
    https://galaxy.ansible.com/
    https://aws.amazon.com/cloudformation/
    https://letsencrypt.org/
  6. Ansible: What’s It For?

    - Ansible is perfect for VMs (for example, EC2 in our scenario).
    - It is redundant for ECS with Fargate, since the underlying layer is fully managed by AWS.
    - It could be useful for ECS without Fargate, where it provisions the EC2 instances on which the containers run.
    - It is useful for deploy and rollback.
  7. Workaround #1: Not Quite There Yet

    • pm.max_children = 150
      pm.start_servers = 5
      pm.min_spare_servers = 5
      pm.max_spare_servers = 35
    • emergency_restart_threshold = 10
      emergency_restart_interval = 1m
      process_control_timeout = 10s
    • memory_limit = 192M
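
One way to see why these settings might still not be enough is to compare the worst-case memory demand with the RAM available. The sketch below is only a rough estimate under stated assumptions: it assumes the settings ran on the 2GB VM described earlier, the 1G reserved for the OS and the other services is a hypothetical figure, and the helper names are made up for illustration. Real workers usually stay well below memory_limit, so this is an upper bound, not a measurement.

```python
def parse_size_mb(value: str) -> int:
    """Convert a PHP shorthand size such as '192M' or '2G' to megabytes."""
    units = {"K": 1 / 1024, "M": 1, "G": 1024}
    value = value.strip().upper()
    if value[-1] in units:
        return int(float(value[:-1]) * units[value[-1]])
    return int(value) // (1024 * 1024)  # a plain number means bytes


def worst_case_mb(max_children: int, memory_limit: str) -> int:
    """Upper bound on memory if every worker reaches memory_limit at once."""
    return max_children * parse_size_mb(memory_limit)


def children_that_fit(total_ram: str, reserved: str, per_worker: str) -> int:
    """How many workers fit in RAM after reserving some for everything else."""
    usable = parse_size_mb(total_ram) - parse_size_mb(reserved)
    return max(usable // parse_size_mb(per_worker), 0)


if __name__ == "__main__":
    # Values from the slides: pm.max_children = 150, memory_limit = 192M, RAM 2GB.
    print("Worst case (MB):", worst_case_mb(150, "192M"))
    # Hypothetical: 1G kept aside for the OS, web server, Redis and MySQL.
    print("Workers that fit:", children_that_fit("2G", "1G", "192M"))
```

With the slide's values the worst case is 28800 MB, roughly 28 GB, which is far more than a 2GB machine can offer. Under the hypothetical 1G reservation only 5 workers would be guaranteed to fit at the full limit.
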
  8. It Keeps Crashing: Need Visibility

    Added Logz.io & Filebeat
    Added UptimeRobot

    https://logz.io/
    https://www.elastic.co/beats/filebeat
    https://uptimerobot.com/
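
To illustrate the kind of check an external uptime monitor performs, here is a minimal sketch using only the Python standard library. It is not how UptimeRobot is implemented; the function names and the rule "2xx/3xx means up" are assumptions chosen for the example.

```python
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Fetch `url` and return the HTTP status code (raises on network errors)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses still carry a status code


def is_up(url: str, fetch: Callable[[str], int] = http_status) -> bool:
    """Treat any 2xx/3xx status as up; errors and 4xx/5xx as down."""
    try:
        return 200 <= fetch(url) < 400
    except OSError:  # URLError, timeouts and socket errors derive from OSError
        return False
```

The `fetch` parameter is injectable so the decision logic can be exercised without touching the network, for example `is_up("https://example.invalid", fetch=lambda url: 503)`.
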
  9. Container - Part 1

    • AWS ECS
    • AWS ECR

    https://aws.amazon.com/ecs/
    https://aws.amazon.com/ecr/
  10. Switch Back to All-in-One Debian

    Since the multi-container setup on Alpine was unstable, we just switched back to the good ol’ working one-container-has-all on Debian.
  11. Cache Statuses

    MISS – The response was not found in the cache and so was fetched from an origin server. The response might then have been cached.
    BYPASS – The response was fetched from the origin server instead of served from the cache because the request matched a proxy_cache_bypass directive (see “Can I Punch a Hole Through My Cache?” in the linked guide). The response might then have been cached.
    EXPIRED – The entry in the cache has expired. The response contains fresh content from the origin server.

    https://www.nginx.com/blog/nginx-caching-guide/
  12. Cache Statuses

    STALE – The content is stale because the origin server is not responding correctly, and proxy_cache_use_stale was configured.
    UPDATING – The content is stale because the entry is currently being updated in response to a previous request, and proxy_cache_use_stale updating is configured.
    REVALIDATED – The proxy_cache_revalidate directive was enabled and NGINX verified that the current cached content was still valid (If-Modified-Since or If-None-Match).
    HIT – The response contains valid, fresh content direct from the cache.

    https://www.nginx.com/blog/nginx-caching-guide/
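
The statuses listed on the two slides above can be tallied to get a rough picture of how well the cache is working. This is a minimal sketch, assuming the status strings have already been extracted, for example from an access log that records NGINX's `$upstream_cache_status` variable or from an `X-Cache-Status` response header as in the linked guide. The grouping into "served from cache" and "fetched from origin" follows the slide descriptions; the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable

# Status names as listed on the slides, grouped by where the body came from.
SERVED_FROM_CACHE = {"HIT", "STALE", "UPDATING", "REVALIDATED"}
FETCHED_FROM_ORIGIN = {"MISS", "BYPASS", "EXPIRED"}


def tally(statuses: Iterable[str]) -> Counter:
    """Count each known cache status, ignoring blanks and unknown values."""
    known = SERVED_FROM_CACHE | FETCHED_FROM_ORIGIN
    cleaned = (s.strip().upper() for s in statuses)
    return Counter(s for s in cleaned if s in known)


def cache_ratio(counts: Counter) -> float:
    """Fraction of counted responses whose body came from the cache."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    served = sum(counts[s] for s in SERVED_FROM_CACHE)
    return served / total


if __name__ == "__main__":
    sample = ["HIT", "hit", "MISS", "-", "STALE", "EXPIRED"]  # made-up sample
    counts = tally(sample)
    print(dict(counts), f"cache ratio: {cache_ratio(counts):.2f}")
```

Whether STALE and UPDATING should count as successes depends on what is being measured; they relieve the origin but serve outdated content.
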
  13. 0 0

  14. Key Takeaways

    - Never trust code
      - Never trust yourself
    - Do small steps
      - It’ll help you figure out what went wrong
    - Version everything
      - Commit as often as possible
    - Never use the latest tag
      - Use specific versions
    - Think outside the box
      - Don’t stick to playing by the manual
    - Prefer quick and easy fixes
      - Reduce the odds of breaking things
    - Use the tools to make your life easier
      - So choose them carefully
    - Monitor & Benchmark!
      - Your best friends for troubleshooting

    * random order
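
The "use specific versions" takeaway can be enforced mechanically. The sketch below assumes the advice refers to container image tags and checks whether an image reference is pinned; the function names are hypothetical, and the parsing covers only the common `name:tag` and `name@digest` forms.

```python
from typing import Optional


def image_tag(reference: str) -> Optional[str]:
    """Return the tag of an image reference, or None if no tag is given."""
    name = reference.split("@", 1)[0]  # drop any @sha256:... digest
    # A ':' before the last '/' belongs to a registry port, not to a tag.
    last_segment = name.rsplit("/", 1)[-1]
    if ":" in last_segment:
        return last_segment.split(":", 1)[1]
    return None


def is_pinned(reference: str) -> bool:
    """True if the reference has a digest or an explicit tag other than 'latest'."""
    if "@" in reference:
        return True
    tag = image_tag(reference)
    return tag is not None and tag != "latest"


if __name__ == "__main__":
    for ref in ["nginx:1.18.0", "nginx", "php:latest", "localhost:5000/app"]:
        print(ref, "->", "pinned" if is_pinned(ref) else "NOT pinned")
```

An untagged reference is treated as unpinned because Docker resolves it to `latest` by default.
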