
Building infrastructure on AWS with Ruby



By Takayuki WATANABE (渡辺 喬之)


Transcript

  1. Increasing of Engineers

    [Chart: year vs. # of engineers; numeric labels lost]
    • Few engineers are in Japan
    • # of users is small
    • Around [?] engineers
    • [?] of engineers are outside of Japan
    • The number of teams and services increases
    • Users increase
    • Around [?] engineers
    • Most of the engineers are in Japan
    • I joined Cookpad
  2. Increasing of Engineers

    [Chart: year vs. # of engineers, annotated with stage 1, stage 2, and stage 3; numeric labels lost]
    • Around [?] engineers
    • [?] of engineers are outside of Japan
    • The number of teams and services increases
    • Users increase
    • Around [?] engineers
    • Most of the engineers are in Japan
    • I joined Cookpad
    • Few engineers are in Japan
    • # of users is small
  3. Selection of infrastructures and approaches

    • Which platform do we use for the production environment?
      • PaaS, IaaS, on-premises, etc.
    • Which development approach (software architecture) do we use?
      • Monolithic architecture, SOA, microservices, etc.
  4. Platform approaches for stage 1

    • Managed PaaS like Heroku
      • Enables developers to focus on developing the service
      • We don't need infra engineers
    • Monolithic architecture
      • Communication is quite easy and comfortable
  5. Platform approaches for stage 2

    • Cloud service providers like AWS, GCP, Azure
      • Use virtual machines and managed services for the production environment
      • Enables us to build more flexible infrastructure
      • Users gradually increase, so capacity scalability becomes important
    • Monolithic architecture
      • Communication is still easy and comfortable
  6. Platform approaches for stage 3

    • Cloud service providers like AWS, GCP, Azure
      • Use containers and managed services for the production environment
      • Stage 2 is a warm-up period for moving to stage 3
      • Enables us to build more flexible infrastructure
      • Capacity scalability is important
    • Microservice architecture
      • Communication costs are high due to the number of engineers
      • Separating authority and responsibility is necessary to scale out an organization
  7. Reasons for using AWS

    • We have used AWS since its early stage
    • AWS resources can be controlled via APIs
      • Pioneering cloud provider in this context
    • We use management tools written in Ruby for AWS
      • Declare AWS resources in a Ruby DSL
  8. Tools for our infrastructures

    • AWS resource management
    • Other server resource management
      • Database management tools
      • CDN management tools
    • Server configuration management (provisioning)
    • Deployment tools
  9. AWS resource management

    We codify AWS resources in a Ruby DSL
    • Change history can be investigated from the VCS (git, svn)
    • Idempotent
      • The current state of AWS resources should stay in sync with our code
      • Manual configuration changes are not allowed, to avoid chaos
      • Manual changes that are not reflected in the code are forcibly erased
    • Learning costs are low
      • Non-SRE engineers can also create PRs
    • Most of the tools have:
      • a dry-run feature to confirm changes before applying them
      • an export feature to reflect the current AWS state into the Ruby DSL
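The dry-run and idempotency properties listed above can be illustrated with a small sketch. This is a hypothetical model, not the API of any codenize.tools gem: the `plan` and `apply` helpers and the hash-based state are assumptions made for illustration only.

```ruby
# Hypothetical illustration only; not the API of any codenize.tools gem.
# `desired` is the state declared in code, `current` is the state reported by the provider.
def plan(desired, current)
  changes = []
  desired.each do |name, attrs|
    if !current.key?(name)
      changes << [:create, name]
    elsif current[name] != attrs
      changes << [:update, name]
    end
  end
  # Anything that exists remotely but is not declared in code is removed,
  # which is how undeclared manual changes get erased.
  (current.keys - desired.keys).each { |name| changes << [:delete, name] }
  changes
end

# Returns the list of changes and the resulting state.
# With dry_run: true the changes are only reported and the state is left untouched.
def apply(desired, current, dry_run: false)
  changes = plan(desired, current)
  return [changes, current] if dry_run
  [changes, desired.dup]
end

desired = { "web" => { port: 80 }, "ssh" => { port: 22 } }
current = { "web" => { port: 8080 }, "manual" => { port: 9999 } }

changes, _state = apply(desired, current, dry_run: true)
changes.each { |action, name| puts "#{action} #{name}" }

_changes, synced = apply(desired, current)
# Planning again against the synced state yields no changes: applying is idempotent.
puts plan(desired, synced).empty?
```

Running the plan before every apply gives reviewers the same confirmation step that the tools' dry-run mode provides.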
  10. AWS resource management

    • We rely on codenize.tools (https://codenize.tools/)
      • mainly maintained by one of our SREs
      • Terraform was not ready when we started to use AWS
      • stable enough
      • easy to use
  11. AWS resource management

    Without change tracking, the following resources tend to incur high operational costs:
    • Route53
    • Route Tables for Virtual Private Cloud
    • Identity and Access Management
    • Security Group
    • Elastic IP Addresses
    • Elastic Load Balancer
    • S3 Bucket Policy
    • CloudWatch Logs & Alarms
  12. Route53

    • DNS service of AWS
    • Use Roadworker (https://github.com/codenize-tools/roadworker) to define the state of Route53 using a Ruby DSL
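Declarations of the `hosted_zone ... do rrset ... end` kind are plain Ruby: each keyword is a method call and each `do ... end` is a block. The sketch below shows one common way such a DSL can be built with `instance_eval`. The builder classes are hypothetical and written for illustration; Roadworker's real implementation may differ.

```ruby
# Hypothetical mini-DSL; Roadworker's real implementation may differ.
class RRSetBuilder
  def initialize(name, type)
    @record = { name: name, type: type, ttl: nil, resource_records: [] }
  end

  def ttl(seconds)
    @record[:ttl] = seconds
  end

  def resource_records(*values)
    @record[:resource_records] = values
  end

  def to_h
    @record
  end
end

class HostedZoneBuilder
  def initialize(name)
    @zone = { name: name, rrsets: [] }
  end

  # Evaluates the block with the builder as `self`, so `ttl` and
  # `resource_records` inside the block resolve to builder methods.
  def rrset(name, type, &block)
    builder = RRSetBuilder.new(name, type)
    builder.instance_eval(&block)
    @zone[:rrsets] << builder.to_h
  end

  def to_h
    @zone
  end
end

def hosted_zone(name, &block)
  builder = HostedZoneBuilder.new(name)
  builder.instance_eval(&block)
  builder.to_h
end

zone = hosted_zone "example.com." do
  rrset "example.com.", "A" do
    ttl 300
    resource_records(
      "127.0.0.1",
      "127.0.0.2"
    )
  end
end

p zone
```

The result is an ordinary data structure, which a tool can then compare with the state reported by the AWS API.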
  13. Route53

    (e.g.) Declaration of an "example.com" A record in Route53

        hosted_zone "example.com." do
          rrset "example.com.", "A" do
            ttl 300
            resource_records(
              "127.0.0.1",
              "127.0.0.2"
            )
          end
        end
  14. VPC Route Tables

    • Set of rules that determine where network traffic is directed
    • Use Mappru (https://github.com/codenize-tools/mappru) to define the state of VPC Route Tables using a Ruby DSL
  15. VPC Route Tables

    (e.g.) Declaration of Route Tables for vpc-12345678

        vpc "vpc-12345678" do
          route_table "foo-rt" do
            subnets "subnet-12345678"
            route destination_cidr_block: "0.0.0.0/0", gateway_id: "igw-12345678"
            route destination_cidr_block: "192.168.100.101/32", network_interface_id: "eni-12345678"
          end

          route_table "bar-rt" do
            subnets "subnet-87654321"
            route destination_cidr_block: "192.168.100.102/32", network_interface_id: "eni-87654321"
          end

          # Undefined Route Table will be ignored
        end
  16. Security Groups

    • A security group is a virtual firewall that controls the traffic for one or more instances
    • Use Piculet (https://github.com/codenize-tools/piculet) to define the state of Security Groups using a Ruby DSL
  17. Security Groups

    (e.g.) Declaration of Security Groups for vpc-XXXXXXXX

        ec2 "vpc-XXXXXXXX" do
          security_group "default" do
            description "default VPC security group"

            tags(
              "key1" => "value1",
              "key2" => "value2"
            )

            ingress do
              permission :tcp, 22..22 do
                ip_ranges(
                  "0.0.0.0/0"
                )
              end
              permission :tcp, 80..80 do
                ip_ranges(
                  "0.0.0.0/0"
                )
              end
              permission :udp, 60000..61000 do
                ip_ranges(
                  "0.0.0.0/0"
                )
              end
              # ESP (IP Protocol number: 50)
              permission :"50" do
                ip_ranges(
                  "0.0.0.0/0"
                )
              end
              permission :any do
                groups(
                  "any_other_group",
                  "default"
                )
              end
            end

            egress do
              permission :any do
                ip_ranges(
                  "0.0.0.0/0"
                )
              end
            end
          end

          security_group "any_other_group" do
            description "any_other_group"

            tags(
              "key1" => "value1",
              "key2" => "value2"
            )

            egress do
              permission :any do
                ip_ranges(
                  "0.0.0.0/0"
                )
              end
            end
          end
        end
  18. Elastic IP Addresses

    • An Elastic IP address is a static IPv4 address designed for dynamic cloud computing
    • Use Eipmap (https://github.com/codenize-tools/eipmap) to define the state of Elastic IP Addresses using a Ruby DSL
  19. Elastic IP Addresses

    (e.g.) Declaration of Elastic IP Addresses

        domain "standard" do
          ip "54.256.256.1"
          ip "54.256.256.2", :instance_id=>"i-12345678"
        end

        domain "vpc" do
          ip "54.256.256.11", :network_interface_id=>"eni-12345678", :private_ip_address=>"10.0.1.1"
          ip "54.256.256.12", :network_interface_id=>"eni-12345678", :private_ip_address=>"10.0.1.2"
          ip "54.256.256.13"
        end
  20. Identity and Access Management

    • AWS Identity and Access Management (IAM) is a web service that helps us securely control access to AWS resources
    • Use Miam (https://github.com/codenize-tools/miam) to define the state of IAM using a Ruby DSL
  21. Identity and Access Management

        user "takayuki-watanabe", :path=>"/infra/" do
          login_profile :password_reset_required=>false

          groups(
            "Admin"
          )
        end

        group "Admin", :path => "/admin/" do
          policy "Admin" do
            {"Statement"=>[{"Effect"=>"Allow", "Action"=>"*", "Resource"=>"*"}]}
          end
        end
  22. Elastic Load Balancing

    • Elastic Load Balancing distributes incoming application traffic across multiple targets, such as Amazon EC2 instances, containers, and IP addresses
    • Use Kelbim (https://github.com/codenize-tools/kelbim) to define the state of Elastic Load Balancing using a Ruby DSL
  23. Elastic Load Balancing

    (e.g.) Declaration of Elastic Load Balancing for vpc-XXXXXXXX

        ec2 "vpc-XXXXXXXXX" do
          load_balancer "my-load-balancer", :internal => true do
            instances(
              "nyar",
              "yog"
            )
            # or `any_instances`

            listeners do
              listener [:tcp, 80] => [:tcp, 80]
              listener [:https, 443] => [:http, 80] do
                app_cookie_stickiness "CookieName"=>"20"
                ssl_negotiation ["Protocol-TLSv1", "Protocol-SSLv3", "AES256-SHA", ...]
                server_certificate "my-cert"
              end
            end

            health_check do
              target "TCP:80"
              timeout 5
              interval 30
              healthy_threshold 10
              unhealthy_threshold 2
            end

            attributes do
              access_log :enabled => true, :s3_bucket_name => "any_bucket", :s3_bucket_prefix => nil, :emit_interval => 60
              cross_zone_load_balancing :enabled => true
              connection_draining :enabled => false, :timeout => 300
            end

            subnets(
              "subnet-XXXXXXXX"
            )

            security_groups(
              "default"
            )
          end
        end
  24. S3 Bucket Policy

    • Bucket policies and user policies are two of the access policy options available for granting permissions to Amazon S3 resources
    • Use Bukelatta (https://github.com/codenize-tools/bukelatta) to define the state of S3 Bucket Policies using a Ruby DSL
  25. S3 Bucket Policy

    (e.g.) Declaration of S3 Bucket Policy for foo-bucket

        bucket "foo-bucket" do
          {
            "Version"=>"2012-10-17",
            "Id"=>"AWSConsole-AccessLogs-Policy-XXX",
            "Statement"=> [
              {
                "Sid"=>"AWSConsoleStmt-XXX",
                "Effect"=>"Allow",
                "Principal"=>{"AWS"=>"arn:aws:iam::XXX:root"},
                "Action"=>"s3:PutObject",
                "Resource"=> "arn:aws:s3:::foo-bucket/AWSLogs/XXX/*"
              }
            ]
          }
        end
  26. CloudWatch Logs & Alarm

    • Amazon CloudWatch offers cloud monitoring services for customers of AWS resources
    • Use Meteorlog (https://github.com/codenize-tools/meteorlog) to define the state of CloudWatch Logs using a Ruby DSL
    • Use Radiosonde (https://github.com/codenize-tools/radiosonde) to define the state of CloudWatch Alarms using a Ruby DSL
  27. CloudWatch Logs

    (e.g.) Declaration of CloudWatch Logs streams

        log_group "/var/log/messages" do
          log_stream "my-stream"

          metric_filter "MyAppAccessCount" do
            metric :name=>"EventCount", :namespace=>"YourNamespace", :value=>"1"
          end

          metric_filter "MyAppAccessCount2" do
            filter_pattern '[ip, user, username, timestamp, request, status_code, bytes > 1000]'
            metric :name=>"EventCount2", :namespace=>"YourNamespace2", :value=>"2"
          end
        end

        log_group "/var/log/maillog" do
          log_stream "my-stream2"

          metric_filter "MyAppAccessCount" do
            filter_pattern '[..., status_code, bytes]'
            metric :name=>"EventCount3", :namespace=>"YourNamespace", :value=>"1"
          end

          metric_filter "MyAppAccessCount2" do
            filter_pattern '[ip, user, username, timestamp, request = *html*, status_code = 4*, bytes]'
            metric :name=>"EventCount4", :namespace=>"YourNamespace2", :value=>"2"
          end
        end
  28. CloudWatch Alarms

    (e.g.) Declaration of CloudWatch Alarms

        alarm "alarm1" do
          namespace "AWS/EC2"
          metric_name "CPUUtilization"
          dimensions "InstanceId"=>"i-XXXXXXXX"
          period 300
          statistic :average
          threshold ">=", 50.0
          evaluation_periods 1
          actions_enabled true
          alarm_actions []
          ok_actions []
          insufficient_data_actions ["arn:aws:sns:us-east-1:123456789012:my_topic"]
        end

        alarm "alarm2" do
          ...
        end
  29. MySQL privileges

    • The privileges granted to a MySQL account determine which operations the account can perform
    • Use Gratan (https://github.com/codenize-tools/gratan) to define the state of MySQL access privileges using a Ruby DSL
  30. MySQL privileges

    (e.g.) Declaration of MySQL privileges

        user "bob", "%" do
          on "*.*" do
            grant "USAGE"
          end

          on "test.*", expired: '2014/10/08', identified: "PASSWORD '*ABCDEF'" do
            grant "SELECT"
            grant "INSERT"
          end

          on /^foo\.prefix_/ do
            grant "SELECT"
            grant "INSERT"
          end
        end

        user "bob", ["localhost", "192.168.%"], expired: '2014/10/10' do
          on "*.*", with: 'GRANT OPTION' do
            grant "ALL PRIVILEGES"
          end
        end
  31. Online schema migration

    • pt-online-schema-change (https://www.percona.com/doc/percona-toolkit/3.0/pt-online-schema-change.html) performs online, non-blocking schema changes to a table
    • Use Departure (https://github.com/departurerb/departure), which works with the regular Rails migrations DSL and needs no separate DSL (under trial)
    • Departure uses the pt-online-schema-change command-line tool of Percona Toolkit, which runs MySQL ALTER TABLE statements without downtime
  32. Fastly

    • We use Fastly as our CDN
    • Use codily (https://github.com/sorah/codily) to define the state of Fastly using a Ruby DSL
  33. Fastly

    (e.g.) Declaration of Fastly configurations

        service "foo" do
          response_object "method not allowed" do
            status "405"
            response "Method Not Allowed"
            content "405"
            content_type "text/plain"

            request_condition "request method is not GET, HEAD or FASTLYPURGE" do
              priority 10
              statement '!(req.request == "GET" || req.request == "HEAD" || req.request == "FASTLYPURGE")'
            end
          end
        end

        # is equivalent to:

        service "foo" do
          condition "request method is not GET, HEAD or FASTLYPURGE" do
            priority 10
            statement '!(req.request == "GET" || req.request == "HEAD" || req.request == "FASTLYPURGE")'
            type "REQUEST"
          end

          response_object "method not allowed" do
            status "405"
            response "Method Not Allowed"
            content "405"
            content_type "text/plain"
            request_condition "request method is not GET, HEAD or FASTLYPURGE"
          end
        end
  34. Server configurations

    • Around a thousand EC2 instances are running on AWS
    • We previously used Puppet for configuration management
    • We want a lightweight tool like Ansible, but we also want a Ruby DSL
  35. Itamae

    • Configuration management tool inspired by Chef
      • An itamae (板前) is a cook in a Japanese kitchen
    • Chef-like Ruby DSL (but not compatible with Chef)
    • Simpler and lighter weight than Chef
      • Only recipes
      • Apply recipes to a local machine
      • Apply recipes to a remote machine over ssh
    • Idempotent
  36. Itamae

    (e.g.) A sample recipe for nginx

        package 'nginx' do
          action :install
        end

        service 'nginx' do
          action [:enable, :start]
        end

        template "/path/to/dest" do
          action :create
          source "template.erb"
          variables(message: "World")
        end

        # template.erb
        Hello, <%= @message %>
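The template resource in the recipe passes `message: "World"` to an ERB file that reads it as `@message`. A minimal sketch of that mechanism using only Ruby's standard `erb` library follows. The `TemplateContext` class is a hypothetical helper for illustration and is not Itamae's actual code.

```ruby
require "erb"

# Hypothetical helper, not Itamae's implementation: exposes each variable
# as an instance variable and renders the template in that context.
class TemplateContext
  def initialize(variables)
    variables.each { |key, value| instance_variable_set("@#{key}", value) }
  end

  def render(source)
    # `binding` captures this object as `self`, so the template can read @message.
    ERB.new(source).result(binding)
  end
end

rendered = TemplateContext.new(message: "World").render("Hello, <%= @message %>")
puts rendered # prints "Hello, World"
```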
  37. Capistrano

    • Deploy Rails applications via Capistrano 3
    • Use Capistrano::BundleRsync (https://github.com/sonots/capistrano-bundle_rsync)
    • A chat bot can invoke the deploy jobs via a deploy server
  38. Problems for these tools

    • The tools explained in the previous slides work quite well for stage 2 and for personal use, even though only a limited number of SREs can apply them to production environments
    • But if only SREs have the privileges to use these tools, development may not scale at stage 3
  39. Problems for these tools (examples)

    • Developers cannot update environment variables
      • SREs deploy them via Itamae
    • Developers cannot install software by themselves
      • SREs install it via Itamae
    • Developers cannot start using new AWS resources quickly
      • SREs deploy them via Codenize tools
    • SREs and developers cannot work productively
      • Frequent ops requests to SREs become a bottleneck for development
      • Some authority and responsibility should be given to developers
  40. Docker containers on ECS

    • Use Amazon ECS
      • ECS allows us to easily run and manage Docker-enabled applications across a cluster of EC2 instances
    • Use hako (https://github.com/eagletmt/hako) to deploy Docker containers onto ECS clusters
    • Some applications will use this container environment
  41. Deployment flow with Hako

    • Deploy containers via hako to ECS clusters and inject the necessary data
      • Docker images are stored in ECR
      • Credentials are stored in Vault
      • Container app definitions are managed in YAML
  42. Deployment flow with Hako

    (e.g.) A sample hako app definition file

        scheduler:
          type: ecs
          region: ap-northeast-1
          cluster: eagletmt
          desired_count: 2
          task_role_arn: arn:aws:iam::012345678901:role/Hello
          deployment_configuration:
            maximum_percent: 200
            minimum_healthy_percent: 50
        app:
          image: ryotarai/hello-sinatra
          memory: 128
          cpu: 256
          links:
            - redis:redis
          env:
            $providers:
              - type: file
                path: hello.env
            PORT: '3000'
            MESSAGE: '#{username}-san'
        additional_containers:
          front:
            image_tag: hako-nginx
            memory: 32
            cpu: 32
          redis:
            image_tag: redis:3.0
            cpu: 64
            memory: 512
        scripts:
          - <<: !include front.yml
            backend_port: 3000
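The `deployment_configuration` values in the sample bound how many tasks may run during a rolling deployment. The sketch below works through the arithmetic for `desired_count: 2`. The `deployment_bounds` helper is hypothetical, and the rounding (minimum rounded up, maximum rounded down) is an assumption based on how the Amazon ECS documentation describes these two percentages.

```ruby
# Hypothetical helper for illustration; the rounding directions are an assumption
# taken from the Amazon ECS description of minimumHealthyPercent and maximumPercent.
def deployment_bounds(desired_count:, maximum_percent:, minimum_healthy_percent:)
  {
    # Lower limit on tasks that must stay running during a deployment.
    min_healthy: (desired_count * minimum_healthy_percent / 100.0).ceil,
    # Upper limit on tasks that may run at once during a deployment.
    max_running: (desired_count * maximum_percent / 100.0).floor
  }
end

bounds = deployment_bounds(desired_count: 2, maximum_percent: 200, minimum_healthy_percent: 50)
puts "min healthy: #{bounds[:min_healthy]}, max running: #{bounds[:max_running]}"
# prints "min healthy: 1, max running: 4"
```

With these settings a deployment may start two new tasks before stopping the old ones, or stop one old task first if the cluster is short on capacity.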
  43. Deployment from Slack

    • Invoke deploy jobs defined on Rundeck via a chat bot
    • Use ruboty (https://github.com/r7kamura/ruboty) for ChatOps on Slack
  44. Introduction of container environments

    • Developers can update environment variables
      • The hako app YAML holds the environment variables for each application
    • Developers can install necessary software by themselves
      • Docker images include all the software for each application
    • Developers can start using new AWS resources quickly
      • Many AWS resources are ready for use after deploying containers to our ECS clusters
    • SREs and developers become more productive
      • Authority and responsibility are given to developers
  45. Humble opinions

    • An organization can become big suddenly
      • Traditional development styles might suddenly stop working and need to change
    • There are technologies that support us and give us more scalable environments
      • (e.g.) Virtual machines → containers
      • (e.g.) Monolithic architecture → microservice architecture
    • But engineers cannot change their traditional workflows suddenly
      • Investigation and research at stage 2 are really important for development scalability at stage 3
    • At this moment, Kubernetes is a better choice than ECS for container orchestration
      • Many players in the container space are joining Kubernetes and developing its ecosystem (standing on the shoulders of giants)
  46. Recap

    • Selecting infrastructure platforms and approaches that fit the organization's growth is important
    • Writing infrastructure in a Ruby DSL is pretty easy and works well
    • When an organization becomes big, traditional workflows might not work