
HashiConf 2015

Jeremy Carroll
September 29, 2015

Packer @ Pinterest

Transcript

  1. Packer @ Pinterest
    Jeremy Carroll - Site Reliability Engineer


  2. HashiConf agenda
    What we will talk about
    1. Images: all about machine images
    2. Build System: system design to create images
    3. Management: lifecycle policies for machine images

  3. 50+ Billion Pins
    categorized by people into more than 1 Billion Boards

  4. Baking vs Frying

  5. Baking vs Frying
    Baking: a purpose-built machine image, with all the requisite software to fulfill a specific task.
    Frying: produce a reusable machine image, then retrieve, install, and apply essential components or configuration at runtime.

  6. Frying
    • Less state, more dependencies
    • More reusable, fewer images
    • Slower boot times
    [Diagram: OS Packages, OS Config, App Dependencies, App Config, and Application Code, divided between AMI and Runtime]

  7. Baking
    • More state, fewer dependencies
    • Not very reusable, more images
    • Faster boot times
    • AutoScaling a primary driver
    [Diagram: OS Packages, OS Config, App Dependencies, App Config, and Application Code, divided between AMI and Runtime]

  8. SLA Inversion
    What’s the availability of your launch process?
    My App (availability: ?) depends on:
    - Apt Repo 1: 99.9%
    - Public Repo 1: NO SLA
    - Python Repo 1: 99.9%
    - DNS: 99.99%
    - Puppet: 99.9%
    - DynamoDB: 99.99%
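
The dependency list above can be turned into a rough number. This is a minimal sketch, assuming that a launch needs every dependency and that failures are independent (both are assumptions of this example, not claims from the talk). The repo with no SLA is left out because it has no figure to multiply.

```python
from math import prod

# Availability figures as listed on the slide (fractions, not percentages).
# "Public Repo 1" has no SLA, so it cannot be included in the product.
dependencies = {
    "Apt Repo 1": 0.999,
    "Python Repo 1": 0.999,
    "DNS": 0.9999,
    "Puppet": 0.999,
    "DynamoDB": 0.9999,
}


def launch_availability(deps):
    """Upper bound on launch availability when every dependency is required
    and failures are independent (an assumption of this sketch)."""
    return prod(deps.values())


composite = launch_availability(dependencies)
print(f"Launch process availability: at most {composite:.4%}")
```

Under these assumptions the result is roughly 99.68%, which is lower than any single dependency in the list. That is the inversion the slide title refers to.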

  9. Overview
    Machine Images: Key Points
    • The “Bake vs Fry” debate is moot. Most environments do both.
    • It’s your choice how much state to move into the image and how much to leave for runtime, depending on your requirements
    • Baking more into images increases reliability by reducing or removing dependent services

  10. Build System
    Reliable / repeatable / testable build infrastructure for the Cloud

  11. Machine Image Building
    AWS Workflows
    [Diagrams: AMI lifecycle, Instance-Store Image Creation, EBS Image Creation]
    Source: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AMIs.html

  12. Image States
    Checkpoints for Machine Images
    Layer 0: Foundation
    - Barebones OS
    - Standard Linux distribution
    - Minified file set
    - Built from scratch, or using a public cloud image
    Layer 1: Base (ancestor: Foundation)
    - “Pinterest” platform OS
    - Security best practices
    - Basic packages common across all images
    - Kernel upgrades + performance tuning
    Layer 2: Application (ancestor: Base)
    - Application specific tuning
    - May contain code + configs
    - Not reusable past this specific application

  13. Previous System
    Creating an image

    #!/bin/bash
    ...
    echo "Copying Files - $files"
    /bin/cp -rp --force $files/* $imagedir/

    # Copy each provisioning script into the chroot and run it there
    for s in $(ls $scripts)
    do
      echo "Copying Script to CHROOT - $scripts/$s"
      cp $scripts/$s $imagedir/mnt/scripts
      echo "Executing Script inside of CHROOT - /mnt/scripts/$s"
      sudo -E chroot $imagedir /mnt/scripts/$s
    done

    # Bundle the image, publish it to S3, and extract the resulting AMI ID
    tar -czf /mnt/$imagename.tar.gz $work_dir/*
    publish="/usr/bin/cloud-publish-tarball"
    mkdir -p /mnt/bundle
    export TEMPDIR="/mnt/bundle"
    build_output=$($publish -q -k $akiid --rename-image $name /mnt/$imagename.tar.gz $s3_bucket)
    amiid=$(echo $build_output | cut -f2 -d'"')

    BASH based build system
    • Used to create a ‘Base’ AMI which was used to launch all instances.
    • One image to rule them all
    • Each instance launched would then run Puppet to build the machine at runtime
    • Only handled instance-store images
    • Built by bash, not a Config Management tool. Difficult to make repeatable / deterministic.

  14. Image Build System
    Components (created w/ Open Source tooling)
    • Jenkins
    - Build management & orchestration
    • Packer
    - Machine Image Creation
    • ServerSpec
    - Machine Image testing framework
    • CloudInit
    - Runtime initialization system
    • Image Registry
    - Machine Image metadata service
    - Cloud CMDB. An OSS example is Edda from Netflix
    • Puppet
    - Configuration management tool
    - Infrastructure as Code. Reproducible results
    [Diagram: Jenkins, Packer, Puppet, ServerSpec, CloudInit, and the Image Registry around the Machine Image]

  15. Jenkins
    Image Build Manager: Build Scheduling
    • User interface for configuring new machine image builds
    • On code check-in, build and test a new image
    • Configured with many build executors, as Packer can create many images concurrently
    • Visualizes test suite output of all jobs in the workflow

  16. Jenkins
    Build Process
    • Jenkins jobs set parameters for build criteria
    • They then trigger the downstream ‘Packer’ build driver to create the AMI artifact.
    • The upstream job blocks until the Packer job finishes. It marks its build as FAILURE if the Packer build driver job fails.
    • The job then copies the .properties artifact from the Packer job, which contains the AMI ID
    • It can then use this ID to trigger a validation job: launch this image and run the Spec tests on it. On pass, mark the AMI as ‘tested’.
    • We use the ‘yarjuf’ gem for JUnit reports
    Workflow
    [Diagram: Machine Image jobs (Base - 12.04, MyApp 1, MyApp 2) pass params to the Packer job, which passes params to the Validation job]
    Packer job parameters
    - Unit Tests
    - Region
    - Version
    - Ancestor Image
    - Puppet role / classes
    - Application Name
    - Instance Type (c3.2xlarge)
    … and more
    Validation job parameters
    - AMI ID
    - Integration Tests
    … and more
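
The hand-off of the AMI ID through a .properties artifact can be sketched as follows. This is only an illustration: the key names (`AMI_ID`, `REGION`) and the file contents are hypothetical, since the talk does not show the artifact itself.

```python
def read_properties(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


# Hypothetical contents of the artifact written by the Packer job.
sample = """
# written by the packer build driver (illustrative)
AMI_ID=ami-12345678
REGION=us-east-1
"""

props = read_properties(sample)
ami_id = props["AMI_ID"]
print(f"Triggering validation for {ami_id}")
```

A flat key=value file keeps the contract between the two jobs small: the upstream job only needs the ID to launch the image and run its tests.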

  17. Build Process
    From Foundation, to Application
    [Diagram: two chained builds, Base and App. In each, a Jenkins build supplies packer.json (build template) + Env Vars / Conf + Tests to Packer (Image Builder + Test Framework). Packer reads the ancestor AMI metadata from the Image Registry (Foundation AMI for the Base build, Base AMI for the App build), creates the Machine Image (AMI), and updates the Image Registry with the AMI metadata (EC2).]

  18. Packer
    Image Creation Framework: Features
    • Packer is a tool for creating identical machine images from a single source configuration
    • We use Packer as a framework with which to build machine images
    • Platform agnostic. Supports many output formats (AWS, Google, Docker, etc.)
    • Replaced hard to maintain Bash build infrastructure
    • High levels of concurrency

  19. Template Renderer
    Build Pipeline: Packer manifest creator
    • We have Jenkins execute Python code which queries the AMI Registry to retrieve the ancestor Image ID to build from. It reuses ENV variables
    • Users specify a query like “Most recent tested version of the Base Image for Ubuntu 12.04 running a 3.19.7 kernel” based on Jenkins build parameters
    • The code then merges all of these options into a dictionary template based on the build type (Foundation, Base, Application), then renders it as a packer.json file

    python create_packer_template.py
    # LAUNCH PACKER
    packer build -debug packer.json 2>&1 | tee logfile
    # PIPESTATUS[0] is packer's exit code, not tee's
    ESTATUS=${PIPESTATUS[0]}
    echo "the exit code of packer was $ESTATUS"
    if [ $ESTATUS -ne 0 ]; then
      exit 1
    else
    . . .
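
The merge-and-render step described above might look roughly like this. It is a sketch, not the contents of create_packer_template.py: the function name, the per-build-type defaults, the environment variable names, and the choice of the amazon-ebs builder are all assumptions made for illustration.

```python
import json
import os

# Hypothetical per-build-type defaults; the real options are not shown in the talk.
BUILD_TYPE_DEFAULTS = {
    "foundation": {"type": "foundation"},
    "base": {"type": "base"},
    "application": {"type": "application"},
}


def render_packer_template(build_type, source_ami, params):
    """Merge build parameters into a dictionary and return it as packer.json text."""
    variables = dict(BUILD_TYPE_DEFAULTS[build_type])
    variables.update(params)
    variables["source_ami"] = source_ami
    template = {
        "variables": variables,
        "builders": [
            {
                "type": "amazon-ebs",
                "region": "{{user `region`}}",
                "source_ami": "{{user `source_ami`}}",
                "instance_type": "{{user `instance_type`}}",
                "ssh_username": "ubuntu",
                "ami_name": "{{user `application`}}-{{user `version`}}",
            }
        ],
    }
    return json.dumps(template, indent=2, sort_keys=True)


# Jenkins build parameters usually arrive as environment variables.
params = {
    "application": os.environ.get("APPLICATION", "mysql"),
    "version": os.environ.get("VERSION", "0.1.3"),
    "region": os.environ.get("REGION", "us-east-1"),
    "instance_type": os.environ.get("INSTANCE_TYPE", "c3.2xlarge"),
}
# In the pipeline described above, this ID would come from the registry query.
rendered = render_packer_template("application", "ami-12345678", params)
print(rendered)
```

Generating the file from a dictionary means the ancestor ID is resolved once per build and written into the manifest, so the Packer run itself does not need to reach the registry.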

  20. Packer Manifest
    Wrapped in Image Registry APIs
    {
      "variables": {
        "kernel": "3.18.7",
        "owner": "Jeremy Carroll",
        "instance_type": "c3.2xlarge",
        "build_date": "20150923222114",
        "source_ami": "ami-1234567",
        "application": "mysql",
        "version": "0.1.3",
        "tests": "{{env `TESTS`}}",
        "region": "us-east-1"
      },
      "builders": [
        {
          "bundle_destination": "/mnt",
          "ssh_port": "22",
          "user_data": "#cloud-config\ndisable_root: false\ndisable_root_opts:",
          "ami_name": "{{user `application`}}-{{user `version`}}-{{user `architecture`}}-{{user `user_timestamp`}}-{{user `type`}}",
          "bundle_prefix": "{{user `application`}}-{{user `version`}}-{{user `architecture`}}-{{user `user_timestamp`}}-{{user `type`}}",
          "iam_instance_profile": "provisioning",
          "enhanced_networking": "{{user `enhanced_networking`}}",
          "type": "amazon-instance",
          "tags": {
            "environment": "test",
            "version": "0.1.3",
            "application": "mysql",
            "ancestor": "ami-12345678"
          },
          "ami_description": "store=amazon-instance,ancestor_id=ami-12345678,version=0.1.3,env=test,app=mysql,.. , …"
        }
      ]
    }
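
The ami_description value in the manifest packs metadata into a comma-separated key=value string. A minimal sketch of reading it back, assuming every field follows that pattern (the helper name is hypothetical, and only the fields visible on the slide are used):

```python
def parse_ami_description(description):
    """Split a 'key=value,key=value' description string into a dictionary."""
    metadata = {}
    for field in description.split(","):
        key, sep, value = field.strip().partition("=")
        if sep and key:
            metadata[key] = value
    return metadata


# Only the fields visible on the slide; the rest of the string is elided there.
description = "store=amazon-instance,ancestor_id=ami-12345678,version=0.1.3,env=test,app=mysql"
metadata = parse_ami_description(description)
print(metadata["ancestor_id"], metadata["app"])
```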

  21. Testing
    Image Test Framework
    • RSpec tests for your servers
    • JUnit output (test status via Jenkins)
    • Tests run after the image build completes, but before packaging + upload
    Integration Tests / Acceptance Tests

    require 'spec_helper'

    describe file('/usr/sbin/policy-rc.d') do
      it { should_not be_file }
    end

    # No SSH key pair may be left behind in the image
    describe file('/root/.ssh/id_rsa') do
      it { should_not be_file }
    end

    describe file('/root/.ssh/id_rsa.pub') do
      it { should_not be_file }
    end

    describe file('/etc/timezone') do
      it { should be_file }
      it { should contain 'Etc/UTC' }
    end

    describe service("nslcd") do
      it { should be_running }
    end

    # The user must resolve through the directory service
    describe command('getent passwd jeremy') do
      its(:exit_status) { should eq 0 }
    end

    # ntpq marks the currently selected time source with a leading '*'
    describe command('ntpq -pn|grep -E "^\*"') do
      its(:exit_status) { should eq 0 }
    end

  22. CloudInit
    Runtime Configuration: Initialization routines
    • Some things cannot be baked into an image. CloudInit handles these cases for us at runtime
    • Mounting and formatting ephemeral volumes (RAID / JBOD)
    • Registering DNS entries, calling APIs
    • Performing an optional ‘delta’ Puppet run in ‘blocking’ or ‘non-blocking’ mode.
    • SSH Key management
    • Hostnames / UUID generation
    • Locking for once only, per boot, per instance semantics. No need to re-register DNS on a reboot
    [Diagram: boot-time steps: Disks, RAID, Logs, EBS, Hostname (uuid), /etc/fstab, SSH Keys (root), Puppet Delta, DNS (Route53)]
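
The once only / per boot / per instance locking mentioned above can be illustrated with marker files. This is a sketch of the idea, not CloudInit's own implementation: the function, the marker naming, and the state directory are hypothetical.

```python
import os
import tempfile


def should_run(task, frequency, state_dir, instance_id, boot_id):
    """Return True the first time a task is seen in the given scope, and
    record a marker file so later calls in that scope return False."""
    scope = {
        "once": "once",
        "per-instance": f"instance-{instance_id}",
        "per-boot": f"boot-{boot_id}",
    }[frequency]
    marker = os.path.join(state_dir, f"{task}.{scope}")
    try:
        # O_EXCL makes creation atomic: only one caller can create the marker.
        fd = os.open(marker, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


state_dir = tempfile.mkdtemp()
# The first boot registers DNS; a reboot of the same instance does not.
dns_first_boot = should_run("register-dns", "per-instance", state_dir, "i-0001", "1")
dns_after_reboot = should_run("register-dns", "per-instance", state_dir, "i-0001", "2")
# Per-boot tasks, such as mounting ephemeral disks, run on every boot.
mount_boot_1 = should_run("mount-disks", "per-boot", state_dir, "i-0001", "1")
mount_boot_2 = should_run("mount-disks", "per-boot", state_dir, "i-0001", "2")
print(dns_first_boot, dns_after_reboot, mount_boot_1, mount_boot_2)
```

Keying the marker on the instance ID matters for baked images: a marker left in the image by the build instance would not suppress the task on a new instance launched from it.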

  23. Puppet
    Repeatable builds: Fry based provisioner
    • We use Puppet to manage configuration for instances after launch
    • Also used during the baking process to have a repeatable, deterministic system to configure machine images
    • Be aware of dynamic fact driven templates. The instance that bakes the image will not be the same instance that runs it
    • These types of configurations have to be rendered at runtime, or mitigated
    • We use CloudInit for these types of challenges.

  24. Build Times
    Smaller is better
    Build time in minutes (numbers will vary)
    - amazon-instance: 10
    - amazon-ebs: 6
    - amazon-chroot: 4
    • Each instance type can have a 1-2 minute boot time. An instance-store AMI that is not cached is slower
    • Creating / uploading the artifact to S3 is the slowest part of the instance-store builder (5-6 mins)
    • The EBS builder uses snapshots, which do not involve S3 uploading
    • Chroot has the downside of needing instances always running (with the same OS / kernel type) to create images
    • Build times can be largely dominated by Puppet run times. Delta builds can shave many minutes off creation time

  25. Image Registry
    managing artifacts


  26. Image Metadata
    Supporting registry + search
    Image Attributes
    • Versioning
    • Application Name
    • Environment Name
    • Ancestor Image ID
    • Owner Name
    EC2 Attributes
    • Volume Type (instance-store / ebs)
    • Virtualization Type (HVM / PV)
    • Has an API for use in code to find AMI IDs

    curl 'https://images/api/v1?application=myapp&release=precise&virtualization_type=hvm&environment=prod&instance_type=instance-store'
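
The same registry query can be assembled in code. A small sketch that only builds the URL, reusing the host, path, and parameter names from the curl example (the endpoint is internal, so no request is made here, and the helper name is hypothetical):

```python
from urllib.parse import urlencode

# Host and path copied from the curl example; the endpoint is internal.
REGISTRY_URL = "https://images/api/v1"


def build_image_query(**filters):
    """Return the registry search URL for the given attribute filters."""
    return f"{REGISTRY_URL}?{urlencode(filters)}"


url = build_image_query(
    application="myapp",
    release="precise",
    virtualization_type="hvm",
    environment="prod",
    instance_type="instance-store",
)
print(url)
```

Using `urlencode` avoids hand-assembling the query string, and values containing spaces or other reserved characters are escaped automatically.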

  27. Automation
    Utilities for operational excellence
    Image Janitor
    • Rules engine
    - ID not referenced in a Launch Configuration
    - Last launch date
    - Last referenced as an ancestor
    - How many images to keep, how long to keep them, etc.
    Security
    • Events such as an OpenSSL vulnerability
    • Package metadata to find contents of images for patch management
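
The janitor rules listed above can be sketched as a filter over image records. The `Image` fields, thresholds, and function name are hypothetical, and "last referenced as an ancestor" is simplified to a boolean here; this is not the janitor from the talk.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Image:
    ami_id: str
    created: datetime
    last_launched: Optional[datetime] = None
    in_launch_config: bool = False
    is_ancestor: bool = False


def images_to_delete(images, now, keep_newest=3, max_idle=timedelta(days=90)):
    """Return the images that no rule protects, oldest first."""
    newest = sorted(images, key=lambda i: i.created, reverse=True)[:keep_newest]
    protected = {i.ami_id for i in newest}
    candidates = []
    for image in images:
        if image.ami_id in protected:
            continue  # always keep the N most recent images
        if image.in_launch_config or image.is_ancestor:
            continue  # still referenced by a launch config or a child image
        last_used = image.last_launched or image.created
        if now - last_used > max_idle:
            candidates.append(image)
    return sorted(candidates, key=lambda i: i.created)


now = datetime(2015, 9, 29)
images = [
    Image("ami-old-unused", datetime(2015, 1, 1)),
    Image("ami-old-in-use", datetime(2015, 1, 2), in_launch_config=True),
    Image("ami-recent", datetime(2015, 9, 1)),
]
stale = images_to_delete(images, now, keep_newest=1)
print([i.ami_id for i in stale])
```

Each rule only ever protects an image, so adding a rule can never cause an extra deletion. That is a conservative default for a tool that removes artifacts.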

  28. EC2 as a Registry
    Cloud Inventory: Simple Registry
    • If you do not have your own cloud inventory tracking tool, you can use EC2 tags as a simple solution
    • Only 10 tags per image, each limited to 255 characters, so very basic.
    • Can use this to help drive a simple janitor: last time an instance launched with the AMI ID, etc.
    • Example code here shows using the Boto Python library as a filter for images with tags specified
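
The example code from this slide is not part of the transcript. As a stand-in, here is a hedged sketch of filtering images by tag. It uses boto3 rather than the original Boto library, and the helper names and tag keys are assumptions; it is not the author's code.

```python
def tag_filters(**tags):
    """Build EC2 filter dictionaries of the form tag:<key> = <value>."""
    return [
        {"Name": f"tag:{key}", "Values": [value]}
        for key, value in sorted(tags.items())
    ]


def find_images(region, **tags):
    """Return AMIs owned by this account that carry all of the given tags."""
    import boto3  # imported lazily so tag_filters works without boto3 installed

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.describe_images(Owners=["self"], Filters=tag_filters(**tags))
    return response["Images"]


filters = tag_filters(application="mysql", environment="test")
print(filters)
```

With AWS credentials configured, `find_images("us-east-1", application="mysql", environment="test")` would return the matching image records, which a simple janitor could then sort by creation date.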

  29. Summary
    • If you operate in the cloud and do not want to depend on public images, you will need a ‘bakery’ of some sort, even if you do not do immutable infrastructure for autoscaling.
    • An Image Registry for discovery and management is a key enabler for this technology. Metadata for images is part of the system.
    • Use Infrastructure as Code to drive your build system, with CM systems creating reproducible / testable artifacts and testing as part of the pipeline.
    • Packer provides a framework which you can tool around to craft a system. Here is ours as an example. It is not the only way, but one that we have been successful with.
    • Examples & Tools - https://github.com/phobos182/packer_build_tools/
    Thanks for listening

  30. © Copyright, All Rights Reserved Pinterest Inc. 2015
