
HashiConf 2015

Jeremy Carroll
September 29, 2015

Packer @ Pinterest


Transcript

  1. View Slide

  2. Packer @ Pinterest
    Jeremy Carroll - Site Reliability Engineer

  3. HashiConf agenda
    What we will talk about
    1. Images: all about machine images
    2. Build System: system design to create images
    3. Management: lifecycle policies for machine images

  4. 50+ Billion Pins
    categorized by people into more than
    1 Billion Boards

  5. Baking vs Frying

  6. Baking vs Frying
    Baking: a purpose-built machine image, with all the requisite
    software to fulfill a specific task.
    Frying: produce a reusable machine image, then retrieve, install,
    and apply essential components or configuration at runtime.

  7. Frying
    • Less state - more dependencies
    • More reusable - fewer images
    • Slower boot times
    [Diagram: OS Packages, OS Config, App Dependencies, App Config, and Application Code, divided between the AMI and Runtime]

  8. Baking
    • More state - fewer dependencies
    • Not very reusable, more images
    • Faster boot times
    • AutoScaling a primary driver
    [Diagram: OS Packages, OS Config, App Dependencies, App Config, and Application Code, divided between the AMI and Runtime]

  9. SLA Inversion
    What’s the availability of your launch process?
    My App (availability: ?) depends, directly or indirectly, on:
    - Apt Repo 1: 99.9%
    - Public Repo 1: NO SLA
    - Python Repo 1: 99.9%
    - DNS: 99.99%
    - Puppet: 99.9%
    - DynamoDB: 99.99%
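The dependency figures above can be turned into a rough number. This is a minimal sketch, assuming that every dependency must be up for a launch to succeed and that failures are independent; both are simplifying assumptions, not claims from the talk. Public Repo 1 is left out because it has no SLA to multiply in, so the real figure can only be lower.

```python
from math import prod

# Availability figures taken from the slide. "Public Repo 1" has no SLA,
# so it cannot be included here.
dependencies = {
    "Apt Repo 1": 0.999,
    "Python Repo 1": 0.999,
    "DNS": 0.9999,
    "Puppet": 0.999,
    "DynamoDB": 0.9999,
}


def launch_availability(slas):
    """Best-case availability when every dependency is required (serial chain)."""
    return prod(slas.values())


if __name__ == "__main__":
    print(f"{launch_availability(dependencies):.4%}")
```

Under these assumptions the launch process comes out at about 99.68%, which is worse than any single dependency on the list. That gap is the "inversion" the slide title refers to.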


  10. Overview
    Key Points
    • The “Bake vs Fry” debate is moot. Most
    environments do both.
    • How much state you move into the image, and how much
    you leave to runtime, is your choice and depends on
    your requirements
    • Moving images toward more baking increases reliability
    by reducing or removing dependent services
    Machine Images

  11. Build System
    Reliable / repeatable / testable build infrastructure for the Cloud

  12. Machine Image Building
    AWS Workflows
    Source: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AMIs.html
    AMI lifecycle
    Instance-Store Image Creation
    EBS Image Creation


  13. Image States
    Checkpoints for Machine Images
    Foundation (Layer 0)
    - Barebones OS
    - Standard Linux distribution
    - Minified file set
    - Built from scratch, or using a public cloud image
    Base (Layer 1, ancestor: Foundation)
    - “Pinterest” platform OS
    - Security best practices
    - Basic packages common across all images
    - Kernel upgrades + performance tuning
    Application (Layer 2, ancestor: Base)
    - Application specific tuning
    - May contain code + configs
    - Not reusable past this specific application

  14. Previous System
    Creating an image
    #!/bin/bash
    ...
    echo "Copying Files - $files"
    /bin/cp -rp --force $files/* $imagedir/

    for s in $(ls $scripts)
    do
      echo "Copying Script to CHROOT - $scripts/$s"
      cp $scripts/$s $imagedir/mnt/scripts
      echo "Executing Script inside of CHROOT - /mnt/scripts/$s"
      sudo -E chroot $imagedir /mnt/scripts/$s
    done
    tar -czf /mnt/$imagename.tar.gz $work_dir/*
    publish="/usr/bin/cloud-publish-tarball"
    mkdir -p /mnt/bundle
    export TEMPDIR="/mnt/bundle"

    build_output=$($publish -q -k $akiid --rename-image $name /mnt/$imagename.tar.gz $s3_bucket)
    amiid=$(echo $build_output | cut -f2 -d'"')
    BASH based build system
    • Used to create a ‘Base’ AMI which was used to
    launch all instances.
    • One image to rule them all
    • Each instance launched would then run Puppet to
    build the machine at runtime
    • Only handled instance-store images
    • Built by bash, not a Config Management tool.
    Difficult to make repeatable / deterministic.


  15. Image Build System
    Components
    • Jenkins
    - Build management & orchestration
    • Packer
    - Machine Image Creation
    • ServerSpec
    - Machine Image testing framework
    • CloudInit
    - Runtime initialization system
    • Image Registry
    - Machine Image metadata service
    - Cloud CMDB. OSS examples being Edda from Netflix
    • Puppet
    - Configuration management tool
    - Infrastructure as Code. Reproducible results
    Created w/Open Source tooling

  16. Jenkins
    Build Scheduling
    • User interface for configuring new machine image
    builds
    • On code check-in, build and test a new image
    • Configured with many build executors, since Packer
    can create many images concurrently
    • Visualizes test suite output of all jobs in the
    workflow
    Image Build Manager

  17. Jenkins
    Build Process
    • Jenkins jobs set parameters for build criteria
    • They then trigger the downstream ‘Packer’ build
    driver job to create the AMI artifact
    • The upstream job blocks until the Packer job finishes.
    It marks its own build as FAILURE based on the result
    of the Packer build driver job
    • The job then copies the .properties artifact from
    the Packer job, which contains the AMI ID
    • It can then use this ID to trigger a validation job:
    launch this image and run the Spec tests on it. On
    pass, mark the AMI as ‘tested’
    • We use the ‘yarjuf’ gem for JUnit reports
    Workflow
    [Diagram: the machine image jobs (Base - 12.04, MyApp 1, MyApp 2) pass parameters to the Packer job, which passes parameters to the Validation job]
    Parameters passed to Packer
    - Unit Tests
    - Region
    - Version
    - Ancestor Image
    - Puppet role / classes
    - Application Name
    - Instance Type (c3.2xlarge)
    - … and more
    Parameters passed to Validation
    - AMI ID
    - Integration Tests
    - … and more
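The hand-off of the AMI ID between jobs can be sketched as follows. This is illustrative only: the key name `AMI_ID` and the sample file content are assumptions, since the slides only say that a .properties artifact from the Packer job contains the AMI ID.

```python
def parse_properties(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def ami_id_from_properties(text, key="AMI_ID"):
    """Return the AMI ID, or raise so that no validation job is triggered."""
    props = parse_properties(text)
    if key not in props:
        raise KeyError(f"{key} missing from properties artifact")
    return props[key]


if __name__ == "__main__":
    # Hypothetical artifact content written by the Packer job.
    sample = "# packer build output\nAMI_ID=ami-12345678\nREGION=us-east-1\n"
    print(ami_id_from_properties(sample))
```

Failing loudly when the key is absent mirrors the slide's point that the upstream job should inherit a failed Packer build rather than validate a stale image.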


  18. Build Process
    From Foundation, to Application
    [Diagram: the same pipeline shown twice, once building Base from Foundation and once building App from Base]
    - Jenkins Build: packer.json (build template) + Env Vars / Conf + Tests
    - Reads the ancestor's AMI metadata from the Image Registry (Foundation for the Base build, Base for the App build)
    - Packer (Image Builder + Test Framework) creates the AMI (machine image)
    - Updates the Image Registry with the new AMI metadata (EC2)

  19. Packer
    Features
    • Packer is a tool for creating identical machine
    images from a single source configuration
    • We use Packer as a framework with which to build
    machine images
    • Platform agnostic. Supports many output
    formats (AWS, Google, Docker, etc.)
    • Replaced hard-to-maintain Bash build
    infrastructure
    • High levels of concurrency
    Image Creation Framework

  20. Template Renderer
    Build Pipeline
    • We have Jenkins execute Python code which
    queries the AMI Registry to retrieve the ancestor
    Image ID to build from. Reuses ENV variables
    • Users specify a query like “Most recent tested
    version of the Base Image for Ubuntu 12.04
    running a 3.19.7 kernel” based on Jenkins build
    parameters
    • The code then merges all of these options into a
    dictionary template based on the build type
    (Foundation, Base, Application), then renders it as a
    packer.json file
    Packer manifest creator
    python create_packer_template.py
    # LAUNCH PACKER
    packer build -debug packer.json 2>&1|tee logfile
    ESTATUS=${PIPESTATUS[0]}
    echo "the exit code of packer was $ESTATUS"
    if [ $ESTATUS -ne 0 ]; then
    exit 1
    else
    . . .
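A minimal sketch of what such a renderer might look like. The real `create_packer_template.py` is not shown in the slides, so the function names, the per-type defaults, and the `lookup_ancestor` stub standing in for the AMI Registry query are all assumptions.

```python
import json

# Hypothetical per-build-type builder defaults; the real values are not in the slides.
BUILD_TYPE_DEFAULTS = {
    "foundation": {"type": "amazon-instance"},
    "base": {"type": "amazon-instance"},
    "application": {"type": "amazon-instance"},
}


def lookup_ancestor(query):
    """Stand-in for the AMI Registry query; returns a fixed placeholder ID."""
    return "ami-12345678"


def render_packer_template(build_type, params):
    """Merge build parameters into a dict and render it as packer.json text."""
    builder = dict(BUILD_TYPE_DEFAULTS[build_type])
    # The builder refers to user variables, as in the manifest on the next slide.
    builder["source_ami"] = "{{user `source_ami`}}"
    builder["region"] = "{{user `region`}}"
    variables = dict(params)
    variables["source_ami"] = lookup_ancestor(params)
    template = {"variables": variables, "builders": [builder]}
    return json.dumps(template, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(render_packer_template(
        "application",
        {"application": "mysql", "version": "0.1.3", "region": "us-east-1"},
    ))
```

Building the template as a plain dictionary and serializing it at the end avoids hand-edited JSON, which is easy to break with a stray comma or quote.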


  21. Packer Manifest
    Wrapped in Image Registry APIs
    {
      "variables": {
        "kernel": "3.18.7",
        "owner": "Jeremy Carroll",
        "instance_type": "c3.2xlarge",
        "build_date": "20150923222114",
        "source_ami": "ami-1234567",
        "application": "mysql",
        "version": "0.1.3",
        "tests": "{{env `TESTS`}}",
        "region": "us-east-1"
      },
      "builders": [
        {
          "bundle_destination": "/mnt",
          "ssh_port": "22",
          "user_data": "#cloud-config\ndisable_root: false\ndisable_root_opts:",
          "ami_name": "{{user `application`}}-{{user `version`}}-{{user `architecture`}}-{{user `user_timestamp`}}-{{user `type`}}",
          "bundle_prefix": "{{user `application`}}-{{user `version`}}-{{user `architecture`}}-{{user `user_timestamp`}}-{{user `type`}}",
          "iam_instance_profile": "provisioning",
          "enhanced_networking": "{{user `enhanced_networking`}}",
          "type": "amazon-instance",
          "tags": {
            "environment": "test",
            "version": "0.1.3",
            "application": "mysql",
            "ancestor": "ami-12345678"
          },
          "ami_description": "store=amazon-instance,ancestor_id=ami-12345678,version=0.1.3,env=test,app=mysql,.. , …"
        }
      ]
    }

  22. Testing
    Image Test Framework
    • RSpec tests for your servers
    • JUnit output (test status via Jenkins)
    • Tests run after the image build completes, but before packaging + upload
    require 'spec_helper'

    describe file('/usr/sbin/policy-rc.d') do
      it { should_not be_file }
    end

    describe file('/root/.ssh/id_rsa') do
      it { should_not be_file }
    end
    describe file('/root/.ssh/id_rsa.pub') do
      it { should_not be_file }
    end

    describe file('/etc/timezone') do
      it { should be_file }
      it { should contain 'Etc/UTC' }
    end
    describe service("nslcd") do
      it { should be_running }
    end

    describe command('getent passwd jeremy') do
      its(:exit_status) { should eq 0 }
    end

    describe command('ntpq -pn|grep -E "^\*"') do
      its(:exit_status) { should eq 0 }
    end
    [Slide columns: Integration Tests, Acceptance Tests]

  23. CloudInit
    Initialization routines
    • Some things cannot be baked into an image.
    CloudInit handles these cases for us at runtime
    • Mounting and formatting ephemeral volumes
    (RAID / JBOD)
    • Registering DNS entries, calling APIs
    • Performing an optional ‘delta’ Puppet run in
    ‘blocking’ or ‘non-blocking’ mode.
    • SSH Key management
    • Hostnames / UUID generation
    • Locking for once only, per boot, per instance
    semantics. No need to re-register DNS on a
    reboot
    Runtime Configuration
    [Diagram: boot-time steps: Disks, RAID, Logs, EBS, /etc/fstab, Hostname (uuid), SSH Keys (root), Puppet Delta, DNS (Route53)]
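The once only / per boot / per instance locking can be illustrated with marker files. This is a simplified sketch of the idea, not CloudInit's actual implementation; the function name, the marker naming scheme, and the way instance and boot IDs are passed in are assumptions made for the example.

```python
import os
import tempfile


def run_with_frequency(name, frequency, action, state_dir, instance_id, boot_id):
    """Run `action` unless a marker file shows it already ran in this scope.

    Returns True if the action ran, False if it was skipped.
    """
    if frequency == "once":
        key = name
    elif frequency == "per-instance":
        key = f"{name}.{instance_id}"
    elif frequency == "per-boot":
        key = f"{name}.{instance_id}.{boot_id}"
    else:
        raise ValueError(f"unknown frequency: {frequency}")
    marker = os.path.join(state_dir, key)
    if os.path.exists(marker):
        return False
    action()
    # Write the marker only after the action succeeded, so failures are retried.
    with open(marker, "w"):
        pass
    return True


if __name__ == "__main__":
    calls = []
    with tempfile.TemporaryDirectory() as state:
        register = lambda: calls.append("dns")
        run_with_frequency("register_dns", "per-instance", register, state, "i-abc", "boot-1")
        # A reboot changes the boot ID but not the instance ID, so this call is skipped.
        run_with_frequency("register_dns", "per-instance", register, state, "i-abc", "boot-2")
    print(calls)
```

Keying the marker on the instance ID is what lets DNS registration survive a reboot without repeating, while per-boot work such as assembling ephemeral disks still runs every time.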


  24. Puppet
    Fry based provisioner
    • We use Puppet to manage configuration for
    instances after launch
    • Also used during the baking process to have a
    repeatable, deterministic system to configure
    machine images
    • Be aware of dynamic, fact-driven templates. The
    instance that bakes the image will not be the same one
    that runs the image
    • These types of configuration have to be
    rendered at runtime, or mitigated
    • We use CloudInit for these types of challenges
    Repeatable builds

  25. Build Times
    Smaller is better
    Build time in minutes (numbers will vary)
    - amazon-instance: 10
    - amazon-ebs: 6
    - amazon-chroot: 4
    • Each instance type can have a 1-2 minute boot
    time. Instance-store AMI not cached is slower
    • Create / Upload of the artifact to S3 is the slowest
    part of the instance-store builder (5-6 mins)
    • The EBS builder uses snapshots, which do not
    involve S3 uploading
    • Chroot has the downside of having instances
    always running (with the same OS / kernel type) to
    create images
    • Build times can be largely dominated by Puppet
    times. Delta builds can shave many minutes off
    creation time


  26. Image Registry
    managing artifacts


  27. Image Metadata
    Image Attributes
    • Versioning
    • Application Name
    • Environment Name
    • Ancestor Image ID
    • Owner Name
    EC2 Attributes
    • Volume Type (instance-store / ebs)
    • Virtualization Type (HVM / PV)
    • Has an API for use in code to find AMI IDs
    Supporting registry + search
    curl "https://images/api/v1?application=myapp&release=precise&virtualization_type=hvm&environment=prod&instance_type=instance-store"
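The same query can be built from code rather than by hand. A small sketch, assuming only the URL shape shown in the curl example; the host `images` is the slide's internal placeholder, the helper name is hypothetical, and the response format is not shown in the slides, so this stops at building the URL.

```python
from urllib.parse import urlencode

# Base URL copied from the slide's curl example (an internal placeholder host).
REGISTRY_URL = "https://images/api/v1"


def build_image_query(**attributes):
    """Build a registry search URL from image attributes, sorted for stable output."""
    return f"{REGISTRY_URL}?{urlencode(sorted(attributes.items()))}"


if __name__ == "__main__":
    print(build_image_query(
        application="myapp",
        release="precise",
        virtualization_type="hvm",
        environment="prod",
    ))
```

Using `urlencode` keeps attribute values with spaces or special characters from corrupting the query string.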


  28. Automation
    Image Janitor
    • Rules engine
    - ID not referenced in a Launch Configuration
    - Last launch date
    - Last referenced as an ancestor
    - How many images to keep, how long to keep
    them, etc.
    Security
    • Events such as OpenSSL vulnerability
    • Package metadata to find contents of images for
    patch management
    Utilities for operational excellence
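The janitor rules listed above could be combined along these lines. This is a sketch under assumptions: the `ImageRecord` fields, the 90-day idle window, and the keep-newest count are hypothetical values chosen for illustration, not Pinterest's actual policy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ImageRecord:
    ami_id: str
    created: datetime
    in_launch_config: bool
    last_launched: Optional[datetime]
    last_used_as_ancestor: Optional[datetime]


def is_expired(image, now, max_idle=timedelta(days=90)):
    """An image is a cleanup candidate only if every rule agrees."""
    if image.in_launch_config:
        return False
    for last_used in (image.last_launched, image.last_used_as_ancestor):
        if last_used is not None and now - last_used < max_idle:
            return False
    return now - image.created >= max_idle


def select_for_cleanup(images, now, keep_newest=3):
    """Always keep the newest `keep_newest` images, then apply the rules."""
    by_age = sorted(images, key=lambda i: i.created, reverse=True)
    return [i.ami_id for i in by_age[keep_newest:] if is_expired(i, now)]
```

Keeping each rule as a separate early return makes it easy to add further checks, such as a security hold, without touching the others.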

  29. EC2 as a Registry
    Simple Registry
    • If you do not have your own cloud inventory
    tracking tool, you can use EC2 tags as a simple
    solution
    • Only 10 tags per image, with a limit of 255
    characters, so it is very basic
    • You can use this to help drive a simple janitor: last
    time an instance was launched with an AMI ID, etc.
    • Example code here shows using the Boto
    Python library as a filter for images with tags
    specified
    Cloud Inventory
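The example code the slide refers to is not part of this transcript. As a stand-in, here is a minimal sketch that uses boto3, the successor to the Boto library named on the slide, which is a substitution on my part. The helper names are hypothetical, and the tag names are borrowed from the Packer manifest shown earlier in the deck.

```python
def build_tag_filters(tags):
    """Translate {"application": "mysql"} into EC2 describe_images filters."""
    return [
        {"Name": f"tag:{key}", "Values": [value]}
        for key, value in sorted(tags.items())
    ]


def find_images(tags, region="us-east-1"):
    """Return AMI IDs owned by this account that carry all of the given tags."""
    import boto3  # imported lazily so the filter helper works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.describe_images(Owners=["self"], Filters=build_tag_filters(tags))
    return [image["ImageId"] for image in response["Images"]]


if __name__ == "__main__":
    print(build_tag_filters({"application": "mysql", "environment": "test"}))
```

Calling `find_images({"application": "mysql", "environment": "test"})` requires AWS credentials and network access, which is why the filter construction is kept in its own function.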

  30. Summary
    • If you operate in the cloud, you will need a ‘bakery’ of some sort unless you want to depend on
    public images, even if you do not do immutable infrastructure for autoscaling.
    • An Image Registry for discovery and management is a key enabler for this technology. Metadata for images
    is part of the system.
    • Use Infrastructure as Code to drive your build system, using CM systems to create reproducible / testable
    artifacts. Testing is part of the pipeline.
    • Packer creates a framework which you can tool around to craft a system. Here is ours as an example. Not
    the only way, but one that we have been successful with.
    • Examples & Tools - https://github.com/phobos182/packer_build_tools/
    Thanks for listening

  31. © Copyright, All Rights Reserved Pinterest Inc. 2015
