How Netflix Gives All Its Engineers SSH Access To Instances Running In Production

One of the ways Netflix enables engineering velocity is with the Freedom and Responsibility culture that empowers individuals with the freedom to do what is needed to get the job done. As a result, the security teams at Netflix focus on reducing developer friction, making it easy to do the right thing, and then rely on auditing, automated analysis, and alerting to keep things safe. This talk begins with a review of a few approaches used in the industry to secure SSH bastions (aka jumpboxes), and evaluates them through the lens of our Netflix security culture.

With the industry norms as the backdrop, we’ll explain why Netflix decided it needed to build something new to enhance SSH bastion security. We needed something that was low friction for engineers, but would allow for additional security features to be added in behind the scenes.

We’ll review our SSH bastion architecture, which at its core uses SSO to authenticate engineers, and then issues per-user credentials with short-lived certificates for SSH authentication of the bastion to an instance. These short-lived credentials reduce the risk associated with them being lost. We’ll cover how this approach allows us to audit and automatically alert after the fact, instead of slowing down engineers before granting access.

Lastly, we’ll present the SSH Certificate Authority at the core of this system. It runs as an Amazon Web Services Lambda function, and protects its private key with AWS’s Key Management Service. By relying only on AWS services, the SSH Certificate Authority is easy to bring up, and can be used to bootstrap Netflix’s cloud deployments without adding circular dependencies. Additionally, Netflix announced the open sourcing of BLESS, the Bastion's Lambda Ephemeral SSH Service.

Russell Lewis

May 19, 2016

Transcript

  1. How Netflix Gives All Its Engineers SSH
    Access To Instances Running In Production
    Russell Lewis Product and Application Security Team

  2. Dangers Of A Stolen SSH Key
    • Insider stole funds
    • Changed all their keys
    • Rebuilt their whole infrastructure on a new cloud provider
    • Hacked again
    • Server logs had been wiped
    • One stolen SSH key enabled subsequent thefts, from an installed backdoor
    Shapeshift, a cryptocurrency exchange startup, was hacked three times in a month.
    http://bit.ly/1T23WJG

  3. Table of Contents
    1. What Do We Need?
    2. Industry Practices
    3. The Netflix Approach
    4. Detecting Undesirable SSH Usage
    5. How Do We Issue SSH Certificates?

  4. SSH Key Management
    • Private keys can get lost, stolen, or shared
    • Replacing a key means updating all your servers
    • Can you easily update every server?
    • Do you have an up to date inventory of all enabled SSH keys in your organization?
    • How do you know if there isn’t a backdoor key?
    • Do you rely on SSH to update all of those servers?
    “Many organizations don't even know how many SSH keys they have configured to grant access to their information systems or who has copies of those keys” - NISTIR 7966
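
One way to start the inventory the slide asks about is to fingerprint every entry in each `authorized_keys` file. This is a minimal sketch, assuming entries in the usual OpenSSH one-line format; the function names are hypothetical, and quoted options that contain spaces are not handled.

```python
import base64
import hashlib


def fingerprint(key_b64: str) -> str:
    """Return an OpenSSH-style SHA256 fingerprint for a base64 key blob."""
    digest = hashlib.sha256(base64.b64decode(key_b64)).digest()
    # OpenSSH prints the digest as base64 without the trailing padding.
    return "SHA256:" + base64.b64encode(digest).decode().rstrip("=")


def inventory(authorized_keys_text: str) -> dict:
    """Map fingerprint -> comment for each key line, skipping blanks and comments."""
    found = {}
    for line in authorized_keys_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # The key-type field may be preceded by an options field.
        for i, field in enumerate(fields):
            if field.startswith(("ssh-", "ecdsa-")) and i + 1 < len(fields):
                found[fingerprint(fields[i + 1])] = " ".join(fields[i + 2:])
                break
    return found
```

Running `inventory` over the files collected from every host and merging the results gives a list of enabled keys that can be compared against the keys you expect to see.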

  5. How Can You Limit
    That Danger?

  6. What About Single
    Use SSH Keys?

  7. What If It Left Great
    Clues Behind?

  8. How Can We Protect
    Server Access?

  9. Is Anybody
    Doing This?
    Will their solution work for us?

  10. Freedom &
    Responsibility
    • Share information openly and proactively
    • Context, not control
    • You build it, you run it
    How can we work like a startup, but with all
    the responsibilities of a big company?

  11. So How Do We
    Secure Things?
    • Feature, not friction
    • Define secure defaults
    • Automated scanning
    • Secret management
    • Alerting and reports

  12. How Should
    We Secure
    SSH?

  13. 1. What Do We Need?
    2. Industry Practices
    3. The Netflix Approach
    4. Detecting Undesirable SSH Usage
    5. How Do We Issue SSH Certificates?

  14. SSH Key
    Protection
    • How can you protect your private keys?
    • Hardware SSH Key Protection
    • How can you verify they are protected?

  15. Traditional Access
    [Diagram: Operator 1, Operator 2, Operator 3; App A Instances, App B Instances, App C Instances]

  16. Bastion Access
    [Diagram: Operator 1, Operator 2, Operator 3; Bastion; App A Instances, App B Instances, App C Instances]

  17. 1. What Do We Need?
    2. Industry Practices
    3. The Netflix Approach
    4. Detecting Undesirable SSH Usage
    5. How Do We Issue SSH Certificates?

  18. System
    Objective
    • Give developers freedom to access SSH
    • Gather context without friction
    • Scan for undesirable SSH usage

  19. Bastions
    • Choke point on the network
    • Simplify Authentication, using SSO
    • Log all activity
    • Users self-manage public keys
    • One system to manage for server access
    tools

  20. 2FA
    • Verify intended use
    • Second factor already provisioned
    • PAM modules for most 2FA solutions
    • Discourages undesirable automated
    workflows
    • SSH Multiplexing to reduce additional
    challenges

  21. SSHD Config
    • Secured by default SSHD config in BaseAMI
    • Out of the box, just works
    • Tooling in place to track deployed BaseAMIs

  22. What About Single
    Use SSH Keys?

  23. BLESS:
    Single Serving
    SSH Certificates

  24. Type: ssh-rsa-cert-v01@openssh.com user certificate
    Public key: RSA-CERT SHA256:BLAH
    Signing CA: RSA SHA256:BLAH
    Key ID: "Any ID information you want"
    Serial: 0
    Valid: from 2016-05-19T14:30:00 to 2016-05-19T14:34:00
    Principals:
    host_username
    Critical Options:
    source-address 192.168.1.1
    force-command /bin/date
    Extensions:
    permit-X11-forwarding
    permit-agent-forwarding
    permit-port-forwarding
    permit-pty
    permit-user-rc
    User or Host certificates
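
The listing on this slide resembles what `ssh-keygen -L` prints for a certificate. As a rough sketch, such a listing can be turned into a dictionary for scripting. The layout assumed here (scalar `Label: value` lines followed by three list sections) is taken from the slide, and `parse_cert_listing` is a hypothetical helper, not part of any tool.

```python
LIST_SECTIONS = ("Principals", "Critical Options", "Extensions")

SAMPLE = """\
Type: ssh-rsa-cert-v01@openssh.com user certificate
Public key: RSA-CERT SHA256:BLAH
Signing CA: RSA SHA256:BLAH
Key ID: "Any ID information you want"
Serial: 0
Valid: from 2016-05-19T14:30:00 to 2016-05-19T14:34:00
Principals:
    host_username
Critical Options:
    source-address 192.168.1.1
    force-command /bin/date
Extensions:
    permit-X11-forwarding
    permit-agent-forwarding
    permit-port-forwarding
    permit-pty
    permit-user-rc
"""


def parse_cert_listing(text: str) -> dict:
    """Parse a certificate listing into scalar fields and list sections."""
    cert = {}
    current_list = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        label, sep, value = line.partition(":")
        if sep and label in LIST_SECTIONS:
            # A section header starts a new list; later lines are its items.
            current_list = cert.setdefault(label, [])
        elif current_list is not None:
            current_list.append(line)
        elif sep:
            cert[label] = value.strip()
    return cert
```

With the sample above, `parse_cert_listing(SAMPLE)["Principals"]` is `["host_username"]`, and the two critical options come back as separate list items.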

  25. Type: ssh-rsa-cert-v01@openssh.com user certificate
    Public key: RSA-CERT SHA256:BLAH
    Signing CA: RSA SHA256:BLAH
    Key ID: "Any ID information you want"
    Serial: 0
    Valid: from 2016-05-19T14:30:00 to 2016-05-19T14:34:00
    Principals:
    host_username
    Critical Options:
    source-address 192.168.1.1
    force-command /bin/date
    Extensions:
    permit-X11-forwarding
    permit-agent-forwarding
    permit-port-forwarding
    permit-pty
    permit-user-rc
    Control over what is logged by sshd

  26. Type: ssh-rsa-cert-v01@openssh.com user certificate
    Public key: RSA-CERT SHA256:BLAH
    Signing CA: RSA SHA256:BLAH
    Key ID: "Any ID information you want"
    Serial: 0
    Valid: from 2016-05-19T14:30:00 to 2016-05-19T14:34:00
    Principals:
    host_username
    Critical Options:
    source-address 192.168.1.1
    force-command /bin/date
    Extensions:
    permit-X11-forwarding
    permit-agent-forwarding
    permit-port-forwarding
    permit-pty
    permit-user-rc
    If you can issue an SSH Certificate for every
    SSH request, you only need the certificate to
    be valid during session authentication.
    Sessions stay established after the
    certificates expire.
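
The four-minute window in the `Valid:` line can be checked mechanically. The sketch below parses that line and tests whether a given moment falls inside it. Treating the end time as exclusive is an assumption about how the server compares timestamps, and both function names are hypothetical.

```python
from datetime import datetime


def parse_validity(valid: str):
    """Parse 'from <start> to <end>' as shown in the certificate listing."""
    words = valid.split()
    # Expected shape: ["from", start, "to", end]
    if len(words) != 4 or words[0] != "from" or words[2] != "to":
        raise ValueError(f"unrecognized validity string: {valid!r}")
    return datetime.fromisoformat(words[1]), datetime.fromisoformat(words[3])


def valid_at(valid: str, when: datetime) -> bool:
    """True if `when` falls inside the window (end treated as exclusive)."""
    start, end = parse_validity(valid)
    return start <= when < end
```

Only the moment of authentication has to satisfy `valid_at`; as the slide notes, an established session is not torn down when the window closes.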

  27. Type: ssh-rsa-cert-v01@openssh.com user certificate
    Public key: RSA-CERT SHA256:BLAH
    Signing CA: RSA SHA256:BLAH
    Key ID: "Any ID information you want"
    Serial: 0
    Valid: from 2016-05-19T14:30:00 to 2016-05-19T14:34:00
    Principals:
    host_username
    Critical Options:
    source-address 192.168.1.1
    force-command /bin/date
    Extensions:
    permit-X11-forwarding
    permit-agent-forwarding
    permit-port-forwarding
    permit-pty
    permit-user-rc
    Valid for a single target, no matter how you
    define that target.
    • Account
    • Application
    • Username
    • Instance
    • Define and Authorize as you see fit

  28. Type: ssh-rsa-cert-v01@openssh.com user certificate
    Public key: RSA-CERT SHA256:BLAH
    Signing CA: RSA SHA256:BLAH
    Key ID: "Any ID information you want"
    Serial: 0
    Valid: from 2016-05-19T14:30:00 to 2016-05-19T14:34:00
    Principals:
    host_username
    Critical Options:
    source-address 192.168.1.1
    force-command /bin/date
    Extensions:
    permit-X11-forwarding
    permit-agent-forwarding
    permit-port-forwarding
    permit-pty
    permit-user-rc
    Valid from a single host
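
The `source-address` critical option is what ties a certificate to one client host. sshd enforces it on its own; the sketch below only illustrates the rule, assuming the option value is a comma-separated list of addresses or CIDR ranges. `source_allowed` is a hypothetical name.

```python
import ipaddress


def source_allowed(source_address_option: str, client_ip: str) -> bool:
    """True if client_ip matches any address or CIDR range in the option."""
    client = ipaddress.ip_address(client_ip)
    for entry in source_address_option.split(","):
        # strict=False lets a bare host address act as a one-address network.
        network = ipaddress.ip_network(entry.strip(), strict=False)
        if client in network:
            return True
    return False
```

With the value from the slide, only a connection arriving from `192.168.1.1` (the bastion, in this design) would pass the check.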

  29. Type: ssh-rsa-cert-v01@openssh.com user certificate
    Public key: RSA-CERT SHA256:BLAH
    Signing CA: RSA SHA256:BLAH
    Key ID: "Any ID information you want"
    Serial: 0
    Valid: from 2016-05-19T14:30:00 to 2016-05-19T14:34:00
    Principals:
    host_username
    Critical Options:
    source-address 192.168.1.1
    force-command /bin/date
    Extensions:
    permit-X11-forwarding
    permit-agent-forwarding
    permit-port-forwarding
    permit-pty
    permit-user-rc
    Control what the SSH session can be used for

  30. SSH Certificate
    Authorities
    • Trust CAs with SSHD Configs
    • Deploy multiple trusted CAs
    • Leave some offline
    • Rotate Regularly
    • Emergency Preparedness

  31. Enable BLESS
    /etc/ssh/cas.pub:
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQ…
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQ…
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQ…
    /etc/ssh/sshd_config :
    TrustedUserCAKeys /etc/ssh/cas.pub
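
A fleet-wide check that this setting is really in place could look roughly like the following. It is a sketch that works on file contents passed in as strings; the helper names are hypothetical, `Match` blocks are ignored, and anything after a `#` is treated as a comment, which is a simplification.

```python
def trusted_ca_file(sshd_config_text: str):
    """Return the path given to TrustedUserCAKeys, or None if it is not set."""
    for line in sshd_config_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        # sshd_config keywords are case-insensitive; the first value wins.
        if len(parts) == 2 and parts[0].lower() == "trustedusercakeys":
            return parts[1].strip()
    return None


def count_ca_keys(cas_pub_text: str) -> int:
    """Count the non-blank, non-comment lines in a CA public key file."""
    return sum(
        1
        for line in cas_pub_text.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )
```

For the configuration on the slide, `trusted_ca_file` would return `/etc/ssh/cas.pub` and `count_ca_keys` would report three trusted CAs.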

  32. 1. What Do We Need?
    2. Industry Practices
    3. The Netflix Approach
    4. Detecting Undesirable SSH Usage
    5. How Do We Issue SSH Certificates?

  33. What Is
    Undesirable
    SSH Usage?
    • Malicious
    • Dangerous
    • Avoidable

  34. Detecting
    Undesirable
    SSH Usage
    With as many instances as Netflix runs,
    administered by all its engineers, finding
    undesirable SSH sessions could feel like
    finding a needle in a haystack.

  35. Finding A
    Needle In A
    Haystack
    The Mythbusters tested the difficulty of this
    problem with two very different approaches.

  36. Shapeshift’s
    Approach
    • Had to act fast
    • Didn’t have tooling in place to help with the
    compromise
    • Tore down and rebuilt everything
    Discard the haystack, buy more hay.

  37. Burn It Down!
    • Simple, in theory
    • Destructive
    • Still have a searching problem
    Burn the haystack and search the remains.
    Does shutting off SSH access entirely work?

  38. Can We Sort
    The Good
    From The Bad?
    • Not overly complex
    • Built a machine for the searching problem
    • Use Our Data Pipeline
    Built a Machine to exploit the different
    densities of the hay and needles.

  39. What SSH
    Properties Can
    We Exploit?
    Start by understanding how your engineers
    use SSH.

  40. Security Intelligence And Response Team
    • Process SSH Certificate requests
    • Process SSH authentication attempts
    • Process SSH session logs
    • Reports and Alerts

  41. 1. What Do We Need?
    2. Industry Practices
    3. The Netflix Approach
    4. Detecting Undesirable SSH Usage
    5. How Do We Issue SSH Certificates?

  42. Bastion's
    Lambda
    Ephemeral
    Ssh
    Service

  43. What is BLESS?
    • Python AWS Lambda Function
    • Constructs SSH Certificates
    • Signs SSH Certificates
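
BLESS is described here as constructing and signing certificates inside a Python Lambda function. The sketch below is not that code. It swaps in the stock `ssh-keygen` tool to show which fields a short-lived certificate request has to supply. The function name, parameter names, and file paths are hypothetical, and the command is only built, never executed.

```python
def signing_command(ca_key_path, user_pubkey_path, key_id, principal,
                    source_ip, lifetime_minutes):
    """Build an ssh-keygen argument list that would sign a short-lived user certificate."""
    if lifetime_minutes <= 0:
        raise ValueError("lifetime_minutes must be positive")
    return [
        "ssh-keygen",
        "-s", ca_key_path,                     # CA private key that signs the certificate
        "-I", key_id,                          # Key ID, which sshd logs at login
        "-n", principal,                       # remote username the certificate is valid for
        "-V", f"+{lifetime_minutes}m",         # valid from now until N minutes from now
        "-O", f"source-address={source_ip}",   # only usable from this client address
        user_pubkey_path,                      # public key to be certified
    ]
```

A caller could hand the list to `subprocess.run` on a machine that holds the CA key. Keeping that key off ordinary hosts is the reason the talk places the signing step in Lambda behind KMS.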

  44. Why Lambda?
    • No circular dependencies on Netflix ecosystem
    • Easy to run in a special-purpose AWS account with the tightest controls
    • Separate account, separate rate limits
    • Bootstrap system with AWS KMS
    • Lambda secured with IAM instead of SSH key management
    • Auditable
    • Aliases let you manage deployments of new versions easily

  52. SSHD Auth.log
    May 19 14:30:00 host sshd[#####]: Accepted publickey for host_username from 192.168.1.1 port ##### ssh2: RSA-CERT ID request[####################################] for[developer_username] from[10.0.1.1] command[ssh [email protected]] ssh_key:[RSA 00:00:00:00:de:ad:be:ef:00:00:00:00:de:ad:be:ef] ca:[arn:aws:lambda:region:account#:function:name] valid_to[YYYY/MM/DD HH:MM:SS] (serial 0) CA RSA 8b:ad:f0:0d:00:00:00:00:00:00:00:00:8b:ad:f0:0d
    May 19 14:30:00 host sshd[#####]: Accepted publickey for host_username from 192.168.1.1 port ##### ssh2: RSA de:ad:be:ef:00:00:00:00:00:00:00:00:de:ad:be:ef
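
The two log entries differ in a way a log pipeline can exploit: a certificate login carries the Key ID with its bracketed context fields, while a plain key login shows only a fingerprint. A minimal sketch of that split follows, assuming the line layout shown on the slide. The regular expressions and `classify_login` are hypothetical, and field values that themselves contain `]` are not handled.

```python
import re

ACCEPTED = re.compile(
    r"Accepted publickey for (?P<user>\S+) from (?P<ip>\S+) port \S+ ssh2: (?P<rest>.*)"
)
# Bracketed name[value] pairs embedded in the certificate's Key ID.
KEY_ID_FIELD = re.compile(r"\b(request|for|from|command)\[([^\]]*)\]")


def classify_login(line: str):
    """Return login details, or None if the line is not an accepted-publickey entry."""
    match = ACCEPTED.search(line)
    if match is None:
        return None
    rest = match.group("rest")
    info = {
        "user": match.group("user"),
        "ip": match.group("ip"),
        # Certificate logins report a key type ending in -CERT.
        "certificate": re.match(r"\S+-CERT\b", rest) is not None,
    }
    if info["certificate"]:
        info["key_id"] = dict(KEY_ID_FIELD.findall(rest))
    return info
```

Entries where `certificate` is false are logins that bypassed the certificate flow, which makes them natural candidates for a report or alert.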

  54. Would This
    Have Helped
    Shapeshift?
    • 2FA on the Bastion
    • Audit Logs locked away
    • Key Rotations all the way down

  55. Go Use SSH
    Certificates!
    https://github.com/Netflix/bless

  56. Questions?
    RussellL@netflix.com
