How Netflix Gives All Its Engineers SSH Access To Instances Running In Production

One of the ways Netflix enables engineering velocity is with the Freedom and Responsibility culture that empowers individuals with the freedom to do what is needed to get the job done. As a result, the security teams at Netflix focus on reducing developer friction, making it easy to do the right thing, and then rely on auditing, automated analysis, and alerting to keep things safe. This talk begins with a review of a few approaches used in the industry to secure SSH bastions (aka jumpboxes), and evaluates them through the lens of our Netflix security culture.

With the industry norms as the backdrop, we’ll explain why Netflix decided it needed to build something new to enhance SSH bastion security. We needed something that was low friction for engineers, but would allow for additional security features to be added in behind the scenes.

We’ll review our SSH bastion architecture, which at its core uses SSO to authenticate engineers and then issues per-user, short-lived certificates that authenticate SSH connections from the bastion to an instance. Because these credentials are short-lived, the risk associated with them being lost is reduced. We’ll cover how this approach allows us to audit and automatically alert after the fact, instead of slowing down engineers before granting access.
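
To make the value of a short lifetime concrete, here is a minimal Python sketch of a certificate validity window. The class and function names are hypothetical, and the 3 minute lifetime and 1 minute clock-skew allowance are assumptions chosen only so the result matches the 14:30 to 14:34 window shown on the certificate slides; they are not settings stated in the talk.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ValidityWindow:
    """Hypothetical stand-in for the validity interval of an SSH certificate."""
    valid_after: datetime
    valid_before: datetime

    def accepts(self, when: datetime) -> bool:
        # Usable from valid_after up to, but not including, valid_before.
        return self.valid_after <= when < self.valid_before


def issue_window(requested_at: datetime,
                 lifetime: timedelta = timedelta(minutes=3),
                 clock_skew: timedelta = timedelta(minutes=1)) -> ValidityWindow:
    """Build a short window around the request time.

    The default lifetime and clock-skew allowance are illustrative
    assumptions, not values taken from the talk.
    """
    return ValidityWindow(
        valid_after=requested_at - clock_skew,
        valid_before=requested_at + lifetime,
    )


if __name__ == "__main__":
    requested = datetime(2016, 5, 19, 14, 31, 0)
    window = issue_window(requested)
    print(window.valid_after.isoformat(), "to", window.valid_before.isoformat())
    # Authenticating one minute after the request falls inside the window.
    print(window.accepts(requested + timedelta(minutes=1)))
    # A copy of the certificate replayed an hour later falls outside it.
    print(window.accepts(requested + timedelta(hours=1)))
```

The window only has to cover the moment of authentication, so a certificate that leaks after that point is of little use to whoever finds it.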

Lastly, we’ll present the SSH Certificate Authority at the core of this system. It runs as an Amazon Web Services Lambda function, and protects its private key with AWS’s Key Management Service. By relying only on AWS services, the SSH Certificate Authority is easy to bring up, and can be used to bootstrap Netflix’s cloud deployments without adding circular dependencies. Additionally, Netflix announced the open sourcing of BLESS, the Bastion's Lambda Ephemeral Ssh Service.
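
Before a certificate authority signs anything, it has to decide what goes into the certificate. The Python sketch below assembles those fields from a request, using the field names that appear on the certificate and auth.log slides (principals, source-address, and a Key ID carrying for/from/command/valid_to). It is an illustration only: the function name, the username pattern, and the dictionary layout are assumptions and are not the BLESS API, and the signing step itself (which the talk says is protected by KMS) is deliberately left out.

```python
import ipaddress
import re
from datetime import datetime, timedelta

# Hypothetical policy for acceptable account names; an assumption for this sketch.
USERNAME_PATTERN = re.compile(r"^[a-z_][a-z0-9_-]{0,31}$")


def build_certificate_fields(developer, remote_username, bastion_ip, user_ip,
                             command, requested_at,
                             lifetime=timedelta(minutes=4)):
    """Validate a request and return the fields a CA would go on to sign."""
    for name in (developer, remote_username):
        if not USERNAME_PATTERN.match(name):
            raise ValueError("invalid username: %r" % name)
    # ip_address() raises ValueError on anything that is not an IP address.
    ipaddress.ip_address(bastion_ip)
    ipaddress.ip_address(user_ip)
    if "[" in command or "]" in command:
        # Keep the bracketed Key ID unambiguous for later log parsing.
        raise ValueError("command must not contain brackets")

    valid_before = requested_at + lifetime
    # sshd logs the Key ID at authentication time, so context placed here
    # ends up in the target instance's auth log.
    key_id = "for[{}] from[{}] command[{}] valid_to[{}]".format(
        developer, user_ip, command,
        valid_before.strftime("%Y/%m/%d %H:%M:%S"))
    return {
        "type": "user",
        "key_id": key_id,
        "principals": [remote_username],
        "valid_after": requested_at,
        "valid_before": valid_before,
        # The certificate is only accepted when presented from the bastion.
        "critical_options": {"source-address": bastion_ip},
    }


if __name__ == "__main__":
    fields = build_certificate_fields(
        "developer_username", "host_username", "192.168.1.1", "10.0.1.1",
        "ssh host_username@192.168.2.1", datetime(2016, 5, 19, 14, 30, 0))
    for key, value in fields.items():
        print(key, "=", value)
```

Rejecting malformed input before signing matters here because everything in the certificate is trusted by every instance that trusts the CA.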

Russell Lewis

May 19, 2016

Transcript

  1. How Netflix Gives All Its Engineers SSH Access To Instances Running In Production

    Russell Lewis, Product and Application Security Team
  2. Dangers Of A Stolen SSH Key

    • Insider stole funds
    • Changed all their keys
    • Rebuilt their whole infrastructure on a new cloud provider
    • Hacked again
    • Server logs had been wiped
    • One stolen SSH key enabled subsequent thefts, from an installed backdoor

    Shapeshift, a cryptocurrency exchange startup, was hacked three times in a month. http://bit.ly/1T23WJG
  3. Table of Contents 1. What Do We Need? 2. Industry

    Practices 3. The Netflix Approach 4. Detecting Undesirable SSH Usage 5. How Do We Issue SSH Certificates?
  4. SSH Key Management

    • Private keys can get lost, stolen, or shared
    • Replacing a key means updating all your servers
    • Can you easily update every server?
    • Do you have an up to date inventory of all enabled SSH keys in your organization?
    • How do you know if there isn’t a backdoor key?
    • Do you rely on SSH to update all of those servers?

    “Many organizations don't even know how many SSH keys they have configured to grant access to their information systems or who has copies of those keys” - NISTIR 7966
  5. How Can You Limit That Danger?

  6. What About Single Use SSH Keys?

  7. What If It Left Great Clues Behind?

  8. How Can We Protect Server Access?

  9. Is Anybody Doing This? Will their solution work for us?

  10. Freedom & Responsibility • Share information openly and proactively •

    Context, not control • You build it, you run it How can we work like a startup, but with all the responsibilities of a big company?
  11. So How Do We Secure Things? • Feature, not friction

    • Define secure defaults • Automated scanning • Secret management • Alerting and reports
  12. How Should We Secure SSH?

  13. 1. What Do We Need? 2. Industry Practices 3. The Netflix

    Approach 4. Detecting Undesirable SSH Usage 5. How Do We Issue SSH Certificates?
  14. SSH Key Protection • How can you protect your private

    keys? • Hardware SSH Key Protection • How can you verify they are protected?
  15. Traditional Access Operator 2 App A Instances App B Instances

    App C Instances Operator 3 Operator 1
  16. Bastion Access Operator 2 Bastion App A Instances App B

    Instances App C Instances Operator 3 Operator 1
  17. 1. What Do We Need? 2. Industry Practices 3. The Netflix

    Approach 4. Detecting Undesirable SSH Usage 5. How Do We Issue SSH Certificates?
  18. System Objective • Give developers freedom to access SSH •

    Gather context without friction • Scan for undesirable SSH usage
  19. Bastions • Choke point on the network • Simplify Authentication,

    using SSO • Log all activity • Users self manage Public Keys • One system to manage for server access tools
  20. 2FA • Verify intended use • Second factor already provisioned

    • PAM modules for most 2FA solutions • Discourages undesirable automated workflows • SSH Multiplexing to reduce additional challenges
  21. SSHD Config • Secured by default SSHD config in BaseAMI

    • Out of the box, just works • Tooling in place to track deployed BaseAMIs
  22. What About Single Use SSH Keys?

  23. BLESS: Single Serving SSH Certificates

  24. Type: ssh-rsa-cert-v01@openssh.com user certificate

    Public key: RSA-CERT SHA256:BLAH
    Signing CA: RSA SHA256:BLAH
    Key ID: "Any ID information you want"
    Serial: 0
    Valid: from 2016-05-19T14:30:00 to 2016-05-19T14:34:00
    Principals: host_username
    Critical Options: source-address 192.168.1.1 force-command /bin/date
    Extensions: permit-X11-forwarding permit-agent-forwarding permit-port-forwarding permit-pty permit-user-rc

    User or Host certificates
  25. Type: ssh-rsa-cert-v01@openssh.com user certificate

    Public key: RSA-CERT SHA256:BLAH
    Signing CA: RSA SHA256:BLAH
    Key ID: "Any ID information you want"
    Serial: 0
    Valid: from 2016-05-19T14:30:00 to 2016-05-19T14:34:00
    Principals: host_username
    Critical Options: source-address 192.168.1.1 force-command /bin/date
    Extensions: permit-X11-forwarding permit-agent-forwarding permit-port-forwarding permit-pty permit-user-rc

    Control over what is logged by sshd
  26. Type: ssh-rsa-cert-v01@openssh.com user certificate

    Public key: RSA-CERT SHA256:BLAH
    Signing CA: RSA SHA256:BLAH
    Key ID: "Any ID information you want"
    Serial: 0
    Valid: from 2016-05-19T14:30:00 to 2016-05-19T14:34:00
    Principals: host_username
    Critical Options: source-address 192.168.1.1 force-command /bin/date
    Extensions: permit-X11-forwarding permit-agent-forwarding permit-port-forwarding permit-pty permit-user-rc

    If you can issue an SSH Certificate for every SSH request, you only need the certificate to be valid during session authentication. Sessions stay established after the certificates expire.
  27. Type: ssh-rsa-cert-v01@openssh.com user certificate

    Public key: RSA-CERT SHA256:BLAH
    Signing CA: RSA SHA256:BLAH
    Key ID: "Any ID information you want"
    Serial: 0
    Valid: from 2016-05-19T14:30:00 to 2016-05-19T14:34:00
    Principals: host_username
    Critical Options: source-address 192.168.1.1 force-command /bin/date
    Extensions: permit-X11-forwarding permit-agent-forwarding permit-port-forwarding permit-pty permit-user-rc

    Valid for a single target, no matter how you define that target.

    • Account
    • Application
    • Username
    • Instance
    • Define and Authorize as you see fit
  28. Type: ssh-rsa-cert-v01@openssh.com user certificate

    Public key: RSA-CERT SHA256:BLAH
    Signing CA: RSA SHA256:BLAH
    Key ID: "Any ID information you want"
    Serial: 0
    Valid: from 2016-05-19T14:30:00 to 2016-05-19T14:34:00
    Principals: host_username
    Critical Options: source-address 192.168.1.1 force-command /bin/date
    Extensions: permit-X11-forwarding permit-agent-forwarding permit-port-forwarding permit-pty permit-user-rc

    Valid from a single host
  29. Type: ssh-rsa-cert-v01@openssh.com user certificate

    Public key: RSA-CERT SHA256:BLAH
    Signing CA: RSA SHA256:BLAH
    Key ID: "Any ID information you want"
    Serial: 0
    Valid: from 2016-05-19T14:30:00 to 2016-05-19T14:34:00
    Principals: host_username
    Critical Options: source-address 192.168.1.1 force-command /bin/date
    Extensions: permit-X11-forwarding permit-agent-forwarding permit-port-forwarding permit-pty permit-user-rc

    Control what the SSH session can be used for
  30. SSH Certificate Authorities • Trust CAs with SSHD Configs •

    Deploy multiple trusted CAs • Leave some offline • Rotate Regularly • Emergency Preparedness
  31. Enable BLESS

    /etc/ssh/cas.pub:
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQ…
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQ…
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQ…

    /etc/ssh/sshd_config:
    TrustedUserCAKeys /etc/ssh/cas.pub
  32. 1. What Do We Need? 2. Industry Practices 3. The Netflix

    Approach 4. Detecting Undesirable SSH Usage 5. How Do We Issue SSH Certificates?
  33. What Is Undesirable SSH Usage? • Malicious • Dangerous •

    Avoidable
  34. Detecting Undesirable SSH Usage

    With as many instances as Netflix runs, administered by all its engineers, finding undesirable SSH sessions could feel like finding a needle in a haystack.
  35. Finding A Needle In A Haystack

    The Mythbusters tested the difficulty of this problem with two very different approaches.
  36. Shapeshift’s Approach • Had to act fast • Didn’t have

    tooling in place to help with the compromise • Tore down and rebuilt everything Discard the haystack, buy more hay.
  37. Burn It Down! • Simple, in theory • Destructive •

    Still have a searching problem Burn the haystack and search the remains. Does shutting off SSH access entirely work?
  38. Can We Sort The Good From The Bad? • Not

    overly complex • Built a machine for the searching problem • Use Our Data Pipeline Built a Machine to exploit the different densities of the hay and needles.
  39. What SSH Properties Can We Exploit? Start by understanding how

    your engineers use SSH.
  40. Security Intelligence And Response Team • Process SSH Certificate requests

    • Process SSH authentication attempts • Process SSH session logs • Reports and Alerts
  41. 1. What Do We Need? 2. Industry Practices 3. The Netflix

    Approach 4. Detecting Undesirable SSH Usage 5. How Do We Issue SSH Certificates?
  42. Bastion's Lambda Ephemeral Ssh Service

  43. What is BLESS? • Python AWS Lambda Function • Constructs

    SSH Certificates • Signs SSH Certificates
  44. Why Lambda? • No circular dependencies on Netflix ecosystem •

    Easy to run in a special-purpose AWS account with the tightest controls • Separate account, separate rate limits • Bootstrap system with AWS KMS • Lambda Secured w/IAM instead of SSH key management • Auditable • Aliases let you manage deployments of new versions easily
  45. None
  46. None
  47. None
  48. None
  49. None
  50. None
  51. None
  52. SSHD Auth.log

    May 19 14:30:00 host sshd[#####]: Accepted publickey for host_username from 192.168.1.1 port ##### ssh2: RSA-CERT ID request[####################################] for[developer_username] from[10.0.1.1] command[ssh host_username@192.168.2.1] ssh_key:[RSA 00:00:00:00:de:ad:be:ef:00:00:00:00:de:ad:be:ef] ca:[arn:aws:lambda:region:account#:function:name] valid_to[YYYY/MM/DD HH:MM:SS] (serial 0) CA RSA 8b:ad:f0:0d:00:00:00:00:00:00:00:00:8b:ad:f0:0d

    May 19 14:30:00 host sshd[#####]: Accepted publickey for host_username from 192.168.1.1 port ##### ssh2: RSA de:ad:be:ef:00:00:00:00:00:00:00:00:de:ad:be:ef
  53. None
  54. Would This Have Helped Shapeshift? • 2FA on the Bastion

    • Audit Logs locked away • Key Rotations all the way down
  55. Go Use SSH Certificates! https://github.com/Netflix/bless

  56. Questions? RussellL@netflix.com
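
The two kinds of auth.log entry on slide 52 (one authenticated with a certificate, one with a bare key) are the raw material for the detection work described in section 4. The following Python sketch separates them and pulls the bracketed context out of the certificate's Key ID. It assumes the log layout shown on that slide; the regular expressions, function name, and returned dictionary are illustrative assumptions, not Netflix's pipeline, and other sshd versions format this line differently.

```python
import re

# Matches the "Accepted publickey" layout shown on the auth.log slide.
ACCEPTED = re.compile(
    r"Accepted publickey for (?P<account>\S+) from (?P<source>\S+) "
    r"port \S+ ssh2: (?P<key_type>\S+)(?: (?P<rest>.*))?"
)
# Key ID context is written as name[value], optionally as name:[value].
FIELD = re.compile(r"(?P<name>[a-z_]+):?\s?\[(?P<value>[^\]]*)\]")


def classify(line):
    """Return details of an accepted login, or None for unrelated lines."""
    match = ACCEPTED.search(line)
    if match is None:
        return None
    is_certificate = match.group("key_type").endswith("-CERT")
    context = {}
    if is_certificate:
        rest = match.group("rest") or ""
        context = {m.group("name"): m.group("value")
                   for m in FIELD.finditer(rest)}
    return {
        "account": match.group("account"),
        "source": match.group("source"),
        "certificate": is_certificate,
        "context": context,
    }


if __name__ == "__main__":
    samples = [
        "May 19 14:30:00 host sshd[1234]: Accepted publickey for host_username "
        "from 192.168.1.1 port 50000 ssh2: RSA-CERT ID request[abc] "
        "for[developer_username] from[10.0.1.1] "
        "command[ssh host_username@192.168.2.1] valid_to[2016/05/19 14:34:00] "
        "(serial 0) CA RSA 8b:ad:f0:0d",
        "May 19 14:30:00 host sshd[1235]: Accepted publickey for host_username "
        "from 192.168.1.1 port 50001 ssh2: RSA de:ad:be:ef",
    ]
    for sample in samples:
        result = classify(sample)
        if result["certificate"]:
            print("certificate login by", result["context"].get("for"))
        else:
            # A bare key carries no request context and deserves a closer look.
            print("bare key login to", result["account"])
```

A login that arrives without a certificate has none of the request context attached, which makes it a natural candidate for a report or alert.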