How to make your Container/Kubernetes a bit more secure

Kyle Bai

June 30, 2020

Transcript

  1. @k2r2bai How to make your Container/Kubernetes a bit more secure

    SDN x Cloud Native Meetup #29
  2. About Me

    白凱仁 (Kyle Bai)
    • Site Reliability Engineer at AMIS/MaiCoin.
    • Co-organizer of Cloud Native Taiwan User Group.
    • Interested in emerging technologies.
    • Contributor to multiple OSS projects.
    kairen, k2r2bai.com
  3. Agenda

    Today I would like to talk about the 4C's of Cloud Native security:
    • Cloud/Co-location
    • Cluster
    • Container
    • Code
  4. Misconfigured Kubeflow workloads are a security risk

    A suspicious Kubeflow image was seen deployed to thousands of clusters in April, all from a single public repository. Closer inspection showed that the image runs a common open-source cryptojacking malware, XMRig, which mines the Monero virtual currency. https://bit.ly/2NI7Q0A
  5. Cryptocurrency mining attack against Kubernetes clusters

    • The cluster owner granted cluster-admin to the system:anonymous user.
    • The cluster owner exposed the dashboard to the internet, and the attacker found it by scanning.
    https://bit.ly/3ibjZsS

    The misconfigured bindings look like this (do not run them on a real cluster):

    # Grants cluster-admin to every unauthenticated request.
    $ kubectl create clusterrolebinding open-api \
        --clusterrole=cluster-admin --user=system:anonymous
    # Grants cluster-admin to the dashboard's service account.
    $ kubectl create clusterrolebinding dashboard \
        --clusterrole=cluster-admin \
        --user=system:serviceaccount:kube-system:kubernetes-dashboard
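
Bindings like the two above can be detected from a cluster export. The following is a minimal sketch, assuming the input has the shape of `kubectl get clusterrolebindings -o json` output; the `SAMPLE` data, the `RISKY_SUBJECTS` set, and the function names are illustrative assumptions, not part of the talk.

```python
import json

# Subjects that should never hold cluster-admin. The dashboard service account
# name comes from the slide; the set itself is an assumption to adapt.
RISKY_SUBJECTS = {
    "system:anonymous",
    "system:unauthenticated",
    "system:serviceaccount:kube-system:kubernetes-dashboard",
}


def subject_name(subject):
    """Return the username form of a binding subject."""
    if subject.get("kind") == "ServiceAccount":
        return "system:serviceaccount:{}:{}".format(
            subject.get("namespace", ""), subject.get("name", "")
        )
    return subject.get("name", "")


def find_risky_bindings(binding_list):
    """Return (binding name, subject) pairs granting cluster-admin to a risky subject."""
    findings = []
    for item in binding_list.get("items", []):
        if item.get("roleRef", {}).get("name") != "cluster-admin":
            continue
        for subject in item.get("subjects") or []:
            name = subject_name(subject)
            if name in RISKY_SUBJECTS:
                findings.append((item["metadata"]["name"], name))
    return findings


# Hypothetical sample data with one bad binding and one ordinary binding.
SAMPLE = json.loads("""
{
  "items": [
    {
      "metadata": {"name": "open-api"},
      "roleRef": {"kind": "ClusterRole", "name": "cluster-admin"},
      "subjects": [{"kind": "User", "name": "system:anonymous"}]
    },
    {
      "metadata": {"name": "ops-admins"},
      "roleRef": {"kind": "ClusterRole", "name": "cluster-admin"},
      "subjects": [{"kind": "Group", "name": "ops-team"}]
    }
  ]
}
""")

print(find_risky_bindings(SAMPLE))
```

The check only looks at ClusterRoleBindings that reference `cluster-admin` by name; a custom role with equivalent wildcard rules would need a separate rule inspection.
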
  6. Cryptojacking and Crypto Mining – Tesla, Kubernetes, and Jenkins Exploits

    The RedLock Cloud Security Intelligence (CSI) team discovered that cryptocurrency mining scripts used for cryptojacking (the unauthorized use of computing power to mine cryptocurrency) were operating on Tesla's unsecured Kubernetes instances, which allowed the attackers to steal Tesla's AWS compute resources for their own profit. https://www.ithome.com.tw/news/121378
  7. CVE-2019-5736

    This runc vulnerability allows attackers to overwrite the host runc binary (and consequently obtain host root access) by leveraging the ability to execute a command as root within one of these types of containers:
    • A new container with an attacker-controlled image.
    • An existing container, to which the attacker previously had write access, that can be attached with docker exec.
    https://www.cvedetails.com/cve/CVE-2019-5736/
  8. CVE-2019-14271

    CVE-2019-14271 marks a security issue in the implementation of the docker cp command that can lead to full container escape when exploited by an attacker. https://bit.ly/2VwF6Mr https://www.anquanke.com/post/id/193218
  9. CVE-2019-10152

    A path traversal vulnerability has been discovered in Podman before version 1.4.0 in the way it handles symlinks inside containers. An attacker who has compromised an existing container can cause arbitrary files on the host filesystem to be read or written when an administrator tries to copy a file from or to the container. https://www.cvedetails.com/cve/CVE-2019-10152/
  10. Kubernetes Vulnerabilities of 2019

    • Kubernetes API DoS vulnerability (CVE-2019-1002100).
    • kubectl vulnerability (CVE-2019-1002101).
    • Kubernetes API server vulnerability (CVE-2019-11247).
    • Kubernetes billion laughs attack vulnerability (CVE-2019-11253).
    • HTTP/2 Ping Flood (CVE-2019-9512).
    • HTTP/2 Reset Flood (CVE-2019-9514).
    • ...
    https://bit.ly/2AfQmFL https://bit.ly/3iedhCr
  11. CVE-2018-1002105

    In all Kubernetes versions prior to v1.10.11, v1.11.5, and v1.12.3, incorrect handling of error responses to proxied upgrade requests in the kube-apiserver allowed specially crafted requests to establish a connection through the Kubernetes API server to backend servers, then send arbitrary requests over the same connection directly to the backend, authenticated with the Kubernetes API server's TLS credentials used to establish the backend connection. https://rancher.com/blog/2018/2018-12-04-k8s-cve/
  12. The most common issues found in cloud systems

    • Misconfiguration issues: As the number of components in cloud architectures increases, we also expect to see a rise in the number of misconfigurations.
    • Automation: Automation speeds up the creation of new systems and the deployment of new applications. However, it can also propagate errors and security issues much faster if they are not properly checked and monitored.
  13. How to avoid issues?

    • Infrastructure as code (IaC): IaC uses code to automate the provisioning of IT architectures. This eliminates manual provisioning by DevOps engineers and so minimizes oversights and human errors, as long as best practices are followed.
  14. How to avoid issues?

    • CSP's security recommendations: Follow your cloud service provider's recommendations and perform regular audits to make sure that everything is configured properly before it is deployed to production and exposed to the internet.
  15. Example for AWS Compute

    • Leverage the "at-rest" encryption that each service provides for your data. Examples are enabling S3 SSE encryption on a bucket or encrypting an RDS instance with a KMS key.
    • Ensure that operating systems are always patched and up to date, using the package manager of that operating system.
    • Subscribe to CVE feeds that let you know if something you have in production is vulnerable. CVE stands for Common Vulnerabilities and Exposures, an international database that tracks high-profile software bugs so that you can remediate them.
  16. Example for AWS Network

    • Leverage AWS VPCs and VPNs, VPC peering, or VPC endpoints so that your applications communicate securely and privately with each other and with AWS services.
    • Use Security Groups that are locked down tightly so that no traffic flows unnecessarily.
    • Enable VPC Flow Logs to record metadata about the IP traffic going to and from your network interfaces.
    • Use tools such as AWS WAF and AWS Shield to protect endpoints from commonly known attacks.
  17. Infrastructure security for Kubernetes

    • Network access to the API server (control plane): The Kubernetes control plane should not be publicly reachable from the internet. Access should be controlled by network access control lists restricted to the set of IP addresses needed to administer the cluster.
    • Control network access to the API server through a bastion instance.
    • Network access to nodes: Nodes should be configured to accept connections (via network access control lists) only from the control plane on the specified ports, and to accept connections for Kubernetes Services of type NodePort and LoadBalancer.
    • Use multiple cloud load balancers (e.g. internal and external ALBs).
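
The allowlist idea for API server access can be sketched in a few lines. This is an illustration only, assuming hypothetical admin ranges (documentation addresses from RFC 5737) in `ADMIN_CIDRS`; in practice the rule is enforced by the cloud provider's network ACLs or security groups, not by application code.

```python
import ipaddress

# Hypothetical administrator ranges; replace with the addresses that really
# need to reach the API server.
ADMIN_CIDRS = ["203.0.113.0/24", "198.51.100.7/32"]


def is_allowed(source_ip, allowed_cidrs=ADMIN_CIDRS):
    """Return True if source_ip falls inside one of the allowed CIDR ranges."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


def overly_broad(cidrs):
    """Return the entries that would open the endpoint to every address."""
    return [c for c in cidrs if ipaddress.ip_network(c).prefixlen == 0]


print(is_allowed("203.0.113.10"))                 # inside the first range
print(is_allowed("192.0.2.1"))                    # outside every range
print(overly_broad(["0.0.0.0/0", "10.0.0.0/8"]))  # flags the open range
```

Running `overly_broad` over an exported rule set is a cheap audit step: any `0.0.0.0/0` or `::/0` entry on the API server port contradicts the recommendation above.
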
  18. Infrastructure security for Kubernetes

    • Kubernetes access to the cloud provider API: Each cloud provider needs to grant a different set of permissions to the Kubernetes control plane and nodes.
    • Provide the cluster with cloud provider access that follows the principle of least privilege (PoLP) for the resources it needs to administer.
  19. Infrastructure security for etcd

    • Access to etcd: Access to etcd (the datastore of Kubernetes) should be limited to the control plane only.
    • Depending on your configuration, you should attempt to use etcd over TLS.
    • etcd encryption: Wherever possible it is good practice to encrypt all drives at rest, and since etcd holds the state of the entire cluster (including Secrets), its disk especially should be encrypted at rest.
    • Use a KMS provider for data encryption.
    • Encrypt Secret data at rest.
  20. Cluster components

    Controlling API server access and restricting direct access to etcd, which is Kubernetes's primary datastore, should be top of mind when it comes to cluster security:
    • Each component (kube-scheduler, kubelet, custom controllers, etc.) should be limited to the permissions it needs to access the API server.
    • API authentication.
    • API authorization.
    • Controlling the capabilities of a workload or user at runtime.
    See Securing a Cluster.
  21. Cluster components

    • Enable Kubernetes audit logging.
    • Leverage OPA to enforce admission control decisions in Kubernetes clusters without modifying or recompiling any Kubernetes components.
    • Use tooling to increase awareness and visibility of security issues in Kubernetes environments.
    • Examples: kube-hunter, kube-bench, kubeaudit, kubesec, Dagda, Falco...
  22. Audit Logging

    Kubernetes audit logging is a way to get a transcript of every action taken on a cluster. This is important for performing forensic analysis after an attack, or for finding out whether bad actors are performing actions in your cluster that they should not be. https://kubernetes.io/docs/tasks/debug-application-cluster/audit/
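
As an illustration of the forensic use described above, here is a minimal sketch that scans audit events for two patterns: successful requests by `system:anonymous`, and writes to sensitive resources. It assumes the log backend writes one JSON event per line in the `audit.k8s.io/v1` format; the `SENSITIVE_RESOURCES` set and the sample events are assumptions chosen for the example.

```python
import json

# Resources whose modification is worth a second look (an assumed, partial list).
SENSITIVE_RESOURCES = {"secrets", "clusterrolebindings", "rolebindings"}
WRITE_VERBS = {"create", "update", "patch", "delete"}


def suspicious_events(lines):
    """Return (user, verb, resource, status code) for events that match a rule."""
    findings = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        user = (event.get("user") or {}).get("username", "")
        verb = event.get("verb", "")
        resource = (event.get("objectRef") or {}).get("resource", "")
        code = (event.get("responseStatus") or {}).get("code", 0)

        anonymous_success = user == "system:anonymous" and 200 <= code < 300
        sensitive_write = resource in SENSITIVE_RESOURCES and verb in WRITE_VERBS
        if anonymous_success or sensitive_write:
            findings.append((user, verb, resource, code))
    return findings


# Hypothetical audit log lines.
SAMPLE_LOG = [
    json.dumps({"verb": "create", "user": {"username": "system:anonymous"},
                "objectRef": {"resource": "clusterrolebindings"},
                "responseStatus": {"code": 201}}),
    json.dumps({"verb": "get", "user": {"username": "alice"},
                "objectRef": {"resource": "pods"},
                "responseStatus": {"code": 200}}),
    json.dumps({"verb": "get", "user": {"username": "system:anonymous"},
                "objectRef": {"resource": "pods"},
                "responseStatus": {"code": 403}}),
]

print(suspicious_events(SAMPLE_LOG))
```

The rejected anonymous request (status 403) is deliberately not reported, since it shows authorization working as intended.
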
  23. Cluster services (applications)

    To secure these services (applications), Kubernetes recommends employing certain protective measures such as resource management and running services with the least privilege.
  24. RBAC

    • Newer versions of Kubernetes use a form of API security called role-based access control. By leveraging ClusterRoles/Roles and ClusterRoleBindings/RoleBindings, cluster operators can control who is allowed to manipulate resources in Kubernetes.
    • Just as you would be careful about the access you grant in AWS IAM, be similarly cautious in Kubernetes.
    • Example: aws-iam-authenticator.
    • Scan the Kubernetes cluster for risky permissions in Kubernetes's RBAC authorization model.
    • Examples: KubiScan, kubernetes-rbac-audit.
  25. Pod Security Policies

    • PodSecurityPolicies allow you to dictate how a Pod is allowed to run on a Node. This is helpful if you want to enforce that Pods cannot run as the root user in Linux or that they cannot mount a particular hostPath.
    • User and group to run as.
    • Available Linux capabilities.
    • Ability to escalate privileges.
    • By utilizing these, cluster operators can have confidence that Kubernetes will only schedule and start a Pod that complies with these policies.
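
The kinds of rules listed above can be illustrated with a small client-side check of a Pod manifest. This is a sketch of the rules, not of the PodSecurityPolicy admission controller itself; `pod_violations` and both sample Pods are hypothetical, and the check treats an unset `allowPrivilegeEscalation` as allowed.

```python
def pod_violations(pod):
    """Return human-readable policy violations found in a Pod manifest (dict)."""
    spec = pod.get("spec", {})
    pod_ctx = spec.get("securityContext") or {}
    problems = []

    # Rule: Pods must not mount paths from the host filesystem.
    for volume in spec.get("volumes") or []:
        if "hostPath" in volume:
            problems.append("volume {!r} uses hostPath".format(volume.get("name")))

    for container in spec.get("containers") or []:
        ctx = container.get("securityContext") or {}
        name = container.get("name")
        # Container-level settings override Pod-level ones.
        non_root = ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot", False))
        if not non_root:
            problems.append("container {!r} may run as root".format(name))
        if ctx.get("privileged", False):
            problems.append("container {!r} is privileged".format(name))
        if ctx.get("allowPrivilegeEscalation", True):
            problems.append("container {!r} allows privilege escalation".format(name))
    return problems


# Hypothetical manifests: one with no hardening, one hardened.
RISKY_POD = {
    "spec": {
        "containers": [{"name": "web", "image": "nginx"}],
        "volumes": [{"name": "host-root", "hostPath": {"path": "/"}}],
    }
}
HARDENED_POD = {
    "spec": {
        "containers": [{
            "name": "web",
            "image": "nginx",
            "securityContext": {
                "runAsNonRoot": True,
                "allowPrivilegeEscalation": False,
            },
        }],
    }
}

for problem in pod_violations(RISKY_POD):
    print(problem)
print(pod_violations(HARDENED_POD))
```

A check like this fits in CI, where manifests can be rejected before they reach the cluster; enforcement at admission time still belongs to the cluster.
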
  26. Quotas

    • Limit resource usage on a cluster (ResourceQuota, LimitRange). Kubernetes quotas help prevent a denial-of-service attack from disrupting service to legitimate users.
    • Without them, Pods that declare no resource requests or limits are assigned the "BestEffort" QoS class. If one of those Pods comes under attack, its resource usage can grow and start to disrupt other Pods on the cluster.
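
To make the QoS point concrete, the sketch below derives a Pod's QoS class from its containers' CPU and memory settings. It is a simplified reading of the Kubernetes rules under stated assumptions: init containers are ignored, and quantities are compared as literal strings, so `"1"` and `"1000m"` count as different. The function name and sample Pods are hypothetical.

```python
RESOURCES = ("cpu", "memory")


def qos_class(pod):
    """Return 'BestEffort', 'Burstable', or 'Guaranteed' for a Pod manifest (dict)."""
    containers = pod.get("spec", {}).get("containers") or []
    any_set = False
    guaranteed = True
    for container in containers:
        resources = container.get("resources") or {}
        requests = resources.get("requests") or {}
        limits = resources.get("limits") or {}
        for resource in RESOURCES:
            request, limit = requests.get(resource), limits.get(resource)
            if request is not None or limit is not None:
                any_set = True
            # Guaranteed needs a limit for every resource; a missing request
            # defaults to the limit, but an explicit one must match it.
            if limit is None or (request is not None and request != limit):
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


def make_pod(resources):
    """Build a one-container Pod manifest with the given resources block."""
    return {"spec": {"containers": [{"name": "app", "resources": resources}]}}


print(qos_class(make_pod({})))
print(qos_class(make_pod({"requests": {"cpu": "100m"}})))
print(qos_class(make_pod({"limits": {"cpu": "500m", "memory": "256Mi"}})))
```

A LimitRange with default requests and limits moves otherwise unconfigured Pods out of the BestEffort class, which is why the slide pairs quotas with limit ranges.
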
  27. Secret management

    • If you are not already encrypting your etcd volumes at rest in your cloud provider, consider using an EncryptionProvider to ensure that secrets are secure at rest and only decrypted when a Pod needs them.
    • Integrate the Secrets Store CSI driver so that Kubernetes can mount multiple secrets, keys, and certs stored in enterprise-grade external secrets stores into Pods as a volume.
    • Examples: HashiCorp Vault, Azure Key Vault.
  28. Cluster networking

    This is related to the proper allocation of ports to facilitate communication between containers, pods, and services.
  29. Network Policies

    • Filtering load-balanced traffic.
    • Limiting Pod-to-Pod communication.
    • Depending on the CNI that your cluster uses, you may have the ability to apply NetworkPolicies to your cluster.
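
A common way to limit Pod-to-Pod communication is a default-deny policy plus narrow allow rules. The sketch below builds both as `networking.k8s.io/v1` NetworkPolicy manifests; the namespace, policy names, labels, and port are hypothetical, and the policies only take effect if the cluster's CNI plugin enforces NetworkPolicy.

```python
import json


def default_deny_ingress(namespace):
    """Build a policy that selects every Pod in the namespace and allows no ingress."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-ingress", "namespace": namespace},
        "spec": {
            "podSelector": {},            # empty selector matches all Pods
            "policyTypes": ["Ingress"],   # no ingress rules follow, so all is denied
        },
    }


def allow_ingress(namespace, name, target_labels, source_labels, port):
    """Build a policy that lets Pods with source_labels reach target Pods on one TCP port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": target_labels},
            "policyTypes": ["Ingress"],
            "ingress": [{
                "from": [{"podSelector": {"matchLabels": source_labels}}],
                "ports": [{"protocol": "TCP", "port": port}],
            }],
        },
    }


policies = [
    default_deny_ingress("demo"),
    allow_ingress("demo", "web-to-api", {"app": "api"}, {"app": "web"}, 8080),
]
print(json.dumps(policies, indent=2))
```

Each manifest can be written to its own JSON file and applied with `kubectl apply -f`, since kubectl accepts JSON as well as YAML.
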
  30. TLS Ingress

    • You can leverage the TLS support of Kubernetes Ingress objects to ensure that traffic coming into the cluster is encrypted.
    • If you want to ensure that all communication between all Pods is encrypted, consider using a service mesh such as Istio or Linkerd.
  31. Other Resources

    • wg-security-audit Kubernetes Final Report: https://bit.ly/3ieHMIh
    • Aqua Blog: https://blog.aquasec.com/page/2
    • Stackrox Blog: https://www.stackrox.com/post/
    • Kubernetes Security Book: https://kubernetes-security.info/
    • Kubernetes Security Docs: https://kubernetes.io/docs/concepts/security/
  32. Container Security

    Container Runtime Engines (CREs) are needed for running the containers in the cluster. Although Docker is one of the most popular CREs, Kubernetes also supports others such as containerd or CRI-O. There are three main things that organizations need to be concerned about with this layer:
    • How secure are your images?
    • Can they be trusted?
    • Are they running with proper privileges?
  33. How secure are your images?

    • This comes down to making sure your containers are up to date and free of any major vulnerability that could be exploited by a threat actor.
    • Use an image scanner to identify known vulnerabilities in container images and their OS dependencies.
    • Examples: trivy, Clair, your cloud service's container registry scanner.
    • Reduce the size of your container images.
    • Build images from scratch.
    • Use distroless images.
    • Configure a repository to be immutable to prevent image tags from being overwritten.
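
Tag immutability is configured on the registry side. A complementary client-side habit, not mentioned in the talk and introduced here as an assumption, is to flag image references that rely on a mutable tag and to prefer references pinned by digest. The sketch below classifies image reference strings; `image_findings` and the sample references are hypothetical.

```python
def image_findings(image):
    """Return a list of concerns about how an image reference is pinned."""
    name, _, digest = image.partition("@")
    if digest:
        return []  # pinned by content digest, cannot be silently replaced

    findings = []
    # Only the last path component can carry a tag; a colon earlier in the
    # reference belongs to a registry port such as localhost:5000.
    last_component = name.rsplit("/", 1)[-1]
    _, has_tag, tag = last_component.partition(":")
    if not has_tag or tag == "latest":
        findings.append("uses the mutable 'latest' tag")
    findings.append("not pinned by digest")
    return findings


# Hypothetical references; the digest value is a placeholder.
for ref in [
    "nginx",
    "localhost:5000/team/app",
    "nginx:1.19",
    "nginx@sha256:" + "0" * 64,
]:
    print(ref, "->", image_findings(ref))
```

Combining such a check with an immutable repository covers both sides: the registry refuses to overwrite a tag, and manifests do not depend on tags that are expected to move.
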
  34. Can they be trusted?

    • Use image signing tools to sign your images and maintain a system of trust for the content of your containers.
    • Examples: TUF, Notary.
    • Use a Kubernetes admission controller for the enforcement of image security policies.
    • Examples: IBM Portieris, Aqua Image Assurance.
  35. Are they running with proper privileges?

    • Assess the privileges used by containers. The principle of least privilege (PoLP) applies here.
    • Only run containers as users that have the minimal OS privileges necessary to carry out their tasks.
    • Use rootless mode to run the container daemon and containers as a non-root user.
    • Secure container isolation.
  36. Current State of Container Isolation

    • Namespaces: Isolate kernel data structures, such as processes, mount tables, network interfaces, and others. Not all kernel data structures have namespace isolation, such as the clock, audit logs, and keyrings.
    • cgroups: Limits, controls, and accounting of compute resources and devices. Examples include limiting and accounting CPU, memory, and network usage, hiding devices, and limiting the number of process IDs.
    • Users: Core Linux permission model. Mostly used for filesystem permissions (DAC) and process signaling.
  37. Current State of Container Isolation

    • seccomp-bpf: Whitelist (filter) Linux syscalls and arguments. Useful for restricting non-namespaced syscalls, poorly supported syscalls, and syscalls that don't have associated capabilities. Docker provides a default seccomp profile, which is compatible with most unprivileged container workloads.
    • AppArmor / SELinux: Linux Security Modules (AppArmor and SELinux are mutually exclusive). Mostly useful for finer-grained control of filesystem access, but recent changes are adding more networking controls.
    • Capabilities: Subdivide root user privileges into various capabilities. The Docker defaults drop un-namespaced capabilities (e.g. the ability to install kernel modules, manage the network devices, and reboot the machine).
  38. Access over TLS only

    If your code needs to communicate over TCP, perform a TLS handshake with the client ahead of time. With the exception of a few cases, encrypt everything in transit. Going one step further, it's a good idea to encrypt network traffic between services. This can be done with mutual TLS (mTLS), which performs a two-sided verification of communication between two certificate-holding services.
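
On the server side, the two-sided verification amounts to presenting a certificate and also requiring one from the client. The following is a minimal sketch using Python's standard `ssl` module; the function name and the file path parameters are assumptions, and the paths are optional only so the example runs without real certificates.

```python
import ssl


def mtls_server_context(cert_file=None, key_file=None, client_ca_file=None):
    """Build a server-side TLS context that rejects clients without a valid certificate."""
    # Purpose.CLIENT_AUTH yields a context meant for the server end of a connection.
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED is what makes the TLS mutual: the handshake fails unless
    # the client presents a certificate signed by a trusted CA.
    context.verify_mode = ssl.CERT_REQUIRED
    if cert_file:
        # The server's own certificate and private key.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if client_ca_file:
        # The CA bundle used to verify client certificates.
        context.load_verify_locations(cafile=client_ca_file)
    return context


context = mtls_server_context()
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.minimum_version == ssl.TLSVersion.TLSv1_2)
```

In real use all three paths are needed, and the returned context is passed to `context.wrap_socket(sock, server_side=True)`. Inside Kubernetes, a service mesh can take over this work so that application code does not manage certificates itself.
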
  39. Limiting port ranges of communication

    This recommendation may be a bit self-explanatory, but wherever possible you should only expose the ports on your service that are absolutely essential for communication or metric gathering.
  40. 3rd Party Dependency Security

    It is good practice to regularly scan your application's third-party libraries for known security vulnerabilities. Most programming languages have a tool for performing this check automatically.
  41. Static Code Analysis

    • Most languages provide a way for a snippet of code to be analyzed for any potentially unsafe coding practices. Whenever possible you should perform checks using automated tooling that can scan codebases for common security errors.
    • Some of the tools can be found at: https://owasp.org/www-community/Source_Code_Analysis_Tools.
  42. Dynamic probing attacks

    There are a few automated tools that you can run against your service to try some of the well-known service attacks. These include SQL injection, CSRF, and XSS. One of the most popular dynamic analysis tools is the OWASP Zed Attack Proxy.