various ways that an attacker could attempt to compromise your Kubernetes cluster and the applications running on it.
• Examples of security threats
  External attacks targeting security holes in the control plane
  Attacks targeting security holes in containers and nodes
  Attacks aimed at leaking administrator credentials
  Attacks aimed at mistakes in permission settings
  etc.
Reference) https://kubernetes-security.info/
• For example, authentication and authorization in AWS EKS.
[Figure: a request from a human passing through Authn, Authz, and Admission Control on the master node of a cloud k8s cluster.
  Authn labels: IAM, OpenID Connect, IAM Role for Service Account, aws-auth.
  Authz labels: API version (Extension, Core, Apps), resources (Deployment, Node, Pod), action (Create, Get, List), constraint of namespace.
  Admission Control guard rails, for pod and for user: ResourceQuota, LimitRange, AlwaysPullImages, NamespaceLifecycle, Priority, Pod Security Policy.]
• Control network communication between each component.
[Figure: network communication types between components.
  Components: master node, worker node with containers in a private subnet, ELB, client, SaaS, container registry such as ECR, service outside the Kubernetes cluster such as RDS, maintenance server such as EC2.
  Communication types: application service; operation and release work.]
from application code and container images.
• Use unique values for production, staging, and development.
• Set an expiration date and rotate the encryption key.
• Pass secret values through a volume, not an environment variable. Environment variables carry the following risks:
  Environment variables may be written to files or logs when a process crashes.
  Environment variables can be viewed with the "kubectl describe pod" command.
  Environment variables can be viewed with the "docker inspect" command when the runtime is Docker.
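A minimal sketch of how an application might read a secret that is mounted as a volume instead of passed as an environment variable. The mount directory /etc/secrets and the helper name read_secret are assumptions for illustration; the real path is whatever mountPath the manifest gives the Secret volume.

```python
from pathlib import Path

# Hypothetical mount point (assumption); in a real pod this is the
# mountPath configured for the Secret volume.
DEFAULT_SECRET_DIR = Path("/etc/secrets")


def read_secret(name: str, secret_dir: Path = DEFAULT_SECRET_DIR) -> str:
    """Return the secret stored in the file <secret_dir>/<name>."""
    # Reject names that would escape the secret directory.
    if not name or "/" in name or name in {".", ".."}:
        raise ValueError(f"invalid secret name: {name!r}")
    # Each key of a mounted Secret appears as one file.
    # Strip trailing newlines that editors often add.
    return (secret_dir / name).read_text(encoding="utf-8").rstrip("\n")
```

Because the value is read from a file at the moment it is needed, it does not appear in the process environment.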
workload running in Kubernetes is stateless, keep everything under version control with Git for updates and rollbacks.
• Use CI/CD tools that follow GitOps, such as Flux, Argo, or Jenkins X. The Git repository is the single source of truth.
[Figure: GitOps flow.
  Actors: developer, operator.
  Continuous Integration: code change, Git (code), build, container registry.
  Continuous Deployment: config update, Git (config), GitOps tools (Flux, Argo, Jenkins X), deploy (update, rollback) to the cluster.]
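The idea that Git is the single source of truth can be sketched as a comparison between the desired state (from Git) and the live state (from the cluster). This is an illustration only, not how Flux, Argo, or Jenkins X are implemented; plan_sync and the dictionary shapes are hypothetical.

```python
from typing import Dict, List, Tuple

Manifest = Dict[str, object]


def plan_sync(desired: Dict[str, Manifest],
              live: Dict[str, Manifest]) -> List[Tuple[str, str]]:
    """Return (action, name) pairs that would make `live` match `desired`."""
    actions: List[Tuple[str, str]] = []
    for name in sorted(desired):
        if name not in live:
            actions.append(("create", name))
        elif live[name] != desired[name]:
            actions.append(("update", name))
    # Anything running that Git no longer describes is removed.
    for name in sorted(live):
        if name not in desired:
            actions.append(("delete", name))
    return actions
```

Under this model a rollback is the same operation as an update: revert the commit in Git, and the next comparison produces the actions that restore the previous state.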
container life cycle
• Shift-left security measures are important.
[Figure: container life cycle on a public cloud (e.g. AWS).
  Stages: Build, Ship, Run.
  Components: development server, Git repository (CodeCommit, etc.), public container registry such as GitHub or Quay, build tool (Docker, Bazel, etc.), CI tool (CodeBuild, S3, etc.), container private registry (ECR, etc.), CD tool (CodeDeploy, etc.), Kubernetes cluster master/worker node, server and OS, container runtime, k8s manifest, container image, workload.
  Checks: check base container images, unit and configuration test, check artifact container images, configuration test, CIS Kubernetes Benchmark test, penetration test.]
Version for Kubernetes cluster and node.
• Update the Kubernetes version regularly. The same applies to Helm and operators if you use them.
• Update the master and worker node OS versions regularly.
Managing permissions for node.
• Use an optimized host OS, e.g. CoreOS, Container-Optimized OS, Bottlerocket.
• Enable rootless mode. Supported in Podman, experimental in Docker v19.03.
• Restrict remote login to hosts and containers. The same applies to sudo.
• Prevent local file system mounts on the node.
phase
Build phase
• Install only the minimum required packages, libraries, and tools (shell, CLI, etc.).
• Use trusted container base images provided by cloud vendors.
• Use container images that are the latest version and have been checked for vulnerabilities.
Ship phase
• Check container images regularly, not just at build time.
• Tag container images with something like a semantic version (e.g. v1.2.3), not "latest".
• Delete old container images regularly.
• Use container image signature checking, e.g. DOCKER_CONTENT_TRUST=1.
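The tagging rule in the Ship phase can be checked mechanically, for example in a CI step. The sketch below is a simplified check, not a full image-reference parser; the function name is_pinned_image is hypothetical.

```python
def is_pinned_image(image: str) -> bool:
    """True if the reference has a digest or an explicit tag other than 'latest'."""
    if "@" in image:
        # e.g. repo/app@sha256:... ; a digest pins the exact content.
        return True
    # The tag, if any, follows ':' in the final path component, so a
    # registry port such as localhost:5000/app is not mistaken for a tag.
    last_component = image.rsplit("/", 1)[-1]
    if ":" not in last_component:
        return False  # no tag means the runtime falls back to 'latest'
    tag = last_component.rsplit(":", 1)[1]
    return tag != "" and tag != "latest"
```

For instance, is_pinned_image("nginx:v1.2.3") is accepted, while "nginx" and "nginx:latest" are rejected.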
Check using a security check tool suitable for each layer.
Each layer                                          Check tools   Check summary
Dockerfile                                          hadolint      Checks the Dockerfile for best practices
Container image                                     trivy         Scans container images for known vulnerabilities
Kubernetes manifest                                 kubesec       Checks workload configuration for security issues
Kubernetes manifest                                 conftest      Specifies custom checks for Kubernetes resources
Infrastructure as code (Terraform, CloudFormation)  etc.          Detects security risks in cloud infrastructure
Run phase
• Prohibit privilege escalation and running processes as a privileged user.
• Prohibit writing to the root file system.
• Limit the mount paths of host volumes.
• Restrict Linux capabilities, e.g. drop CAP_NET_RAW.
• Limit the number of processes in the container, e.g. pod-max-pids, podPidsLimit.
• Disable unused listening ports on the container.
• Define requests and limits for CPU and memory resources.
• Take on the challenge of DevSecOps.
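Several of the Run phase items map to fields of a container entry in a pod spec (securityContext and resources). The sketch below checks one container, given as a plain dictionary, against those items. The function name audit_container and the finding messages are hypothetical, and the check covers only the fields shown.

```python
from typing import Dict, List


def audit_container(container: Dict) -> List[str]:
    """Return a list of findings for one container entry of a pod spec."""
    findings: List[str] = []
    sc = container.get("securityContext") or {}
    if sc.get("privileged", False):
        findings.append("privileged mode is enabled")
    # An unset allowPrivilegeEscalation is treated as allowed.
    if sc.get("allowPrivilegeEscalation", True) is not False:
        findings.append("allowPrivilegeEscalation is not set to false")
    if sc.get("readOnlyRootFilesystem", False) is not True:
        findings.append("root file system is writable")
    dropped = set((sc.get("capabilities") or {}).get("drop") or [])
    if not ({"ALL", "NET_RAW"} & dropped):
        findings.append("NET_RAW capability is not dropped")
    resources = container.get("resources") or {}
    for kind in ("requests", "limits"):
        values = resources.get(kind) or {}
        for res in ("cpu", "memory"):
            if res not in values:
                findings.append(f"{kind}.{res} is not set")
    return findings
```

An empty list means the container passes these particular checks; it says nothing about items the sketch does not cover, such as PID limits or listening ports.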
Run phase
• Separate the UIDs and GIDs of the container OS from those of the host OS.
• Limit input/output access from the container to storage.
• Limit inbound and outbound network access on the container.
• Limit the number of file descriptors, e.g. ulimit -n 10.
• Use AppArmor or seccomp.
• Detect and block suspicious activity. Some paid vendor tools provide this.
Field name / Type / Summary / Example
process
• privileged / bool / Allow privileged mode / false
• allowPrivilegeEscalation / bool / Allow promotion to root user / false
• defaultAllowPrivilegeEscalation / bool / Default value of the above / false
• runAsUser / RunAsUserStrategyOptions / Specify the execution UID (SecurityContext > runAsNonRoot: true) / 1000
• runAsGroup / RunAsGroupStrategyOptions / Specify the execution GID / 1000
filesystem
• readOnlyRootFilesystem / bool / Make the root file system read-only / true
• allowedHostPaths / []AllowedHostPath / Specify an allowlist for hostPath /
    - pathPrefix: "/foo"
      readOnly: true
• fsGroup / FSGroupStrategyOptions / Set the file system groups to allow / 1000
capabilities
• requiredDropCapabilities / []string / List of capabilities to remove from the container /
    requiredDropCapabilities:
    - NET_RAW
    - **
• allowedCapabilities / []string / List of capabilities to allow. If not specified, the default set is implicitly granted. /
    allowedCapabilities:
    - KILL
    - CHOWN
    - **
• Pod Security Policy is still in beta and has not reached GA. There is also a plan to deprecate Pod Security Policy and move to Open Policy Agent, so keep an eye on future trends.
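The allowedHostPaths entry matches by path prefix: pathPrefix "/foo" covers "/foo", "/foo/", and "/foo/bar", but not "/food" or "/etc/foo". A minimal sketch of that matching rule, comparing whole path elements, is shown below. It illustrates the rule only and is not the Kubernetes implementation; host_path_allowed is a hypothetical name, and rejecting ".." outright is a simplification chosen for this sketch.

```python
from typing import Iterable, List


def _parts(path: str) -> List[str]:
    # Split into path elements, ignoring empty ones from leading,
    # trailing, or doubled slashes.
    return [p for p in path.split("/") if p]


def host_path_allowed(path: str, allowed_prefixes: Iterable[str]) -> bool:
    """True if `path` falls under one of the allowed path prefixes."""
    prefixes = list(allowed_prefixes)
    if not prefixes:
        # An empty allowlist places no restriction on host paths.
        return True
    requested = _parts(path)
    if ".." in requested:
        return False  # simplification: refuse traversal instead of resolving it
    for prefix in prefixes:
        wanted = _parts(prefix)
        # Compare whole elements so "/foo" does not match "/food".
        if requested[:len(wanted)] == wanted:
            return True
    return False
```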
a security check tool suitable for each layer.
Each layer                                          Check tools   Check summary
Infrastructure as code (Terraform, CloudFormation)  etc.          Detects security risks in cloud infrastructure
Container image                                     trivy         Scans container images for known vulnerabilities
Kubernetes workload                                 gatekeeper    Specifies custom checks for Kubernetes resources
Kubernetes workload                                 etc.          Checks best practices in workload configuration; recommends resource limits and requests based on actual resource usage
Kubernetes cluster (master/worker node)             kube-bench    Checks the cluster against the CIS Benchmark
Kubernetes cluster (master/worker node)             kube-hunter   Checks for cluster and node-level security vulnerabilities
• Usage of CPU, memory, disk, etc.
• Number of API requests, network traffic, cache hit rate, page load time, etc.
2. Logs
• Syslog of servers and containers, logs of applications and middleware
• Network packets, database queries, API requests, release operations, audit logs, etc.
3. Traces
• Tracking API call times between microservices
Reference) https://www.elastic.co/blog/observability-with-the-elastic-stack
Reference) https://grafana.com/blog/2019/10/21/whats-next-for-observability/
The most commonly adopted tools are open source (Prometheus, Grafana, Elastic, etc.).
• Many companies use multiple tools. Half of the companies use 5 or more tools.
Reference) https://radar.cncf.io/
various logs for each of the following layers, and make them ready for investigation and auditing.
Cloud layer
• Cloud operations, host nodes, network communication, etc.
• Enable the auditing and network logging options provided by cloud vendors.
Kubernetes cluster layer
• api-server requests, master/worker nodes, pods, etc.
• Enable an audit policy on Kubernetes.
  Reference) https://kubernetes.io/docs/tasks/debug-application-cluster/audit
Container layer
• Application logs such as nginx, tomcat, etc.
• Aggregate the stdout and stderr of the container with a logging tool such as fluentd or Elastic.
• Logging Architecture
  Reference) https://kubernetes.io/docs/concepts/cluster-administration/logging/
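For the container layer, aggregation of stdout and stderr works when the application writes its logs there rather than to local files. A minimal sketch with the Python standard library follows. Emitting one JSON object per line is an assumption made here to ease parsing by a collector; the slide only asks for stdout and stderr. JsonFormatter and make_logger are hypothetical names.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Format each record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def make_logger(name: str, stream=None) -> logging.Logger:
    """Return a logger that writes JSON lines to `stream` (stdout by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output via the root logger
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger
```

Calling make_logger("app").info("started") writes one line to stdout, which the container runtime captures and a tool such as fluentd can then collect.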
behavior of the following containers.
  Container runs in privileged mode.
  Mounting sensitive directory paths.
  Reading sensitive files.
  Writing device files.
  Reading system binary files.
  Out-of-band network connections.
  etc.
Cloud Native Security Hub publishes Falco rules for multiple purposes.
Reference) https://securityhub.dev/
Note) Falco cannot detect writes to host paths bind-mounted by processes in the container, or indirect writes via symbolic links. To detect these, monitor file access with auditd, a package provided by each Linux distribution.