Auditing in Kubernetes 101

Have you ever wondered who made a particular change in your cluster, when they made it, or which resources were modified? All of this information about “what sequence of events led to this scenario” can be obtained using Kubernetes' powerful audit logging feature. In this talk, we will first go over what audit logs are and how to leverage them to stay informed about what goes on in your cluster. Keeping both performance impact and accountability in mind, we will then walk through examples of policy configurations that enforce security best practices, detect misuse, and make your cluster more compliant. We’ll also demo setting up auditing on a cluster and inspecting the logs. Finally, we will see what future improvements are planned for this feature and how you can provide feedback and get involved.

Nikhita Raghunath

February 17, 2020

Transcript

  1. Nikhita Raghunath Loodse Auditing in Kubernetes 101

  2. WHO AM I

    • Software Engineer at Loodse
    • Member of the Kubernetes Steering Committee
    • Technical Lead for SIG Contributor Experience
    • CNCF Ambassador
    GitHub: nikhita, Twitter: TheNikhita
  5. SECRET CONTAINING PASSWORD IN YOUR CLUSTER • SECRET GOT UPDATED TO MYSTERIOUS VALUE • LOGS
  11. Logs from the Pod • Events • Apiserver Logs

  12. AUDIT LOGS!

  13. AUDIT EVENTS

    {
      "kind": "Event",
      "apiVersion": "audit.k8s.io/v1",
      "level": "Metadata",
      "auditID": "7684b057-7e2d-4188-a6ae-8fc51afd0c9d",
      "stage": "ResponseComplete",
      "requestURI": "/api/v1/namespaces/default/secrets",
      "verb": "create",
      "user": {
        "username": "minikube-user",
        "groups": [ "system:masters", "system:authenticated" ]
      },
      "sourceIPs": [ "X.Y.Z.1" ],
      "objectRef": {
        "resource": "secrets",
        "namespace": "default",
        "name": "mysecret",
        "apiVersion": "v1"
      },
      "responseStatus": { "metadata": {}, "code": 201 },
      "requestReceivedTimestamp": "2020-02-12T18:06:04.577792Z",
      "stageTimestamp": "2020-02-12T18:06:04.584173Z"
    }
  14. WHAT HAPPENED

    "verb": "create"
  15. ON WHAT DID IT HAPPEN

    "objectRef": { "resource": "secrets", "namespace": "default", "name": "mysecret", "apiVersion": "v1" }
  16. WHEN DID IT HAPPEN

    "requestReceivedTimestamp": "2020-02-12T18:06:04.577792Z"
    "stageTimestamp": "2020-02-12T18:06:04.584173Z"
  17. WHO DID IT

    "user": { "username": "minikube-user", "groups": [ "system:masters", "system:authenticated" ] }
  18. WHERE WAS IT INITIATED

    "sourceIPs": [ "1.2.3.4" ]
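
The five questions on the preceding slides (what, on what, when, who, where) map directly onto fields of a single audit event. As a minimal sketch, assuming the event is available as a JSON string, the snippet below pulls those fields out. The `summarize` helper and the `RAW_EVENT` constant are hypothetical names introduced for illustration; the event is a trimmed copy of the one on the slides.

```python
import json

# Trimmed copy of the audit event shown on the slides (valid JSON, no trailing comma).
RAW_EVENT = """
{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "stage": "ResponseComplete",
  "verb": "create",
  "user": {"username": "minikube-user",
           "groups": ["system:masters", "system:authenticated"]},
  "sourceIPs": ["1.2.3.4"],
  "objectRef": {"resource": "secrets", "namespace": "default",
                "name": "mysecret", "apiVersion": "v1"},
  "responseStatus": {"metadata": {}, "code": 201},
  "requestReceivedTimestamp": "2020-02-12T18:06:04.577792Z",
  "stageTimestamp": "2020-02-12T18:06:04.584173Z"
}
"""


def summarize(event):
    """Answer what / on what / when / who / where for one audit event (hypothetical helper)."""
    ref = event.get("objectRef", {})  # may be missing for non-resource requests
    parts = (ref.get("namespace"), ref.get("resource"), ref.get("name"))
    return {
        "what": event["verb"],
        "on_what": "/".join(p for p in parts if p),
        "when": event["requestReceivedTimestamp"],
        "who": event["user"]["username"],
        "where": event.get("sourceIPs", []),
    }


summary = summarize(json.loads(RAW_EVENT))
for question, answer in summary.items():
    print(f"{question}: {answer}")
```

The same helper could be applied to every event in a log to get a quick, human-readable trail of changes.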

  19. THAT’S A LOT OF LOGS!

  22. LET’S CONTROL THE VERBOSITY • WHAT TO LOG • WHEN TO LOG • YAML
  23. AUDIT POLICY

    apiVersion: audit.k8s.io/v1
    kind: Policy
    rules:
    - level: Metadata
      omitStages:
      - RequestReceived
      resources:
      - group: ""
        resources:
        - secrets
  30. WHEN TO LOG

    1. RequestReceived - Audit handler receives request
    2. ResponseStarted - For long running requests
    3. ResponseComplete - Response body completed
    4. Panic - Event generated when panic occurs
  31. [Diagram: a request and its response passing through the Kube APIserver, marked with the stages RequestReceived, ResponseStarted, ResponseComplete and Panic; built up step by step over slides 31 to 39]

  40. WHAT TO LOG

    GROUP/VERSION • RESOURCE • VERBS

    apiVersion: audit.k8s.io/v1
    kind: Policy
    rules:
    - level: Metadata
      omitStages:
      - RequestReceived
      resources:
      - group: ""
        resources:
        - secrets
  45. LEVELS

    1. None - don’t log these requests
    2. Metadata - only request metadata
    3. Request - request metadata + request body
    4. RequestResponse - request metadata + request body + response body
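
To make the difference between the four levels concrete, here is a small sketch that starts from a fully detailed event and keeps only what each level would record. It assumes the request and response bodies live in the `requestObject` and `responseObject` fields of the event. The `trim_to_level` helper and the sample event are hypothetical, and the real apiserver decides what to record up front rather than trimming afterwards.

```python
# Audit levels from least to most verbose, as listed on the slide.
LEVELS = ["None", "Metadata", "Request", "RequestResponse"]


def trim_to_level(event, level):
    """Return the parts of `event` kept at `level` (hypothetical helper).

    Returns None for the "None" level, meaning the request is not logged.
    """
    if level not in LEVELS:
        raise ValueError(f"unknown audit level: {level}")
    if level == "None":
        return None
    kept = dict(event)
    kept["level"] = level
    if level == "Metadata":
        kept.pop("requestObject", None)   # no request body at Metadata level
    if level in ("Metadata", "Request"):
        kept.pop("responseObject", None)  # response body only at RequestResponse
    return kept


# Made-up event carrying both bodies, for illustration only.
full_event = {
    "verb": "create",
    "user": {"username": "minikube-user"},
    "objectRef": {"resource": "configmaps", "namespace": "default", "name": "demo"},
    "requestObject": {"data": {"key": "value"}},
    "responseObject": {"data": {"key": "value"}, "metadata": {"uid": "example-uid"}},
}

for level in LEVELS:
    trimmed = trim_to_level(full_event, level)
    print(level, "->", sorted(trimmed) if trimmed else "not logged")
```

Seeing the bodies disappear at the lower levels is a reminder of why sensitive resources are usually kept at Metadata: the body of a Secret is exactly what should stay out of the log.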
  46. RECOMMENDATIONS FOR WRITING POLICIES

  47. Only log at Metadata level for sensitive resources

    - level: Metadata
      resources:
      - group: ""
        resources:
        - secrets
        - configmaps
      - group: authentication.k8s.io
        resources:
        - tokenreviews
  48. Don’t log read-only URLs

    - level: None
      nonResourceURLs:
      - '/healthz*'
      - /version
      - '/swagger*'
  49. Log at RequestResponse level for critical resources

    Log at least at Metadata level for all resources
  52. Evaluated in top-down order. Status calls can be large and high-volume.

    rules:
    - level: RequestResponse
      resources:
      - group: ""
        resources: ["pods"]
    - level: Metadata
      resources:
      - group: ""
        resources: ["pods/log", "pods/status"]
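
Because rules are evaluated top-down and the first match decides the level, rule order matters. The sketch below is a deliberately simplified model of that behaviour: it matches only on exact group and resource names, treats a rule without a `resources` list as matching everything, and returns "None" when nothing matches (an assumption of this sketch). The `level_for` helper is hypothetical, and users, verbs, namespaces, wildcards and non-resource URLs are left out.

```python
# The rules from the slide, written as plain Python data.
RULES = [
    {"level": "RequestResponse",
     "resources": [{"group": "", "resources": ["pods"]}]},
    {"level": "Metadata",
     "resources": [{"group": "", "resources": ["pods/log", "pods/status"]}]},
]


def level_for(rules, group, resource):
    """Return the level of the first matching rule (simplified, hypothetical helper)."""
    for rule in rules:  # top-down: the first match wins
        entries = rule.get("resources")
        if not entries:  # a rule without a resources list matches everything
            return rule["level"]
        for entry in entries:
            if entry["group"] == group and resource in entry["resources"]:
                return rule["level"]
    return "None"  # assumption in this sketch: unmatched requests are not logged


print(level_for(RULES, "", "pods"))
print(level_for(RULES, "", "pods/status"))
print(level_for(RULES, "", "secrets"))

# A catch-all rule placed first shadows every rule after it ...
catch_all = {"level": "Metadata"}
print(level_for([catch_all] + RULES, "", "pods"))
# ... while the same rule placed last only applies to what nothing else matched.
print(level_for(RULES + [catch_all], "", "pods"))
print(level_for(RULES + [catch_all], "", "secrets"))
```

This is why a broad "log everything at Metadata" rule usually belongs at the bottom of a policy, after the more specific rules.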
  53. More examples at https://github.com/kubernetes/kubernetes/blob/master/cluster/gce/gci/configure-helper.sh

  54. WHERE DO THESE LOGS GO

  58. BACKEND

    LOG
    • Writes events to disk
    • --audit-log-path
    WEBHOOK
    • Sends events to external API
    • --audit-webhook-config-file
    Policy for both backends: --audit-policy-file
  59. HOW ARE THESE LOGS SENT TO THE BACKEND

  61. BATCHING

    • BATCH - Buffers events & processes in batches
    • BLOCKING - Blocks APIserver responses to process individual events
    • BLOCKING-STRICT - Failure at RequestReceived stage leads to failure of whole call
  62. BATCHING

    Mode is set per backend with --audit-webhook-mode and --audit-log-mode
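
The difference between the batch and blocking modes can be illustrated with two toy backends. This is only a conceptual sketch: `BatchingBackend` and `BlockingBackend` are hypothetical classes, not the apiserver's implementation, and they ignore asynchronous delivery, timeouts, throttling and the blocking-strict failure behaviour.

```python
class BatchingBackend:
    """Toy backend that buffers events and hands them over in batches (hypothetical)."""

    def __init__(self, sink, max_batch_size=3):
        self.sink = sink  # callable that receives a list of events
        self.max_batch_size = max_batch_size
        self.buffer = []

    def process(self, event):
        # The caller is not held up per event; delivery happens once a batch is full.
        self.buffer.append(event)
        if len(self.buffer) >= self.max_batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.sink(list(self.buffer))
            self.buffer.clear()


class BlockingBackend:
    """Toy backend that delivers every event before returning (hypothetical)."""

    def __init__(self, sink):
        self.sink = sink

    def process(self, event):
        self.sink([event])


batched, blocking = [], []
batch_backend = BatchingBackend(batched.append, max_batch_size=3)
blocking_backend = BlockingBackend(blocking.append)

for i in range(7):
    event = {"auditID": f"event-{i}"}
    batch_backend.process(event)
    blocking_backend.process(event)
batch_backend.flush()  # deliver whatever is still buffered

print("batch sizes (batch mode):", [len(b) for b in batched])
print("deliveries (blocking mode):", len(blocking))
```

Batching trades a small delay in delivery for far fewer writes or HTTP calls, which is why it tends to suit a remote webhook better than per-event delivery.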

  66. UPDATING AUDIT POLICY: RESTART OF APISERVER

  69. UPDATING AUDIT POLICY: UPDATING A K8S RESOURCE

  70. DYNAMIC AUDIT CONFIGURATION

    --audit-dynamic-configuration
    --feature-gates=DynamicAuditing=true
    --runtime-config=auditregistration.k8s.io/v1alpha1=true

  71. DYNAMIC AUDIT CONFIGURATION

    apiVersion: auditregistration.k8s.io/v1alpha1
    kind: AuditSink
    metadata:
      name: mysink
    spec:
      policy:
        level: Metadata
        stages:
        - ResponseComplete
      webhook:
        throttle:
          qps: 10
          burst: 15
        clientConfig:
          url: "https://audit.app"
  77. SECURITY

    • Write access to feature = Read access to all cluster data
    • cluster-admin level privilege
    PERFORMANCE
    • Increase in CPU/Memory Usage
    • Don’t use too many sinks
  78. KEP: https://github.com/kubernetes/enhancements/pull/1259 • #sig-auth channel on the Kubernetes Slack

  80. LOG COLLECTOR PATTERNS

  81. LOG COLLECTOR PATTERNS: Audit Log File + Fluentd

  82. LOG COLLECTOR PATTERNS: Audit Webhook File + Logstash

  83. LOG COLLECTOR PATTERNS: Audit Webhook File + Falco

  84. HOW ARE AUDIT LOGS HELPFUL

  85. UNDERSTANDING K8S INTERNALS: Analysing calls made by system components shows how the different components interact

  86. DETECTING MISCONFIGURATIONS: “Who deleted this resource?”

  87. TROUBLESHOOTING ISSUES: Analysing calls which trigger HTTP errors

  88. PERFORMANCE ISSUES: “Which app is generating lots of calls?”
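
The questions on the preceding slides translate into simple filters over audit events. The following is a minimal sketch over a handful of made-up events. The helper names (`who_deleted`, `failed_calls`, `calls_per_user`) and the sample data are hypothetical, while the field names follow the audit event shown earlier in the talk.

```python
from collections import Counter

# Made-up audit events for illustration; real ones would come from the audit log.
NOISY_APP = "system:serviceaccount:default:noisy-app"
EVENTS = [
    {"verb": "delete", "user": {"username": "alice"},
     "objectRef": {"resource": "secrets", "namespace": "default", "name": "mysecret"},
     "responseStatus": {"code": 200}},
    {"verb": "get", "user": {"username": NOISY_APP},
     "objectRef": {"resource": "pods", "namespace": "default", "name": "web"},
     "responseStatus": {"code": 200}},
    {"verb": "get", "user": {"username": NOISY_APP},
     "objectRef": {"resource": "pods", "namespace": "default", "name": "web"},
     "responseStatus": {"code": 200}},
    {"verb": "create", "user": {"username": "bob"},
     "objectRef": {"resource": "pods", "namespace": "default", "name": "db"},
     "responseStatus": {"code": 403}},
]


def who_deleted(events, resource, name):
    """Usernames behind delete calls on the given resource ("Who deleted this resource?")."""
    return [
        e["user"]["username"]
        for e in events
        if e["verb"] == "delete"
        and e.get("objectRef", {}).get("resource") == resource
        and e.get("objectRef", {}).get("name") == name
    ]


def failed_calls(events):
    """Events whose response carried an HTTP error status (4xx or 5xx)."""
    return [e for e in events if e.get("responseStatus", {}).get("code", 0) >= 400]


def calls_per_user(events):
    """Number of calls per username ("Which app is generating lots of calls?")."""
    return Counter(e["user"]["username"] for e in events)


print(who_deleted(EVENTS, "secrets", "mysecret"))
print([(e["user"]["username"], e["responseStatus"]["code"]) for e in failed_calls(EVENTS)])
print(calls_per_user(EVENTS).most_common(1))
```

With a log backend, the events list could be built by parsing the audit log file, assuming it holds one JSON event per line.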

  89. CONCLUSION

    • Audit logs can give us a lot of information about what goes on in our cluster
    • To control what should be logged, we write audit policies
    • Recommendations for writing audit policies
    • Different audit backends
    • Batching methods
    • Dynamic Audit Configuration
    • Log Collector Patterns
  90. THANK YOU