Open Infra Days Asia - Auditing in Kubernetes 101

Nikhita Raghunath
September 11, 2021

Talk about basics of auditing in Kubernetes, for Open Infra Days Asia 2021

Transcript

  1. Auditing in Kubernetes 101 • Nikhita Raghunath • Staff Engineer, VMware

  2. WHO AM I • Staff Engineer at VMware • Member of the Kubernetes Steering Committee • Technical Lead for SIG Contributor Experience • CNCF Ambassador • GitHub: nikhita • Twitter: TheNikhita

  5. SECRET CONTAINING PASSWORD IN YOUR CLUSTER • SECRET GOT UPDATED TO MYSTERIOUS VALUE • LOGS

  6. Logs from the Pod @TheNikhita

  10. Logs from the Pod • Events • Apiserver Logs

  12. AUDIT LOGS!

  13. AUDIT EVENTS

      {
        "kind": "Event",
        "apiVersion": "audit.k8s.io/v1",
        "level": "Metadata",
        "auditID": "7684b057-7e2d-4188-a6ae-8fc51afd0c9d",
        "stage": "ResponseComplete",
        "requestURI": "/api/v1/namespaces/default/secrets",
        "verb": "create",
        "user": {
          "username": "minikube-user",
          "groups": [ "system:masters", "system:authenticated" ]
        },
        "sourceIPs": [ "X.Y.Z.1" ],
        "objectRef": {
          "resource": "secrets",
          "namespace": "default",
          "name": "mysecret",
          "apiVersion": "v1"
        },
        "responseStatus": { "metadata": {}, "code": 201 },
        "requestReceivedTimestamp": "2020-02-12T18:06:04.577792Z",
        "stageTimestamp": "2020-02-12T18:06:04.584173Z"
      }

  14. WHAT HAPPENED

      "verb": "create",

  15. ON WHAT DID IT HAPPEN

      "objectRef": {
        "resource": "secrets",
        "namespace": "default",
        "name": "mysecret",
        "apiVersion": "v1"
      },

  16. WHEN DID IT HAPPEN

      "requestReceivedTimestamp": "2020-02-12T18:06:04.577792Z",
      "stageTimestamp": "2020-02-12T18:06:04.584173Z",

  17. WHO DID IT

      "user": {
        "username": "minikube-user",
        "groups": [ "system:masters", "system:authenticated" ]
      },

  18. WHERE WAS IT INITIATED

      "sourceIPs": [ "1.2.3.4" ],
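
The five questions on the preceding slides can be read straight out of one audit event. The following is a minimal sketch, assuming the event is available as a JSON string; the `summarize` helper and its output keys are hypothetical names chosen for illustration, not part of any Kubernetes tooling.

```python
import json

# Sample event based on the slide (trailing comma removed so it parses as JSON).
AUDIT_EVENT = """
{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "auditID": "7684b057-7e2d-4188-a6ae-8fc51afd0c9d",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/default/secrets",
  "verb": "create",
  "user": {"username": "minikube-user",
           "groups": ["system:masters", "system:authenticated"]},
  "sourceIPs": ["1.2.3.4"],
  "objectRef": {"resource": "secrets", "namespace": "default",
                "name": "mysecret", "apiVersion": "v1"},
  "responseStatus": {"metadata": {}, "code": 201},
  "requestReceivedTimestamp": "2020-02-12T18:06:04.577792Z",
  "stageTimestamp": "2020-02-12T18:06:04.584173Z"
}
"""


def summarize(event: dict) -> dict:
    """Map one audit event onto the what / on what / when / who / where questions."""
    ref = event.get("objectRef", {})
    return {
        "what": event["verb"],
        "on_what": "{}/{}/{}".format(
            ref.get("namespace", ""), ref.get("resource", ""), ref.get("name", "")
        ),
        "when": event["requestReceivedTimestamp"],
        "who": event["user"]["username"],
        "where": event.get("sourceIPs", []),
    }


summary = summarize(json.loads(AUDIT_EVENT))
for question, answer in summary.items():
    print(f"{question}: {answer}")
```

The same helper could be applied to every line of an audit log file to get a quick human-readable trail.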

  19. THAT’S A LOT OF LOGS!

  20. LET’S CONTROL THE VERBOSITY

  22. LET’S CONTROL THE VERBOSITY • WHAT TO LOG • WHEN TO LOG • YAML

  23. AUDIT POLICY

      apiVersion: audit.k8s.io/v1
      kind: Policy
      rules:
      - level: Metadata
        omitStages:
        - RequestReceived
        resources:
        - group: ""
          resources:
          - secrets

  30. WHEN TO LOG

      1. RequestReceived - Audit handler receives request
      2. ResponseStarted - For long running requests
      3. ResponseComplete - Response body completed
      4. Panic - Event generated when panic occurs

  39. [Diagram, built up over slides 31-39: a Request enters the Kube APIserver and a Response leaves it; the audit stages marked along the way are RequestReceived, ResponseStarted, ResponseComplete and Panic]

  40. WHAT TO LOG • GROUP/VERSION • RESOURCE • VERBS

      apiVersion: audit.k8s.io/v1
      kind: Policy
      rules:
      - level: Metadata
        omitStages:
        - RequestReceived
        resources:
        - group: ""
          resources:
          - secrets

  45. LEVELS

      1. None - don’t log these requests
      2. Metadata - only request metadata
      3. Request - request metadata + request body
      4. RequestResponse - request metadata + request body + response body

  46. RECOMMENDATIONS FOR WRITING POLICIES

  47. Only log at Metadata level for sensitive resources

      - level: Metadata
        resources:
        - group: ""
          resources:
          - secrets
          - configmaps
        - group: authentication.k8s.io
          resources:
          - tokenreviews

  48. Don’t log read-only URLs

      - level: None
        nonResourceURLs:
        - '/healthz*'
        - /version
        - '/swagger*'

  49. Log at RequestResponse level for critical resources • Log at least at Metadata level for all resources

  52. Evaluated in top-down order • Status calls can be large and high-volume

      rules:
      - level: RequestResponse
        resources:
        - group: ""
          resources: ["pods"]
      - level: Metadata
        resources:
        - group: ""
          resources: ["pods/log", "pods/status"]

  53. More examples at https://github.com/kubernetes/kubernetes/blob/master/cluster/gce/gci/configure-helper.sh
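
To make the "evaluated in top-down order" point concrete, here is a rough sketch of first-match rule selection. It is a simplified stand-in written for illustration, not the API server's actual matching code: the rules are plain Python dicts mirroring the slides' YAML, `level_for` and `url_matches` are hypothetical helpers, and a trailing `*` is treated as a simple prefix match.

```python
from typing import Optional

# Rules mirror the examples on the slides, expressed as Python data.
RULES = [
    {"level": "RequestResponse",
     "resources": [{"group": "", "resources": ["pods"]}]},
    {"level": "Metadata",
     "resources": [{"group": "", "resources": ["pods/log", "pods/status"]}]},
    {"level": "None",
     "nonResourceURLs": ["/healthz*", "/version", "/swagger*"]},
]


def url_matches(pattern: str, path: str) -> bool:
    # Assumption: a trailing "*" means prefix match; anything else must match exactly.
    if pattern.endswith("*"):
        return path.startswith(pattern[:-1])
    return pattern == path


def level_for(rules, group=None, resource=None, path=None) -> Optional[str]:
    """Return the level of the FIRST matching rule.

    The string "None" is an audit level; Python's None means no rule matched.
    """
    for rule in rules:
        if path is not None:
            if any(url_matches(p, path) for p in rule.get("nonResourceURLs", [])):
                return rule["level"]
            continue
        for entry in rule.get("resources", []):
            if entry["group"] == group and resource in entry["resources"]:
                return rule["level"]
    return None


print(level_for(RULES, group="", resource="pods"))
print(level_for(RULES, group="", resource="pods/status"))
print(level_for(RULES, path="/healthz/etcd"))
```

Because the first match wins, placing a broad rule above a narrow one would shadow the narrow rule, which is why the more specific entries usually go first.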

  54. WHERE DO THESE LOGS GO

  58. BACKEND

      LOG
      • Writes events to disk
      • --audit-log-path

      WEBHOOK
      • Sends events to external API
      • --audit-webhook-config-file

      --audit-policy-file

  59. HOW ARE THESE LOGS SENT TO THE BACKEND

  61. BATCHING

      • BATCH - Buffers events & processes in batches
      • BLOCKING - Blocks APIserver responses to process individual events
      • BLOCKING-STRICT - Failure at RequestReceived stage leads to failure of whole call

  62. BATCHING • BATCH • BLOCKING • BLOCKING-STRICT • --audit-webhook-mode • --audit-log-mode
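
The difference between buffering events and delivering each one before continuing can be sketched with two toy classes. This is only an illustration under simplifying assumptions: `BatchBackend`, `BlockingBackend` and `max_batch_size` are hypothetical names, everything runs synchronously in one thread, and the real API server's batching behaviour is more involved.

```python
class BatchBackend:
    """Toy 'batch' mode: buffer events and hand them to the sink in groups."""

    def __init__(self, sink, max_batch_size=3):
        self.sink = sink                      # callable that receives a list of events
        self.max_batch_size = max_batch_size  # hypothetical tuning knob
        self.buffer = []

    def process(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.max_batch_size:
            self.flush()

    def flush(self):
        # Deliver whatever is buffered, then start a fresh batch.
        if self.buffer:
            self.sink(list(self.buffer))
            self.buffer.clear()


class BlockingBackend:
    """Toy 'blocking' mode: each event is delivered before the caller continues."""

    def __init__(self, sink):
        self.sink = sink

    def process(self, event):
        self.sink([event])


batched, blocking = [], []
batch_backend = BatchBackend(batched.append, max_batch_size=3)
blocking_backend = BlockingBackend(blocking.append)

for i in range(7):
    event = {"auditID": str(i)}
    batch_backend.process(event)
    blocking_backend.process(event)
batch_backend.flush()  # deliver the partially filled last batch

print([len(b) for b in batched])
print([len(b) for b in blocking])
```

Seven events reach the batching sink in three deliveries, while the blocking sink is called once per event; that per-event cost is the trade-off the mode flags let an operator choose.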

  65. UPDATING AUDIT POLICY • RESTART OF APISERVER

  68. UPDATING AUDIT POLICY • UPDATING A K8S RESOURCE

  71. DYNAMIC AUDIT CONFIGURATION

      apiVersion: auditregistration.k8s.io/v1alpha1
      kind: AuditSink
      metadata:
        name: mysink
      spec:
        policy:
          level: Metadata
          stages:
          - ResponseComplete
        webhook:
          throttle:
            qps: 10
            burst: 15
          clientConfig:
            url: "https://audit.app"

  74. SECURITY
      • Write access to feature = Read access to all cluster data
      • cluster-admin level privilege

      PERFORMANCE
      • Increase in CPU/Memory Usage

  75. KEP • #sig-auth slack channel on k8s slack

  77. LOG COLLECTOR PATTERNS

  78. LOG COLLECTOR PATTERNS • Audit Log File + Fluentd

  79. LOG COLLECTOR PATTERNS • Audit Webhook File + Logstash

  80. LOG COLLECTOR PATTERNS • Audit Webhook File + Falco
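
Whatever collector sits behind the webhook, it has to unpack the payload it receives. The sketch below assumes that payload is a JSON document of kind `EventList` with the events under `items`; the `handle_event_list` function and the one-line output format are hypothetical and only show the shape of the work, not how Fluentd, Logstash or Falco are configured.

```python
import json


def handle_event_list(body: bytes) -> list:
    """Turn a webhook payload (assumed to be an EventList) into short log lines."""
    payload = json.loads(body)
    if payload.get("kind") != "EventList":
        raise ValueError("expected kind EventList, got {!r}".format(payload.get("kind")))
    lines = []
    for event in payload.get("items", []):
        ref = event.get("objectRef", {})
        lines.append("{user} {verb} {resource}/{name} -> {code}".format(
            user=event.get("user", {}).get("username", "?"),
            verb=event.get("verb", "?"),
            resource=ref.get("resource", "?"),
            name=ref.get("name", "?"),
            code=event.get("responseStatus", {}).get("code", "?"),
        ))
    return lines


# Hand-built sample payload reusing the event fields shown earlier in the talk.
sample = json.dumps({
    "kind": "EventList",
    "apiVersion": "audit.k8s.io/v1",
    "items": [{
        "verb": "create",
        "user": {"username": "minikube-user"},
        "objectRef": {"resource": "secrets", "namespace": "default", "name": "mysecret"},
        "responseStatus": {"code": 201},
    }],
}).encode()

lines = handle_event_list(sample)
print(lines)
```

In a real deployment this function would sit behind an HTTPS endpoint referenced by the webhook config file, and the lines would be forwarded to whichever store the collector uses.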

  81. HOW ARE AUDIT LOGS HELPFUL

  82. UNDERSTANDING K8S INTERNALS • Analysing system calls shows how different components interact

  83. DETECTING MISCONFIGURATIONS • “Who deleted this resource?”

  84. TROUBLESHOOTING ISSUES • Analysing calls which trigger HTTP errors

  85. PERFORMANCE ISSUES • “Which app is generating lots of calls?”
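
Three of the questions above translate into short queries over an audit log. This is a minimal sketch, assuming the log holds one JSON event per line; the sample events, user names and user agents are made up for illustration, and `who_deleted`, `http_errors` and `calls_per_client` are hypothetical helpers.

```python
import json
from collections import Counter

# Made-up audit log, one JSON event per line.
AUDIT_LOG = "\n".join(json.dumps(e) for e in [
    {"verb": "delete", "user": {"username": "alice"}, "userAgent": "kubectl/v1.22",
     "objectRef": {"resource": "secrets", "namespace": "default", "name": "mysecret"},
     "responseStatus": {"code": 200}},
    {"verb": "list", "user": {"username": "app-sa"}, "userAgent": "my-app/1.0",
     "objectRef": {"resource": "pods", "namespace": "default"},
     "responseStatus": {"code": 200}},
    {"verb": "list", "user": {"username": "app-sa"}, "userAgent": "my-app/1.0",
     "objectRef": {"resource": "secrets", "namespace": "default"},
     "responseStatus": {"code": 403}},
    {"verb": "watch", "user": {"username": "app-sa"}, "userAgent": "my-app/1.0",
     "objectRef": {"resource": "pods", "namespace": "default"},
     "responseStatus": {"code": 200}},
    {"verb": "get", "user": {"username": "bob"}, "userAgent": "kubectl/v1.22",
     "objectRef": {"resource": "configmaps", "namespace": "default", "name": "settings"},
     "responseStatus": {"code": 404}},
])

events = [json.loads(line) for line in AUDIT_LOG.splitlines()]


def who_deleted(events, resource, name):
    """Detecting misconfigurations: who deleted this resource?"""
    return [e["user"]["username"] for e in events
            if e["verb"] == "delete"
            and e["objectRef"].get("resource") == resource
            and e["objectRef"].get("name") == name]


def http_errors(events):
    """Troubleshooting: calls whose response code signals an HTTP error."""
    return [e for e in events if e["responseStatus"]["code"] >= 400]


def calls_per_client(events):
    """Performance: which client is generating the most calls?"""
    return Counter(e.get("userAgent", "unknown") for e in events)


print(who_deleted(events, "secrets", "mysecret"))
print([(e["verb"], e["responseStatus"]["code"]) for e in http_errors(events)])
print(calls_per_client(events).most_common(1))
```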

  86. CONCLUSION
      • Audit logs can give us a lot of information about what goes on in our cluster
      • To control what should be logged, we write audit policies
      • Recommendations for writing audit policies
      • Different audit backends
      • Batching methods
      • Dynamic Audit Configuration
      • Log Collector Patterns
  87. THANK YOU