Open Infra Days Asia - Auditing in Kubernetes 101

Nikhita Raghunath
September 11, 2021

A talk on the basics of auditing in Kubernetes, given at Open Infra Days Asia 2021.

Transcript

  1. Auditing in Kubernetes 101
    Nikhita Raghunath
    Staff Engineer, VMware

  2. WHO AM I
    ● Staff Engineer at VMware
    ● Member of the Kubernetes Steering Committee
    ● Technical Lead for SIG Contributor Experience
    ● CNCF Ambassador
    GitHub - nikhita
    Twitter - TheNikhita

  3. SECRET CONTAINING PASSWORD
    IN YOUR CLUSTER

  4. SECRET CONTAINING PASSWORD
    IN YOUR CLUSTER
    SECRET GOT UPDATED TO
    MYSTERIOUS VALUE

  5. SECRET CONTAINING PASSWORD
    IN YOUR CLUSTER
    SECRET GOT UPDATED TO
    MYSTERIOUS VALUE
    LOGS

  6. Logs from the Pod
    @TheNikhita

  8. Logs from the Pod
    Events

  10. Logs from the Pod
    Events
    Apiserver Logs

  12. AUDIT EVENTS
    {
      "kind": "Event",
      "apiVersion": "audit.k8s.io/v1",
      "level": "Metadata",
      "auditID": "7684b057-7e2d-4188-a6ae-8fc51afd0c9d",
      "stage": "ResponseComplete",
      "requestURI": "/api/v1/namespaces/default/secrets",
      "verb": "create",
      "user": {
        "username": "minikube-user",
        "groups": [
          "system:masters",
          "system:authenticated"
        ]
      },
      "sourceIPs": [
        "X.Y.Z.1"
      ],
      "objectRef": {
        "resource": "secrets",
        "namespace": "default",
        "name": "mysecret",
        "apiVersion": "v1"
      },
      "responseStatus": {
        "metadata": {},
        "code": 201
      },
      "requestReceivedTimestamp": "2020-02-12T18:06:04.577792Z",
      "stageTimestamp": "2020-02-12T18:06:04.584173Z"
    }

  13. WHAT HAPPENED
    "verb": "create",

  14. ON WHAT DID IT HAPPEN
    "objectRef": {
      "resource": "secrets",
      "namespace": "default",
      "name": "mysecret",
      "apiVersion": "v1"
    },

  15. WHEN DID IT HAPPEN
    "requestReceivedTimestamp": "2020-02-12T18:06:04.577792Z",
    "stageTimestamp": "2020-02-12T18:06:04.584173Z",

  16. WHO DID IT
    "user": {
      "username": "minikube-user",
      "groups": [
        "system:masters",
        "system:authenticated"
      ]
    },

  17. WHERE WAS IT INITIATED
    "sourceIPs": [
      "1.2.3.4"
    ],
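
The five questions walked through above (what, on what, when, who, where) map directly onto fields of the audit event. A minimal Python sketch, assuming the event is available as a JSON string; `summarize_event` is a hypothetical helper and not part of any Kubernetes client, and the sample is a trimmed copy of the event from the slides:

```python
import json


def summarize_event(event: dict) -> dict:
    """Answer what / on what / when / who / where for one audit event."""
    ref = event.get("objectRef", {})
    return {
        "what": event.get("verb"),
        "on_what": "/".join(
            str(ref.get(key, "")) for key in ("namespace", "resource", "name")
        ),
        "when": event.get("requestReceivedTimestamp"),
        "who": event.get("user", {}).get("username"),
        "where": event.get("sourceIPs", []),
    }


# Trimmed copy of the audit event shown on the slides.
RAW_EVENT = """
{
  "verb": "create",
  "user": {"username": "minikube-user"},
  "sourceIPs": ["1.2.3.4"],
  "objectRef": {"resource": "secrets", "namespace": "default", "name": "mysecret"},
  "requestReceivedTimestamp": "2020-02-12T18:06:04.577792Z"
}
"""

summary = summarize_event(json.loads(RAW_EVENT))
print(summary)
```

Using `.get` with defaults keeps the helper tolerant of events that lack an `objectRef`, such as requests to non-resource URLs.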

  18. THAT’S A LOT OF LOGS!

  19. LET’S CONTROL THE
    VERBOSITY

  20. LET’S CONTROL THE
    VERBOSITY
    WHAT TO LOG WHEN TO LOG

  21. LET’S CONTROL THE
    VERBOSITY
    WHAT TO LOG WHEN TO LOG
    YAML

  22. AUDIT POLICY
    apiVersion: audit.k8s.io/v1
    kind: Policy
    rules:
      - level: Metadata
        omitStages:
          - RequestReceived
        resources:
          - group: ""
            resources:
              - secrets

  26. WHEN TO LOG
    1. RequestReceived - Audit handler receives request

  27. WHEN TO LOG
    1. RequestReceived - Audit handler receives request
    2. ResponseStarted - For long-running requests

  28. WHEN TO LOG
    1. RequestReceived - Audit handler receives request
    2. ResponseStarted - For long-running requests
    3. ResponseComplete - Response body completed

  29. WHEN TO LOG
    1. RequestReceived - Audit handler receives request
    2. ResponseStarted - For long-running requests
    3. ResponseComplete - Response body completed
    4. Panic - Event generated when panic occurs
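
The four stages above can be read as a small state sequence. The sketch below is a simplified Python model, not the apiserver's implementation: it assumes `RequestReceived` is always emitted first, `ResponseStarted` only for long-running requests, and `Panic` in place of `ResponseComplete` when the handler panics. `stages_for_request` and its parameters are hypothetical names; `omit_stages` mirrors the `omitStages` field of the audit policy.

```python
def stages_for_request(long_running=False, panicked=False, omit_stages=()):
    """Return the audit stages that would be logged for one request (simplified)."""
    emitted = ["RequestReceived"]
    if long_running:
        # Only long-running requests (for example a watch) get this stage.
        emitted.append("ResponseStarted")
    # Assumption of this model: a panic replaces the normal completion stage.
    emitted.append("Panic" if panicked else "ResponseComplete")
    # Stages listed in omitStages are dropped from the log.
    return [stage for stage in emitted if stage not in omit_stages]


print(stages_for_request())
print(stages_for_request(long_running=True))
print(stages_for_request(omit_stages=("RequestReceived",)))
print(stages_for_request(panicked=True))
```

With `RequestReceived` omitted, as in the policy from the earlier slides, a normal short request produces a single event at `ResponseComplete`.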

  30. [Diagram: a Request enters the Kube APIserver and a Response leaves it, annotated with the audit stages RequestReceived, ResponseStarted, ResponseComplete and Panic]

  31. [Diagram build: Request -> Kube APIserver]

  32. [Diagram build: adds the RequestReceived stage]

  33. [Diagram build: adds the Response]

  34. [Diagram build: adds the Panic stage]

  35. [Diagram build: shows the ResponseStarted stage in place of Panic]

  36. [Diagram build: adds the ResponseComplete stage]

  37. [Diagram build: adds the Response returned from the Kube APIserver]

  38. [Diagram build: full picture with RequestReceived, ResponseStarted, ResponseComplete and Panic]

  39. WHAT TO LOG
    apiVersion: audit.k8s.io/v1
    kind: Policy
    rules:
      - level: Metadata
        omitStages:
          - RequestReceived
        resources:
          - group: ""
            resources:
              - secrets
    GROUP/VERSION, RESOURCE, VERBS

  40. WHAT TO LOG
    apiVersion: audit.k8s.io/v1
    kind: Policy
    rules:
      - level: Metadata
        omitStages:
          - RequestReceived
        resources:
          - group: ""
            resources:
              - secrets

  41. LEVELS
    1. None - don’t log these requests

  42. LEVELS
    1. None - don’t log these requests
    2. Metadata - only request metadata

  43. LEVELS
    1. None - don’t log these requests
    2. Metadata - only request metadata
    3. Request - request metadata + request body

  44. LEVELS
    1. None - don’t log these requests
    2. Metadata - only request metadata
    3. Request - request metadata + request body
    4. RequestResponse - request metadata + request body + response body
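
Each level is a superset of the one before it. The Python sketch below illustrates what each level keeps; it is not how the apiserver builds events. It assumes a fully populated event as input and uses the audit event field names `requestObject` and `responseObject`; `trim_to_level` is a hypothetical helper.

```python
LEVELS = ["None", "Metadata", "Request", "RequestResponse"]


def trim_to_level(event, level):
    """Return the parts of a full event that a given audit level would record."""
    if level == "None":
        return None  # nothing is logged at all
    rank = LEVELS.index(level)
    trimmed = dict(event)
    if rank < LEVELS.index("Request"):
        trimmed.pop("requestObject", None)   # request body needs Request or higher
    if rank < LEVELS.index("RequestResponse"):
        trimmed.pop("responseObject", None)  # response body needs RequestResponse
    trimmed["level"] = level
    return trimmed


full_event = {
    "verb": "create",
    "requestObject": {"kind": "ConfigMap"},
    "responseObject": {"kind": "ConfigMap"},
}
for level in LEVELS:
    print(level, trim_to_level(full_event, level))
```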

  45. RECOMMENDATIONS FOR
    WRITING POLICIES

  46. Only log at Metadata level for sensitive resources
    - level: Metadata
      resources:
        - group: ""
          resources:
            - secrets
            - configmaps
        - group: authentication.k8s.io
          resources:
            - tokenreviews

  47. Don’t log read-only URLs
    - level: None
      nonResourceURLs:
        - '/healthz*'
        - /version
        - '/swagger*'
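
To make the wildcard patterns in this rule concrete, here is a small Python sketch of trailing-`*` matching. It is an illustration under the assumption that a `*` is only meaningful as the last character of a pattern; `url_matches` and `is_skipped` are hypothetical helpers, not apiserver functions.

```python
SKIPPED_URLS = ["/healthz*", "/version", "/swagger*"]


def url_matches(pattern, path):
    """Match a request path against one nonResourceURLs pattern."""
    if pattern.endswith("*"):
        # Trailing wildcard: everything sharing the prefix matches.
        return path.startswith(pattern[:-1])
    return path == pattern


def is_skipped(path):
    """True if the path falls under the level: None rule above."""
    return any(url_matches(pattern, path) for pattern in SKIPPED_URLS)


for path in ("/healthz", "/healthz/etcd", "/version", "/api/v1/pods"):
    print(path, is_skipped(path))
```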

  48. Log at RequestResponse level for critical resources
    Log at least at Metadata level for all resources

  49. rules:
      - level: RequestResponse
        resources:
          - group: ""
            resources: ["pods"]
      - level: Metadata
        resources:
          - group: ""
            resources: ["pods/log", "pods/status"]

  50. Evaluated in top-down order
    rules:
      - level: RequestResponse
        resources:
          - group: ""
            resources: ["pods"]
      - level: Metadata
        resources:
          - group: ""
            resources: ["pods/log", "pods/status"]

  51. Status calls can be large and high-volume
    rules:
      - level: RequestResponse
        resources:
          - group: ""
            resources: ["pods"]
      - level: Metadata
        resources:
          - group: ""
            resources: ["pods/log", "pods/status"]
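
Top-down evaluation means the first rule that matches a request decides its level. The Python sketch below models only that ordering, with exact string matching on group and resource; real policies also match on verbs, users, namespaces and wildcards, which are left out here. `level_for` is a hypothetical helper, and falling back to `None` when nothing matches is an assumption of this sketch.

```python
RULES = [
    {"level": "RequestResponse",
     "resources": [{"group": "", "resources": ["pods"]}]},
    {"level": "Metadata",
     "resources": [{"group": "", "resources": ["pods/log", "pods/status"]}]},
]


def level_for(rules, group, resource):
    """Return the level of the first rule matching the group and resource."""
    for rule in rules:  # top-down: earlier rules win
        for entry in rule["resources"]:
            if entry["group"] == group and resource in entry["resources"]:
                return rule["level"]
    return "None"  # assumed default when no rule matches


print(level_for(RULES, "", "pods"))
print(level_for(RULES, "", "pods/status"))
print(level_for(RULES, "", "secrets"))
```

Because `pods` does not cover subresources, `pods/status` falls through to the second rule and is logged at `Metadata` only, which keeps large, frequent status updates out of the log bodies.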

  52. More examples at
    https://github.com/kubernetes/kubernetes/blob/master/cluster/gce/gci/configure-helper.sh

  53. WHERE DO THESE LOGS GO

  54. BACKEND
    LOG WEBHOOK

  55. BACKEND
    LOG
    ● Writes events to disk
    WEBHOOK
    ● Sends events to external API

  56. BACKEND
    LOG
    ● Writes events to disk
    ● --audit-log-path
    WEBHOOK
    ● Sends events to external API
    ● --audit-webhook-config-file

  57. BACKEND
    LOG
    ● Writes events to disk
    ● --audit-log-path
    WEBHOOK
    ● Sends events to external API
    ● --audit-webhook-config-file
    --audit-policy-file
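
The three flags above are passed to kube-apiserver. As a rough Python sketch of how they combine, the helper below assembles the flag list for a chosen backend; `audit_flags` is a hypothetical helper and the file paths are made-up examples, not defaults.

```python
def audit_flags(policy_file, log_path=None, webhook_config=None):
    """Build the kube-apiserver audit flags for the selected backends."""
    flags = ["--audit-policy-file=" + policy_file]
    if log_path:
        # Log backend: events are written to this file on disk.
        flags.append("--audit-log-path=" + log_path)
    if webhook_config:
        # Webhook backend: events are sent to the API described in this file.
        flags.append("--audit-webhook-config-file=" + webhook_config)
    return flags


# Hypothetical paths, chosen only for illustration.
log_only = audit_flags("/etc/kubernetes/audit-policy.yaml",
                       log_path="/var/log/kubernetes/audit.log")
print(" ".join(log_only))
```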

  58. HOW ARE THESE LOGS
    SENT TO THE BACKEND

  59. BATCHING
    BATCH BLOCKING BLOCKING-STRICT

  60. BATCHING
    BATCH - Buffers events & processes in batches
    BLOCKING - Blocks APIserver responses to process individual events
    BLOCKING-STRICT - Failure at RequestReceived stage leads to failure of whole call
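
The difference between the batch and blocking modes can be sketched in a few lines of Python. This is a toy model under stated assumptions: the real batch mode also flushes on a timer and runs asynchronously, and blocking-strict is not modelled. `BatchBackend`, `BlockingBackend`, `sink` and `max_batch_size` are hypothetical names.

```python
class BatchBackend:
    """Buffers events and hands them to the sink in groups."""

    def __init__(self, sink, max_batch_size=3):
        self.sink = sink
        self.max_batch_size = max_batch_size
        self.buffer = []

    def process(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.max_batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.sink(list(self.buffer))
            self.buffer.clear()


class BlockingBackend:
    """Hands every event to the sink before returning to the caller."""

    def __init__(self, sink):
        self.sink = sink

    def process(self, event):
        self.sink([event])


sent = []
batch = BatchBackend(sent.append, max_batch_size=3)
for i in range(4):
    batch.process({"auditID": i})
batch.flush()  # push out the remainder

direct = []
BlockingBackend(direct.append).process({"auditID": 99})
print(sent, direct)
```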

  61. BATCHING
    BATCH BLOCKING BLOCKING-STRICT
    --audit-webhook-mode
    --audit-log-mode

  63. UPDATING AUDIT POLICY

  64. UPDATING AUDIT POLICY
    RESTART OF APISERVER

  66. UPDATING AUDIT POLICY

  67. UPDATING AUDIT POLICY
    UPDATING A K8S RESOURCE

  69. DYNAMIC AUDIT CONFIGURATION

  70. DYNAMIC AUDIT CONFIGURATION
    apiVersion: auditregistration.k8s.io/v1alpha1
    kind: AuditSink
    metadata:
      name: mysink
    spec:
      policy:
        level: Metadata
        stages:
          - ResponseComplete
      webhook:
        throttle:
          qps: 10
          burst: 15
        clientConfig:
          url: "https://audit.app"

  71. SECURITY
    PERFORMANCE

  72. SECURITY
    Write access to feature = Read access to all cluster data
    PERFORMANCE

  73. SECURITY
    Write access to feature = Read access to all cluster data
    cluster-admin level privilege
    PERFORMANCE
    Increase in CPU/Memory Usage

  74. KEP
    #sig-auth slack channel on k8s slack

  75. LOG COLLECTOR PATTERNS

  76. LOG COLLECTOR PATTERNS
    Audit Log File + Fluentd

  77. LOG COLLECTOR PATTERNS
    Audit Webhook File + Logstash

  78. LOG COLLECTOR PATTERNS
    Audit Webhook File + Falco

  79. HOW ARE AUDIT LOGS
    HELPFUL

  80. UNDERSTANDING K8S INTERNALS
    Analysing system calls shows how different components interact

  81. DETECTING MISCONFIGURATIONS
    “Who deleted this resource?”
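
A question like this can be answered straight from the audit log. The Python sketch below assumes the log backend wrote one JSON event per line; `who_deleted` is a hypothetical helper and the sample lines are invented for illustration.

```python
import json


def who_deleted(lines, resource, name):
    """Return (username, timestamp) for every delete of the named object."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        ref = event.get("objectRef", {})
        if (event.get("verb") == "delete"
                and ref.get("resource") == resource
                and ref.get("name") == name):
            hits.append((event.get("user", {}).get("username"),
                         event.get("requestReceivedTimestamp")))
    return hits


# Invented sample log lines.
SAMPLE_LOG = [
    '{"verb": "get", "user": {"username": "alice"}, '
    '"objectRef": {"resource": "secrets", "name": "mysecret"}}',
    '{"verb": "delete", "user": {"username": "bob"}, '
    '"objectRef": {"resource": "secrets", "name": "mysecret"}, '
    '"requestReceivedTimestamp": "2020-02-12T18:10:00.000000Z"}',
]
print(who_deleted(SAMPLE_LOG, "secrets", "mysecret"))
```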

  82. TROUBLESHOOTING ISSUES
    Analysing calls which trigger HTTP errors

  83. PERFORMANCE ISSUES
    “Which app is generating lots of calls?”
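
Counting events per caller is one way to find a noisy client. A minimal Python sketch, again assuming one JSON audit event per line and grouping by `user.username`; `top_callers` is a hypothetical helper and the usernames are invented.

```python
import json
from collections import Counter


def top_callers(lines, n=3):
    """Return the n usernames with the most audit events, busiest first."""
    counts = Counter()
    for line in lines:
        if line.strip():
            event = json.loads(line)
            counts[event.get("user", {}).get("username", "unknown")] += 1
    return counts.most_common(n)


# Invented sample: one service account issues most of the calls.
NOISY = "system:serviceaccount:default:noisy-app"
SAMPLE_LOG = (
    [json.dumps({"verb": "list", "user": {"username": NOISY}})] * 3
    + [json.dumps({"verb": "get", "user": {"username": "alice"}})]
)
print(top_callers(SAMPLE_LOG, n=1))
```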

  84. CONCLUSION
    ● Audit logs can give us a lot of information about what goes on in our cluster
    ● To control what should be logged, we write audit policies
    ● Recommendations for writing audit policies
    ● Different audit backends
    ● Batching methods
    ● Dynamic Audit Configuration
    ● Log Collector Patterns
