Slide 1

Slide 1 text

Auditing in Kubernetes 101
Nikhita Raghunath
Staff Engineer, VMware

Slide 2

Slide 2 text

WHO AM I
● Staff Engineer at VMware
● Member of the Kubernetes Steering Committee
● Technical Lead for SIG Contributor Experience
● CNCF Ambassador
GitHub - nikhita
Twitter - TheNikhita

Slide 3

Slide 3 text

SECRET CONTAINING PASSWORD IN YOUR CLUSTER

Slide 4

Slide 4 text

SECRET CONTAINING PASSWORD IN YOUR CLUSTER
SECRET GOT UPDATED TO MYSTERIOUS VALUE

Slide 5

Slide 5 text

SECRET CONTAINING PASSWORD IN YOUR CLUSTER
SECRET GOT UPDATED TO MYSTERIOUS VALUE
LOGS

Slide 6

Slide 6 text

Logs from the Pod

Slide 8

Slide 8 text

Logs from the Pod
Events

Slide 10

Slide 10 text

Logs from the Pod
Events
Apiserver Logs

Slide 12

Slide 12 text

AUDIT LOGS!

Slide 13

Slide 13 text

AUDIT EVENTS
{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "auditID": "7684b057-7e2d-4188-a6ae-8fc51afd0c9d",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/default/secrets",
  "verb": "create",
  "user": {
    "username": "minikube-user",
    "groups": [
      "system:masters",
      "system:authenticated"
    ]
  },
  "sourceIPs": [
    "X.Y.Z.1"
  ],
  "objectRef": {
    "resource": "secrets",
    "namespace": "default",
    "name": "mysecret",
    "apiVersion": "v1"
  },
  "responseStatus": {
    "metadata": {},
    "code": 201
  },
  "requestReceivedTimestamp": "2020-02-12T18:06:04.577792Z",
  "stageTimestamp": "2020-02-12T18:06:04.584173Z"
}

Slide 14

Slide 14 text

WHAT HAPPENED
"verb": "create",

Slide 15

Slide 15 text

ON WHAT DID IT HAPPEN
"objectRef": {
  "resource": "secrets",
  "namespace": "default",
  "name": "mysecret",
  "apiVersion": "v1"
},

Slide 16

Slide 16 text

WHEN DID IT HAPPEN
"requestReceivedTimestamp": "2020-02-12T18:06:04.577792Z",
"stageTimestamp": "2020-02-12T18:06:04.584173Z",
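The two timestamps also show how long the apiserver took to reach this stage. A minimal sketch, assuming the timestamp layout shown on the slide (UTC with microseconds); the helper name `stage_latency` is hypothetical:

```python
from datetime import datetime, timedelta

# Timestamp layout used in the audit event on the slide (UTC, microseconds).
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def stage_latency(event: dict) -> timedelta:
    """Time between receiving the request and emitting this stage's event."""
    received = datetime.strptime(event["requestReceivedTimestamp"], TS_FORMAT)
    stage = datetime.strptime(event["stageTimestamp"], TS_FORMAT)
    return stage - received


# Values copied from the slide.
event = {
    "requestReceivedTimestamp": "2020-02-12T18:06:04.577792Z",
    "stageTimestamp": "2020-02-12T18:06:04.584173Z",
}
latency = stage_latency(event)
print(latency.total_seconds())
```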

Slide 17

Slide 17 text

WHO DID IT
"user": {
  "username": "minikube-user",
  "groups": [
    "system:masters",
    "system:authenticated"
  ]
},

Slide 18

Slide 18 text

WHERE WAS IT INITIATED
"sourceIPs": [
  "1.2.3.4"
],
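The five questions on the preceding slides map onto fields of a single audit event. A minimal sketch, assuming the event is available as a JSON string; the helper name `summarize` and the output keys are hypothetical, and the sample values are copied from the slides:

```python
import json

# Sample audit event, trimmed to the fields discussed on the slides.
EVENT_JSON = """
{
  "verb": "create",
  "user": {"username": "minikube-user",
           "groups": ["system:masters", "system:authenticated"]},
  "sourceIPs": ["1.2.3.4"],
  "objectRef": {"resource": "secrets", "namespace": "default",
                "name": "mysecret", "apiVersion": "v1"},
  "requestReceivedTimestamp": "2020-02-12T18:06:04.577792Z"
}
"""


def summarize(event: dict) -> dict:
    """Answer what / on what / when / who / where for one audit event."""
    ref = event.get("objectRef", {})
    return {
        "what": event.get("verb"),
        "on_what": "{}/{}/{}".format(
            ref.get("namespace"), ref.get("resource"), ref.get("name")),
        "when": event.get("requestReceivedTimestamp"),
        "who": event.get("user", {}).get("username"),
        "where": event.get("sourceIPs", []),
    }


summary = summarize(json.loads(EVENT_JSON))
print(summary)
```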

Slide 19

Slide 19 text

THAT’S A LOT OF LOGS!

Slide 20

Slide 20 text

LET’S CONTROL THE VERBOSITY

Slide 21

Slide 21 text

LET’S CONTROL THE VERBOSITY
WHAT TO LOG
WHEN TO LOG

Slide 22

Slide 22 text

LET’S CONTROL THE VERBOSITY
WHAT TO LOG
WHEN TO LOG
YAML

Slide 23

Slide 23 text

AUDIT POLICY
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata
  omitStages:
  - RequestReceived
  resources:
  - group: ""
    resources:
    - secrets

Slide 27

Slide 27 text

WHEN TO LOG
1. RequestReceived - Audit handler receives request

Slide 28

Slide 28 text

WHEN TO LOG
1. RequestReceived - Audit handler receives request
2. ResponseStarted - Response headers sent, body not yet sent; only for long-running requests (e.g. watch)

Slide 29

Slide 29 text

WHEN TO LOG
1. RequestReceived - Audit handler receives request
2. ResponseStarted - Response headers sent, body not yet sent; only for long-running requests (e.g. watch)
3. ResponseComplete - Response body completed

Slide 30

Slide 30 text

WHEN TO LOG
1. RequestReceived - Audit handler receives request
2. ResponseStarted - Response headers sent, body not yet sent; only for long-running requests (e.g. watch)
3. ResponseComplete - Response body completed
4. Panic - Event generated when panic occurs
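To make the stage list concrete, here is a rough model of which stages would produce an audit event for one request, and how `omitStages` from the policy trims them. It is a simplified sketch, not the apiserver's logic: the function name `emitted_stages` is hypothetical, and a panic is assumed to happen before any response has started.

```python
# Stage names as listed on the slide.
REQUEST_RECEIVED = "RequestReceived"
RESPONSE_STARTED = "ResponseStarted"
RESPONSE_COMPLETE = "ResponseComplete"
PANIC = "Panic"


def emitted_stages(long_running, panicked, omit_stages):
    """Return the stages that would produce an audit event, in order."""
    stages = [REQUEST_RECEIVED]
    if panicked:
        # Simplification: the panic replaces the response stages.
        stages.append(PANIC)
    else:
        if long_running:
            stages.append(RESPONSE_STARTED)
        stages.append(RESPONSE_COMPLETE)
    # omitStages from the policy drops events for the listed stages.
    return [s for s in stages if s not in omit_stages]


# A short request under the policy shown earlier (omitStages: RequestReceived).
print(emitted_stages(long_running=False, panicked=False,
                     omit_stages=[REQUEST_RECEIVED]))
```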

Slide 31

Slide 31 text

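[Diagram, built up over slides 31-39: a Request enters the Kube APIserver and a Response leaves it; the audit stages RequestReceived, ResponseStarted, ResponseComplete and Panic are marked along the way]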

Slide 40

Slide 40 text

WHAT TO LOG
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata
  omitStages:
  - RequestReceived
  resources:
  - group: ""
    resources:
    - secrets
GROUP/VERSION
RESOURCE
VERBS

Slide 41

Slide 41 text

WHAT TO LOG
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata
  omitStages:
  - RequestReceived
  resources:
  - group: ""
    resources:
    - secrets

Slide 42

Slide 42 text

LEVELS
1. None - don’t log these requests

Slide 43

Slide 43 text

LEVELS
1. None - don’t log these requests
2. Metadata - only request metadata

Slide 44

Slide 44 text

LEVELS
1. None - don’t log these requests
2. Metadata - only request metadata
3. Request - metadata + request body

Slide 45

Slide 45 text

LEVELS
1. None - don’t log these requests
2. Metadata - only request metadata
3. Request - metadata + request body
4. RequestResponse - metadata + request body + response body
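One way to picture the levels is as a filter over a fully detailed event. A minimal sketch, assuming the request and response bodies live in the `requestObject` and `responseObject` fields of the event; the function name `apply_level` and the sample event are hypothetical:

```python
import copy

# Levels in increasing order of verbosity, as on the slide.
LEVELS = ["None", "Metadata", "Request", "RequestResponse"]


def apply_level(event, level):
    """Return the part of `event` kept at `level`, or None if nothing is logged."""
    if level not in LEVELS:
        raise ValueError("unknown level: " + level)
    if level == "None":
        return None
    kept = copy.deepcopy(event)  # leave the caller's event untouched
    kept["level"] = level
    if level == "Metadata":
        kept.pop("requestObject", None)   # no request body
    if level in ("Metadata", "Request"):
        kept.pop("responseObject", None)  # no response body
    return kept


# Illustrative event carrying both bodies.
full_event = {
    "verb": "create",
    "requestObject": {"kind": "ConfigMap"},
    "responseObject": {"kind": "ConfigMap"},
}
for level in LEVELS:
    print(level, sorted(apply_level(full_event, level) or {}))
```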

Slide 46

Slide 46 text

RECOMMENDATIONS FOR WRITING POLICIES

Slide 47

Slide 47 text

- level: Metadata
  resources:
  - group: ""
    resources:
    - secrets
    - configmaps
  - group: authentication.k8s.io
    resources:
    - tokenreviews
Only log at Metadata level for sensitive resources

Slide 48

Slide 48 text

- level: None
  nonResourceURLs:
  - '/healthz*'
  - /version
  - '/swagger*'
Don’t log read-only URLs
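The wildcard entries can be read as prefix matches. A rough sketch under that assumption (a trailing `*` matches any suffix, anything else must match exactly); the function names `url_matches` and `is_ignored` are hypothetical and this is not the apiserver's own matcher:

```python
def url_matches(pattern, path):
    """Match a request path against one nonResourceURLs entry.

    Assumption: a trailing '*' matches any suffix; otherwise the match is exact.
    """
    if pattern.endswith("*"):
        return path.startswith(pattern[:-1])
    return path == pattern


def is_ignored(path, patterns=("/healthz*", "/version", "/swagger*")):
    """True if the path falls under the 'level: None' rule from the slide."""
    return any(url_matches(p, path) for p in patterns)


for path in ("/healthz/etcd", "/version", "/api/v1/namespaces/default/secrets"):
    print(path, is_ignored(path))
```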

Slide 49

Slide 49 text

Log at RequestResponse level for critical resources
Log at least at Metadata level for all resources

Slide 50

Slide 50 text

rules:
- level: RequestResponse
  resources:
  - group: ""
    resources: ["pods"]
- level: Metadata
  resources:
  - group: ""
    resources: ["pods/log", "pods/status"]

Slide 51

Slide 51 text

rules:
- level: RequestResponse
  resources:
  - group: ""
    resources: ["pods"]
- level: Metadata
  resources:
  - group: ""
    resources: ["pods/log", "pods/status"]
Evaluated in top-down order
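Top-down evaluation means the first matching rule decides the level. A minimal sketch of that idea, assuming that `pods` does not match subresources such as `pods/status` and that a request matching no rule is not logged; the function name `level_for` and the flattened rule layout are hypothetical:

```python
# The two rules from the slide, flattened into plain Python data.
RULES = [
    {"level": "RequestResponse", "group": "", "resources": ["pods"]},
    {"level": "Metadata", "group": "", "resources": ["pods/log", "pods/status"]},
]


def level_for(group, resource, subresource="", rules=RULES):
    """Return the level of the first rule that matches, scanning top-down."""
    name = resource + "/" + subresource if subresource else resource
    for rule in rules:
        if rule["group"] == group and name in rule["resources"]:
            return rule["level"]
    return "None"  # assumed default when no rule matches


print(level_for("", "pods"))            # plain pod requests
print(level_for("", "pods", "status"))  # status subresource
```

Because `pods` is matched by name only, the status and log calls fall through to the second rule and are logged at the cheaper level.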

Slide 52

Slide 52 text

rules:
- level: RequestResponse
  resources:
  - group: ""
    resources: ["pods"]
- level: Metadata
  resources:
  - group: ""
    resources: ["pods/log", "pods/status"]
Status calls can be large and high-volume

Slide 53

Slide 53 text

More examples at https://github.com/kubernetes/kubernetes/blob/master/cluster/gce/gci/configure-helper.sh

Slide 54

Slide 54 text

WHERE DO THESE LOGS GO

Slide 55

Slide 55 text

BACKEND
LOG
WEBHOOK

Slide 56

Slide 56 text

BACKEND
LOG
● Writes events to disk
WEBHOOK
● Sends events to external API

Slide 57

Slide 57 text

BACKEND
LOG
● Writes events to disk
● --audit-log-path
WEBHOOK
● Sends events to external API
● --audit-webhook-config-file

Slide 58

Slide 58 text

BACKEND
LOG
● Writes events to disk
● --audit-log-path
WEBHOOK
● Sends events to external API
● --audit-webhook-config-file
--audit-policy-file

Slide 59

Slide 59 text

HOW ARE THESE LOGS SENT TO THE BACKEND

Slide 60

Slide 60 text

BATCHING
BATCH
BLOCKING
BLOCKING-STRICT

Slide 61

Slide 61 text

BATCHING
BATCH - Buffers events & processes in batches
BLOCKING - Blocks APIserver responses to process individual events
BLOCKING-STRICT - Failure at RequestReceived stage leads to failure of whole call
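The difference between batch and blocking can be illustrated with two toy backends. This is a sketch of the idea only: the class names, the `send` callback and `batch_size` are hypothetical, and a real batching backend would presumably also flush on a timer, which is left out here.

```python
class BatchBackend:
    """Buffers events and hands them to `send` in groups of `batch_size`."""

    def __init__(self, send, batch_size):
        self.send = send
        self.batch_size = batch_size
        self.buffer = []

    def process(self, event):
        # Returns immediately unless the buffer just filled up.
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.send(list(self.buffer))
            self.buffer.clear()


class BlockingBackend:
    """Sends every event immediately; the caller waits for `send` to return."""

    def __init__(self, send):
        self.send = send

    def process(self, event):
        self.send([event])


sent = []
batch = BatchBackend(sent.append, batch_size=3)
for i in range(7):
    batch.process({"auditID": str(i)})
batch.flush()  # push out the final partial batch
print([len(b) for b in sent])
```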

Slide 62

Slide 62 text

BATCHING
BATCH
BLOCKING
BLOCKING-STRICT
--audit-webhook-mode
--audit-log-mode

Slide 64

Slide 64 text

UPDATING AUDIT POLICY

Slide 65

Slide 65 text

UPDATING AUDIT POLICY
RESTART OF APISERVER

Slide 67

Slide 67 text

UPDATING AUDIT POLICY

Slide 68

Slide 68 text

UPDATING AUDIT POLICY
UPDATING A K8S RESOURCE

Slide 70

Slide 70 text

DYNAMIC AUDIT CONFIGURATION

Slide 71

Slide 71 text

DYNAMIC AUDIT CONFIGURATION
apiVersion: auditregistration.k8s.io/v1alpha1
kind: AuditSink
metadata:
  name: mysink
spec:
  policy:
    level: Metadata
    stages:
    - ResponseComplete
  webhook:
    throttle:
      qps: 10
      burst: 15
    clientConfig:
      url: "https://audit.app"

Slide 72

Slide 72 text

SECURITY
PERFORMANCE

Slide 73

Slide 73 text

SECURITY
● Write access to feature = Read access to all cluster data
PERFORMANCE

Slide 74

Slide 74 text

SECURITY
● Write access to feature = Read access to all cluster data
● cluster-admin level privilege
PERFORMANCE
● Increase in CPU/Memory Usage

Slide 75

Slide 75 text

KEP
#sig-auth Slack channel on the Kubernetes Slack

Slide 77

Slide 77 text

LOG COLLECTOR PATTERNS

Slide 78

Slide 78 text

LOG COLLECTOR PATTERNS
Audit Log File + Fluentd

Slide 79

Slide 79 text

LOG COLLECTOR PATTERNS
Audit Webhook File + Logstash

Slide 80

Slide 80 text

LOG COLLECTOR PATTERNS
Audit Webhook File + Falco

Slide 81

Slide 81 text

HOW ARE AUDIT LOGS HELPFUL

Slide 82

Slide 82 text

UNDERSTANDING K8S INTERNALS
Analysing system calls shows how different components interact

Slide 83

Slide 83 text

DETECTING MISCONFIGURATIONS
“Who deleted this resource?”
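With events already parsed into dictionaries, that question becomes a simple filter. A minimal sketch; the function name `who_deleted`, the usernames and the sample events are all hypothetical:

```python
def who_deleted(events, resource, namespace, name):
    """Return (username, timestamp) for completed delete calls on one object."""
    hits = []
    for ev in events:
        ref = ev.get("objectRef", {})
        if (ev.get("verb") == "delete"
                and ev.get("stage") == "ResponseComplete"
                and ref.get("resource") == resource
                and ref.get("namespace") == namespace
                and ref.get("name") == name):
            hits.append((ev["user"]["username"], ev["stageTimestamp"]))
    return hits


# Illustrative events, trimmed to the fields the filter reads.
EVENTS = [
    {"verb": "get", "stage": "ResponseComplete",
     "objectRef": {"resource": "secrets", "namespace": "default", "name": "mysecret"},
     "user": {"username": "bob"}, "stageTimestamp": "2020-02-12T18:09:00.000000Z"},
    {"verb": "delete", "stage": "ResponseComplete",
     "objectRef": {"resource": "secrets", "namespace": "default", "name": "mysecret"},
     "user": {"username": "alice"}, "stageTimestamp": "2020-02-12T18:10:00.000000Z"},
    {"verb": "delete", "stage": "ResponseComplete",
     "objectRef": {"resource": "secrets", "namespace": "default", "name": "other"},
     "user": {"username": "carol"}, "stageTimestamp": "2020-02-12T18:11:00.000000Z"},
]
culprits = who_deleted(EVENTS, "secrets", "default", "mysecret")
print(culprits)
```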

Slide 84

Slide 84 text

TROUBLESHOOTING ISSUES
Analysing calls which trigger HTTP errors
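Error-producing calls can be picked out through `responseStatus.code`. A minimal sketch, assuming parsed events; the function name `failed_calls`, the 400 threshold and the sample events are hypothetical:

```python
def failed_calls(events, min_code=400):
    """Return (code, verb, requestURI) for events whose HTTP status is an error."""
    failures = []
    for ev in events:
        code = ev.get("responseStatus", {}).get("code", 0)
        if code >= min_code:
            failures.append((code, ev.get("verb"), ev.get("requestURI")))
    return failures


# Illustrative events, trimmed to the fields the filter reads.
EVENTS = [
    {"verb": "create", "requestURI": "/api/v1/namespaces/default/secrets",
     "responseStatus": {"code": 201}},
    {"verb": "get", "requestURI": "/api/v1/namespaces/kube-system/secrets/x",
     "responseStatus": {"code": 403}},
    {"verb": "get", "requestURI": "/api/v1/namespaces/default/pods/gone",
     "responseStatus": {"code": 404}},
    {"verb": "update", "requestURI": "/api/v1/namespaces/default/configmaps/cfg",
     "responseStatus": {"code": 500}},
]
failures = failed_calls(EVENTS)
print(failures)
```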

Slide 85

Slide 85 text

PERFORMANCE ISSUES
“Which app is generating lots of calls?”
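Counting events per `user.username` is one rough way to answer this. A minimal sketch; the function name `calls_per_user`, the service account name and the sample events are hypothetical, and only `ResponseComplete` events are counted so that a request logged at several stages is not counted twice:

```python
from collections import Counter


def calls_per_user(events):
    """Count completed requests per username."""
    return Counter(
        ev.get("user", {}).get("username", "<unknown>")
        for ev in events
        if ev.get("stage") == "ResponseComplete"
    )


# Illustrative events: one noisy service account and one human user.
NOISY = "system:serviceaccount:default:noisy-app"
EVENTS = [
    {"stage": "ResponseComplete", "user": {"username": NOISY}},
    {"stage": "ResponseComplete", "user": {"username": "minikube-user"}},
    {"stage": "RequestReceived", "user": {"username": NOISY}},
    {"stage": "ResponseComplete", "user": {"username": NOISY}},
    {"stage": "ResponseComplete", "user": {"username": NOISY}},
]
counts = calls_per_user(EVENTS)
print(counts.most_common(1))
```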

Slide 86

Slide 86 text

CONCLUSION
● Audit logs can give us a lot of information about what goes on in our cluster
● To control what should be logged, we write audit policies
● Recommendations for writing audit policies
● Different audit backends
● Batching methods
● Dynamic Audit Configuration
● Log Collector Patterns

Slide 87

Slide 87 text

THANK YOU