If You Keep Datadog on Kubernetes, You Lose Out by Not Using Autodiscovery

Atsushi Tanaka

August 07, 2024

Transcript

  1. © 2024 Wantedly, Inc. $ whoami

    @bgpat / Atsushi Tanaka
    Wantedly, Inc., Infrastructure Engineer
    Kubernetes / Terraform
    SRE / Platform Engineering
    About 6 to 7 years of Datadog experience
  2. Where the configuration can be written

    Write it in annotations:
    • Pod: ad.datadoghq.com/<CONTAINER_IDENTIFIER>.checks
    • Service: ad.datadoghq.com/service.checks
    • Endpoint: ad.datadoghq.com/endpoints.checks

    (Features close to Autodiscovery?)
    • Tag labels:
      ◦ tags.datadoghq.com/env
      ◦ tags.datadoghq.com/version
      ◦ tags.datadoghq.com/service
    • ConfigMap: applied to images that match ad_identifiers
  3. Available template variables

    https://docs.datadoghq.com/containers/guide/template_variables/
    • %%host%%, %%host_<NETWORK_NAME>%%
    • %%port%%, %%port_<NUMBER_X>%%, %%port_<NAME>%%
    • %%pid%%, %%hostname%%
    • %%env_<ENV_VAR>%%
    • %%kube_namespace%%, %%kube_pod_name%%, %%kube_pod_uid%%

    https://docs.datadoghq.com/ja/agent/configuration/secrets-management/
    • ENC[file@/path/to/file]
    • ENC[k8s_secret@some_namespace/some_name/a_key]
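To make the placeholder syntax concrete, here is a minimal Python sketch of `%%name%%` substitution. It is a toy illustration only, not the Agent's actual resolver: the function name `resolve_template` and the sample values are hypothetical, and it does not model how the Agent chooses a host or port.

```python
import re

# Matches %%name%% placeholders such as %%host%% or %%env_REDIS_PASSWORD%%.
TEMPLATE_VAR = re.compile(r"%%([A-Za-z0-9_]+)%%")


def resolve_template(text: str, values: dict) -> str:
    """Replace every %%name%% in text with values[name].

    Raises KeyError for a placeholder that has no value, so a typo in a
    variable name fails loudly instead of passing through unresolved.
    """
    def substitute(match):
        return values[match.group(1)]

    return TEMPLATE_VAR.sub(substitute, text)


# Hypothetical values for one discovered container (assumptions for this demo).
container_values = {
    "host": "10.3.96.12",
    "port": "6379",
    "env_REDIS_PASSWORD": "s3cret",
}

resolved = resolve_template(
    '{"host": "%%host%%", "port": "%%port%%"}', container_values
)
print(resolved)  # {"host": "10.3.96.12", "port": "6379"}
```

Failing on unknown names is a deliberate choice in this sketch; a misspelled variable is easier to spot as an error than as a literal `%%...%%` string in a check config.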
  4. Usage example

    Written in the Pod's annotations:

        apiVersion: v1
        kind: Pod
        metadata:
          annotations:
            ad.datadoghq.com/redis.checks: |
              {
                "redisdb": {
                  "init_config": {},
                  "instances": [
                    {
                      "host": "%%host%%",
                      "port": "%%port%%",
                      "password": "%%env_REDIS_PASSWORD%%"
                    }
                  ]
                }
              }
            ad.datadoghq.com/redis.logs: '[{"source":"redis"}]'
        spec:
          containers:
            - name: redis
              ︙
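Hand-written JSON inside a YAML annotation is easy to break with a missing comma or quote. One possible way to avoid that is to generate the annotation values from plain data structures. The sketch below assumes the same Redis check as the slide; the helper name `redis_check_annotations` is hypothetical.

```python
import json


def redis_check_annotations(container_name: str) -> dict:
    """Build the two Autodiscovery annotations from the slide for one container.

    container_name is the <CONTAINER_IDENTIFIER> part of the key; on the
    slide it is the same string as the container's name ("redis").
    """
    checks = {
        "redisdb": {
            "init_config": {},
            "instances": [
                {
                    "host": "%%host%%",
                    "port": "%%port%%",
                    "password": "%%env_REDIS_PASSWORD%%",
                }
            ],
        }
    }
    logs = [{"source": "redis"}]
    prefix = f"ad.datadoghq.com/{container_name}"
    # json.dumps guarantees each annotation value is valid JSON.
    return {
        f"{prefix}.checks": json.dumps(checks),
        f"{prefix}.logs": json.dumps(logs),
    }


annotations = redis_check_annotations("redis")
for key, value in annotations.items():
    print(key, value)
```

The resulting dictionary can be placed under `metadata.annotations` by whatever tool renders the manifest.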
  5. Usage example

    Written in the Service's annotations:

        apiVersion: v1
        kind: Service
        metadata:
          annotations:
            ad.datadoghq.com/service.checks: |
              {
                "redisdb": {
                  "init_config": {},
                  "instances": [
                    {
                      "host": "%%host%%",
                      "port": "%%port%%",
                      "password": "ENC[k8s_secret@my-redis/redis-secret/password]"
                    }
                  ]
                }
              }
        ︙
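The `ENC[k8s_secret@...]` handle has three slash-separated parts (namespace, Secret name, key), so it is easy to get the order wrong. A small sketch of a formatter follows; the helper names `k8s_secret_handle` and `redis_service_annotation` are hypothetical, and the validation is only a guard for this demo, not a rule taken from Datadog's documentation.

```python
import json


def k8s_secret_handle(namespace: str, name: str, key: str) -> str:
    """Format a handle in the ENC[k8s_secret@namespace/name/key] form."""
    for part in (namespace, name, key):
        # The handle is slash-delimited, so an empty part or one that
        # contains "/" would be ambiguous. (Assumption made for this sketch.)
        if not part or "/" in part:
            raise ValueError(f"invalid handle component: {part!r}")
    return f"ENC[k8s_secret@{namespace}/{name}/{key}]"


def redis_service_annotation(password_handle: str) -> dict:
    """Build the Service-level check annotation from the slide."""
    checks = {
        "redisdb": {
            "init_config": {},
            "instances": [
                {
                    "host": "%%host%%",
                    "port": "%%port%%",
                    "password": password_handle,
                }
            ],
        }
    }
    return {"ad.datadoghq.com/service.checks": json.dumps(checks)}


handle = k8s_secret_handle("my-redis", "redis-secret", "password")
service_annotations = redis_service_annotation(handle)
print(handle)  # ENC[k8s_secret@my-redis/redis-secret/password]
```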
  6. How to verify operation (debugging)

    1. (When using the Cluster Agent) Find the Agent that is running the check
       i. Find the Cluster Agent leader
       ii. Run agent clusterchecks and find the target check
       iii. Hunt down the target Agent's pod the hard way
    2. Check the Agent's status
       i. Run agent status and find the target check
  7. How to verify operation (debugging)

    (When using the Cluster Agent) Find the Agent that is running the check

        # Find the Cluster Agent leader
        $ kubectl -n default get cm datadog-agent-leader-election -o json | \
            jq '.metadata.annotations["control-plane.alpha.kubernetes.io/leader"] | fromjson.holderIdentity'
        "datadog-agent-cluster-agent-7474855779-z2zjf"

        # Run agent clusterchecks and find the target check
        $ kubectl -n default exec datadog-agent-cluster-agent-7474855779-z2zjf -- agent clusterchecks
        ︙
        ===== Checks on i-0123456789abcdef0 =====

        === postgres check ===
        Configuration provider: kubernetes-services
        Configuration source: kube_services:kube_service://default/aurora-postgres
        Config for instance ID: postgres:65af62e418817e1e
        ︙

        # Hunt down the target Agent's pod (in Wantedly's environment it could be identified from the providerID)
        $ kubectl get no -o json | \
            jq '.items[] | select(.spec.providerID | endswith("i-0123456789abcdef0")) | .metadata.name'
        "ip-10-3-96-189.ap-northeast-1.compute.internal"
        $ kube sandbox -n default get po \
            --field-selector spec.nodeName=ip-10-3-96-189.ap-northeast-1.compute.internal -l app=datadog-agent
        NAME                  READY   STATUS    RESTARTS   AGE
        datadog-agent-dncs6   3/3     Running   0          61m
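For anyone who wants to script these lookups, the two jq filters can be mirrored in Python. This is a sketch under assumptions: it works on already-parsed `kubectl ... -o json` documents rather than calling kubectl, the function names are hypothetical, and the sample `providerID` value (including the availability zone) is made up for the demo.

```python
import json

LEADER_ANNOTATION = "control-plane.alpha.kubernetes.io/leader"


def leader_identity(configmap: dict) -> str:
    """Return holderIdentity from the leader-election ConfigMap.

    The annotation value is itself a JSON string, which is why the jq
    version needs fromjson.
    """
    raw = configmap["metadata"]["annotations"][LEADER_ANNOTATION]
    return json.loads(raw)["holderIdentity"]


def node_for_instance(node_list: dict, instance_id: str) -> str:
    """Return the name of the first node whose providerID ends with instance_id."""
    for node in node_list["items"]:
        provider_id = node["spec"].get("providerID", "")
        if provider_id.endswith(instance_id):
            return node["metadata"]["name"]
    raise LookupError(f"no node found for {instance_id}")


# Sample documents trimmed to the fields used (hypothetical data).
sample_configmap = {
    "metadata": {
        "annotations": {
            LEADER_ANNOTATION: json.dumps(
                {"holderIdentity": "datadog-agent-cluster-agent-7474855779-z2zjf"}
            )
        }
    }
}
sample_nodes = {
    "items": [
        {
            "metadata": {"name": "ip-10-3-96-189.ap-northeast-1.compute.internal"},
            "spec": {"providerID": "aws:///ap-northeast-1a/i-0123456789abcdef0"},
        }
    ]
}

leader = leader_identity(sample_configmap)
node = node_for_instance(sample_nodes, "i-0123456789abcdef0")
print(leader)
print(node)
```

As on the slide, matching on the `providerID` suffix only works where that field ends with the instance ID; other environments may need a different mapping.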
  8. How to verify operation (debugging)

    Check the Agent's status

        # Run agent status and find the target check
        $ kubectl -n default exec datadog-agent-dncs6 -- agent status
        Defaulted container "agent" out of: agent, trace-agent, process-agent, init-volume (init), init-config (init)
        disable most components. It's recommended to use autoconfig_exclude_features and autoconfig_include_features to activate/deactivate features selectively
        Getting the status from the agent.

        ===============
        Agent (v7.54.0)
        ===============
        ︙
            postgres (18.2.2)
            -----------------
              Instance ID: postgres:65af62e418817e1e [OK]
              Configuration Source: kube_services:kube_service://default/aurora-postgres
              Total Runs: 33
              Metric Samples: Last Run: 10,503, Total: 346,599
              Events: Last Run: 0, Total: 0
              Database Monitoring Metadata Samples: Last Run: 1, Total: 2
              Service Checks: Last Run: 1, Total: 33
              Average Execution Time : 498ms
        ︙
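The status output is long, so picking out the instance lines by hand gets tedious. As a rough sketch, the `Instance ID: ... [STATE]` lines can be extracted with a regular expression. The pattern is inferred only from the output shown on the slide, so treat it as an assumption that may not hold for other Agent versions; `check_states` is a hypothetical helper.

```python
import re

# Pattern inferred from the slide output: "Instance ID: <id> [<STATE>]".
INSTANCE_LINE = re.compile(r"Instance ID:\s*(\S+)\s*\[(\w+)\]")


def check_states(status_text: str) -> dict:
    """Map each check instance ID found in the status text to its state."""
    return {m.group(1): m.group(2) for m in INSTANCE_LINE.finditer(status_text)}


# Excerpt copied from the slide output.
sample_status = """\
    postgres (18.2.2)
    -----------------
      Instance ID: postgres:65af62e418817e1e [OK]
      Configuration Source: kube_services:kube_service://default/aurora-postgres
      Total Runs: 33
"""

states = check_states(sample_status)
print(states)  # {'postgres:65af62e418817e1e': 'OK'}
```

The instance ID is the same one printed by `agent clusterchecks`, so it can serve as the key when confirming that the dispatched check is actually running.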
  9. Summary

    • Kubernetes and Datadog work well together
    • Monitoring targets can be either containers or Services
    • Handy template features are available too
    • Debugging is awkward (I would love to hear about a better way)