Slide 1

Slide 1 text

Scaling osquery
Using pure Open Source technology
@javutin

Slide 2

Slide 2 text

$ whoami
Javier Marcos de Prado
Staff Security Engineer @ ABS Global Trading
➔ Former:
➔ Current:
@javutin
javuto
Page 2 @osqueryatscale @javutin

Slide 3

Slide 3 text

Agenda
▪ Quick overview of osquery
▪ The osquery remote API
▪ Building and scaling your TLS endpoint
▪ osctrl as example of a TLS endpoint
▪ Conclusions and lessons learned

Slide 4

Slide 4 text

Quick overview of osquery

Slide 5

Slide 5 text

The CLI - osqueryi (single host)
https://osquery.readthedocs.io/en/stable/introduction/using-osqueryi/

Slide 6

Slide 6 text

The daemon - osqueryd (single host)
https://osquery.readthedocs.io/en/stable/introduction/using-osqueryd/
[Diagram labels: osqueryd; operating system, users, services; configuration; logging; centralized management (backend); intrusion detection use cases]

Slide 7

Slide 7 text

The daemon - osqueryd (multiple hosts)
https://osquery.readthedocs.io/en/stable/introduction/using-osqueryd/
...

Slide 8

Slide 8 text

osqueryd logging
https://osquery.readthedocs.io/en/stable/introduction/using-osqueryd/
➔ Local logging
  ❏ Forwarders
    ○ Logstash, Splunk...
➔ Remote logging
  ❏ Kinesis
  ❏ Kafka
  ❏ Splunk
  ❏ TLS endpoint

Slide 9

Slide 9 text

osqueryd configuration
https://osquery.readthedocs.io/en/stable/introduction/using-osqueryd/
➔ Local configuration
  ❏ IT/infra management
    ○ Chef, Puppet, Jamf, Ansible...
➔ Remote configuration
  ❏ TLS endpoint

Slide 10

Slide 10 text

The osquery remote API

Slide 11

Slide 11 text

osquery remote API
https://osquery.readthedocs.io/en/stable/deployment/remote/
[Diagram labels: ..., Status/Results Logs, Configuration, TLS endpoint]

Slide 12

Slide 12 text

osquery remote API
https://osquery.readthedocs.io/en/stable/deployment/remote/
▪ Enroll
  POST /path/to/enroll
▪ Configuration
  POST /path/to/config
▪ Logs
  POST /path/to/log
▪ Extras
  (On-demand queries)
  (File carving)
  ...

Slide 13

Slide 13 text

osquery remote API: Enroll
https://osquery.readthedocs.io/en/stable/deployment/remote/
POST /path/to/enroll
{
  "enroll_secret": "...", // Optional.
  "host_identifier": "...", // --host_identifier flag
  "host_details": { // Helpful osquery tables.
    "os_version": {},
    "osquery_info": {},
    "system_info": {},
    "platform_info": {}
  }
}

Slide 14

Slide 14 text

osquery remote API: Enroll
https://osquery.readthedocs.io/en/stable/deployment/remote/
HTTP RESPONSE
{
  "node_key": "...", // Optionally blank
  "node_invalid": false // Optional, true to indicate failure.
}
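
The enroll exchange above could be handled along these lines. This is a minimal sketch, not how any particular TLS endpoint does it: the function name `handle_enroll`, the shared secret `ENROLL_SECRET`, and the in-memory `NODES` store are all hypothetical, and a real endpoint would sit behind an HTTPS server and persist nodes in a database.

```python
import secrets

# Hypothetical state for this sketch only; a real endpoint would use a database.
ENROLL_SECRET = "example-secret"  # assumption: one shared enroll secret
NODES = {}                        # node_key -> host_identifier


def handle_enroll(body: dict) -> dict:
    """Take the parsed JSON body of POST /path/to/enroll, return the response body."""
    if body.get("enroll_secret") != ENROLL_SECRET:
        # Failure is reported in the response body via node_invalid.
        return {"node_key": "", "node_invalid": True}
    # Issue a random node key that the node sends back on later requests.
    node_key = secrets.token_hex(16)
    NODES[node_key] = body.get("host_identifier", "")
    return {"node_key": node_key, "node_invalid": False}


if __name__ == "__main__":
    print(handle_enroll({"enroll_secret": "example-secret", "host_identifier": "host-1"}))
```

The node key is the only credential the node presents afterwards, so the sketch generates it from a cryptographically secure source rather than deriving it from the host identifier.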

Slide 15

Slide 15 text

osquery remote API: Configuration
https://osquery.readthedocs.io/en/stable/deployment/remote/
POST /path/to/config
{
  "node_key": "..." // Optionally blank
}

Slide 16

Slide 16 text

osquery remote API: Configuration
https://osquery.readthedocs.io/en/stable/deployment/remote/
HTTP RESPONSE
{
  "schedule": {
    "query_name": {
      "query": "...",
      "interval": 10
    }
  },
  "node_invalid": false // Optional, true for re-enrollment.
}
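
A configuration handler can follow the same request/response pattern. The sketch below is illustrative only: `handle_config`, `KNOWN_NODE_KEYS`, and the contents of `EXAMPLE_CONFIG` (including the scheduled query) are assumptions made for the example, not taken from the slides.

```python
import copy

# Hypothetical state for this sketch only.
KNOWN_NODE_KEYS = {"example-node-key"}

# Illustrative configuration; the query text is an assumption.
EXAMPLE_CONFIG = {
    "schedule": {
        "query_name": {"query": "SELECT * FROM osquery_info;", "interval": 10}
    }
}


def handle_config(body: dict) -> dict:
    """Take the parsed JSON body of POST /path/to/config, return the response body."""
    if body.get("node_key") not in KNOWN_NODE_KEYS:
        # Unknown node: ask it to enroll again.
        return {"node_invalid": True}
    # Copy so callers cannot mutate the shared configuration.
    response = copy.deepcopy(EXAMPLE_CONFIG)
    response["node_invalid"] = False
    return response


if __name__ == "__main__":
    print(handle_config({"node_key": "example-node-key"}))
```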

Slide 17

Slide 17 text

osquery remote API: Logs
https://osquery.readthedocs.io/en/stable/deployment/remote/
POST /path/to/log
{
  "node_key": "...", // Optionally blank
  "log_type": "result", // Either "result" or "status"
  "data": [
    {...} // Each result event, or status event
  ]
}

Slide 18

Slide 18 text

osquery remote API: Logs
https://osquery.readthedocs.io/en/stable/deployment/remote/
HTTP RESPONSE
{
  "node_invalid": false // Optional, true for re-enrollment.
}
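
As a rough sketch of the log exchange above, a handler only needs to validate the node, check `log_type`, and pass each entry in `data` to some backend. The names `handle_log`, `known_node_keys`, and `sink` are hypothetical; the list used as `sink` stands in for whatever logging backend is in use.

```python
def handle_log(body: dict, known_node_keys: set, sink: list) -> dict:
    """Take the parsed JSON body of POST /path/to/log, return the response body.

    Accepted entries are appended to `sink` as (log_type, entry) tuples.
    """
    if body.get("node_key") not in known_node_keys:
        # Unknown node: ask it to enroll again.
        return {"node_invalid": True}
    log_type = body.get("log_type")
    if log_type in ("result", "status"):
        for entry in body.get("data", []):
            sink.append((log_type, entry))
    # Unrecognized log types are silently ignored in this sketch.
    return {"node_invalid": False}


if __name__ == "__main__":
    collected = []
    request = {"node_key": "k1", "log_type": "status", "data": [{"message": "hello"}]}
    print(handle_log(request, {"k1"}, collected), collected)
```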

Slide 19

Slide 19 text

osquery remote API: On-demand queries (read)
https://osquery.readthedocs.io/en/stable/deployment/remote/
POST /path/to/query-read
{
  "node_key": "..." // Optionally blank
}

Slide 20

Slide 20 text

osquery remote API: On-demand queries (read)
https://osquery.readthedocs.io/en/stable/deployment/remote/
HTTP RESPONSE
{
  "queries": {
    "id1": "SELECT * FROM osquery_info;",
    "id2": "SELECT * FROM osquery_schedule;",
    "id3": "SELECT * FROM does_not_exist;"
  },
  "node_invalid": false // Optional, true for re-enrollment.
}

Slide 21

Slide 21 text

osquery remote API: On-demand queries (write)
https://osquery.readthedocs.io/en/stable/deployment/remote/
POST /path/to/query-write
{
  "node_key": "...",
  "queries": {
    "id1": [
      {"column1": "value1", "column2": "value2"}
    ]
  },
  "statuses": {
    "id1": 0
  }
}

Slide 22

Slide 22 text

osquery remote API: On-demand queries (write)
https://osquery.readthedocs.io/en/stable/deployment/remote/
HTTP RESPONSE
{
  "node_invalid": false // Optional, true for re-enrollment.
}
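
The read and write halves of on-demand queries fit together as a small queue: the node polls for pending queries, runs them, and posts rows plus a status per query id. The sketch below assumes in-memory dictionaries (`PENDING`, `RESULTS`) and hypothetical handler names, and it skips node key validation for brevity.

```python
# Hypothetical in-memory state for this sketch only.
PENDING = {}   # node_key -> {query_id: sql}
RESULTS = {}   # (node_key, query_id) -> (status, rows)


def handle_query_read(body: dict) -> dict:
    """POST /path/to/query-read: hand out pending queries once, then clear them."""
    queries = PENDING.pop(body.get("node_key", ""), {})
    return {"queries": queries, "node_invalid": False}


def handle_query_write(body: dict) -> dict:
    """POST /path/to/query-write: store the rows and status for each query id."""
    node_key = body.get("node_key", "")
    statuses = body.get("statuses", {})
    for query_id, rows in body.get("queries", {}).items():
        RESULTS[(node_key, query_id)] = (statuses.get(query_id), rows)
    return {"node_invalid": False}


if __name__ == "__main__":
    PENDING["k1"] = {"id1": "SELECT * FROM osquery_info;"}
    print(handle_query_read({"node_key": "k1"}))
    print(handle_query_write({
        "node_key": "k1",
        "queries": {"id1": [{"column1": "value1"}]},
        "statuses": {"id1": 0},
    }))
```

Popping the pending entry on read means each query is delivered to a node at most once; whether that is the right delivery guarantee depends on the endpoint.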

Slide 23

Slide 23 text

osquery remote API: Flags
https://osquery.readthedocs.io/en/stable/deployment/remote/
▪ Enroll
  --enroll_tls_endpoint
▪ Configuration
  --config_tls_endpoint
  --config_tls_refresh
▪ Log
  --logger_tls_endpoint
  --logger_tls_period
▪ Queries
  --distributed_tls_[read-write]_endpoint
  --distributed_interval

Slide 24

Slide 24 text

Building and scaling a TLS endpoint

Slide 25

Slide 25 text

Building a TLS endpoint
https://osquery.readthedocs.io/en/stable/deployment/remote/
▪ Handler for Enroll ✅
▪ Handler for Configuration ✅
▪ Handlers for Logs ✅
▪ Handlers for extras ✅
  (On-demand queries)
  (File carving)
  ...

Slide 26

Slide 26 text

Scaling a TLS endpoint
TLS endpoint
- Configuration
- Logs
- On-demand queries
- (Enroll)

Slide 27

Slide 27 text

Scaling a TLS endpoint
1 x
  --config_tls_refresh=60
  --logger_tls_period=60
  --distributed_interval=60
= 3 requests per minute

Slide 28

Slide 28 text

Scaling a TLS endpoint
N x
  --config_tls_refresh=300
  --logger_tls_period=600
  --distributed_interval=100
= ?

Slide 29

Slide 29 text

Scaling a TLS endpoint
[Timeline of intervals over one 600-second window: LOG every 600 s, CONFIG every 300 s, QUERY every 100 s]

Slide 30

Slide 30 text

Scaling a TLS endpoint
sum(highest_interval / each_interval)
  600 / 600 = 1
  600 / 300 = 2
  600 / 100 = 6
9 requests per 600 seconds, per node
9 / 600 = 0.015 requests per second, per node
For 1000 nodes, ~15 requests per second
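
The arithmetic on this slide can be written as a small helper. The function name `requests_per_second` is made up for this sketch, and the result is the average rate only; it says nothing about how requests bunch together in time.

```python
import math


def requests_per_second(intervals, nodes=1):
    """Average request rate for `nodes` hosts, given each request interval in seconds.

    Mirrors the slide: sum(highest_interval / each_interval) requests
    happen during one window of highest_interval seconds.
    """
    highest = max(intervals)
    per_window = sum(highest / interval for interval in intervals)
    return nodes * per_window / highest


if __name__ == "__main__":
    # config every 300 s, logs every 600 s, on-demand queries every 100 s
    intervals = [300, 600, 100]
    print(requests_per_second(intervals))
    print(requests_per_second(intervals, nodes=1000))
    assert math.isclose(requests_per_second(intervals, nodes=1000), 15.0)
```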

Slide 31

Slide 31 text

Scaling a TLS endpoint: Caveats
▪ Don’t forget enroll! (N requests at T0)
▪ Query writes?
▪ File carving?
▪ Accelerated mode?
▪ All those intervals are NOT splayed, so requests arrive in peaks
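
To see why the lack of splay matters, the sketch below counts requests per second under a deliberately pessimistic assumption: every node starts at t=0 and each timer fires exactly on multiples of its interval. Real nodes drift and start at different times, so this is an upper-bound illustration, and the function name `requests_per_tick` is hypothetical.

```python
from collections import Counter


def requests_per_tick(intervals, nodes, horizon):
    """Count requests landing on each second up to `horizon`.

    Assumes all nodes start together at t=0 and no splay is applied,
    so every node fires each timer at the same instant.
    """
    counts = Counter()
    for interval in intervals:
        for t in range(interval, horizon + 1, interval):
            counts[t] += nodes
    return counts


if __name__ == "__main__":
    counts = requests_per_tick([600, 300, 100], nodes=1000, horizon=600)
    peak_time, peak = max(counts.items(), key=lambda item: item[1])
    print(f"total={sum(counts.values())} peak={peak} at t={peak_time}s")
```

With the intervals from the earlier slides and 1000 nodes, the 9000 requests in a 600-second window average out to 15 per second, but under these assumptions 3000 of them land in the single second at t=600, when all three timers coincide.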

Slide 32

Slide 32 text

Example TLS endpoint: osctrl

Slide 33

Slide 33 text

Fast and efficient osquery management
https://osctrl.net
github.com/jmpsec/osctrl

Slide 34

Slide 34 text

https://osctrl.net
[Architecture diagram labels: osquery; osctrl-tls; osctrl-admin / osctrl-api; osctrl-cli; backend; metrics; operator; vpn; Status / Results / Logs; Configuration]

Slide 35

Slide 35 text

https://osctrl.net
➔ Monitor osquery agents ✅
➔ Collect and process status/result logs ✅
➔ Distribute osquery configuration fast ✅
➔ Run on-demand queries ✅
➔ Extract files/directories using file carves ✅

Slide 36

Slide 36 text

https://osctrl.net

Slide 37

Slide 37 text

Conclusions

Slide 38

Slide 38 text

Conclusions / lessons learned
▪ Buy solution vs. build solution
▪ Always plan for the worst-case scenario
▪ If you are using cloud, make the most of it
▪ Once you have logs, don’t forget them!
▪ Read code for hidden/undocumented features

Slide 39

Slide 39 text

Any questions?

Slide 40

Slide 40 text

@javutin Thank you!