Slide 1

Slide 1 text

Heartbeats & Healthchecks (at the Edge of Human and Compute)

Slide 2

Slide 2 text

Kerim Satirli (@ksatirli)
Sr. Developer Advocate for Infrastructure and Orchestration at HashiCorp

Slide 3

Slide 3 text

edge com·put·ing (noun)

computing that takes place at or near the physical location of the producer or consumer of data.

Similar: point of presence, mobile datacenter

Slide 4

Slide 4 text

Badge screen text: ConFoo Attendee, Kerim Satirli
Labels: plastic frame, low-powered screen, conference branding

Slide 5

Slide 5 text

Badge screen text: ConFoo Attendee, Kerim Satirli
Labels: front, back

Slide 6

Slide 6 text

01 Noisy Neighbors
Don't expect connectivity at the edge.

Slide 7

Slide 7 text

! reception okay'ish, frequent disconnects

Slide 8

Slide 8 text

Noise on the Net
Spectrum Scan (2.4 GHz)
[Plot: Wi-Fi channels 1 to 14 over time, 08:24 to 10:00]

Slide 9

Slide 9 text

Challenge: no connectivity

Slide 10

Slide 10 text

Suggested Solutions

unique IDs: store data with device-specific IDs to prevent reconciling conflicts
back-off: retry and back-off with sensible intervals, don't just increase the delay
push config: implement code that allows a control server to push config changes
local-first: implement code that runs in offline / delayed connectivity situations
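The unique-ID, back-off, and local-first suggestions can be sketched together in Python. This is a minimal illustration under stated assumptions, not the talk's demo code: `DEVICE_ID`, `Outbox`, `backoff_delays`, and the `send` callable are hypothetical names, and the base, cap, and attempt values are arbitrary. Capped exponential back-off with random jitter is one possible reading of "sensible intervals"; it keeps many devices from retrying in lockstep after an outage.

```python
import random
import uuid
from collections import deque

DEVICE_ID = "badge-0001"  # hypothetical device identifier


def make_record_id(device_id: str) -> str:
    """Prefix a random UUID with the device ID so records created on
    different devices cannot collide when merged on a server."""
    return f"{device_id}:{uuid.uuid4()}"


def backoff_delays(base=1.0, cap=300.0, attempts=8, rng=random):
    """Yield capped exponential back-off delays with full jitter.

    Delay n is drawn uniformly from [0, min(cap, base * 2**n)].
    """
    for n in range(attempts):
        yield rng.uniform(0.0, min(cap, base * 2 ** n))


class Outbox:
    """Local-first queue: writes succeed locally, delivery happens later."""

    def __init__(self, device_id: str):
        self.device_id = device_id
        self.pending = deque()

    def add(self, payload: dict) -> str:
        record_id = make_record_id(self.device_id)
        self.pending.append({"id": record_id, "payload": payload})
        return record_id

    def flush(self, send, sleep) -> int:
        """Deliver pending records in order; return how many were sent.

        `send(record)` is assumed to raise ConnectionError while offline.
        `sleep(seconds)` is injected so callers control how waiting happens.
        """
        sent = 0
        while self.pending:
            record = self.pending[0]
            for delay in backoff_delays():
                try:
                    send(record)
                    break
                except ConnectionError:
                    sleep(delay)
            else:
                # Still offline after all attempts: keep the data for later.
                return sent
            self.pending.popleft()
            sent += 1
        return sent
```

On a device, `flush` would typically be called with `time.sleep` and a function that posts the record to the control server. Records that cannot be delivered stay in `pending`, so nothing is lost while the network is down.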

Slide 11

Slide 11 text

Offline "Easy-Mode" (baedge.nomad.hcl)

job "baedge" {
  group "server" {
    disconnect {
      ttl        = "4h"
      stop_after = "2h"
      replace    = true
      reconcile  = "keep_original"
    }
  }
}

Slide 12

Slide 12 text

02 Sims aren't real
Don't expect mocks to tell the full story.

Slide 13

Slide 13 text

digital twin / real device

Slide 14

Slide 14 text

Fake isn't Real.

Slide 15

Slide 15 text

Suggested Solutions

don't train: get feedback from people not involved in the building process
throw 'em: subject devices to physical stress to understand operational impact
test broadly: buy devices from multiple vendors to account for revisions
field early: deploy devices in real environments often, avoid mocked stages

Slide 16

Slide 16 text

03 Build for Breaches
Expect leaks and compromises.

Slide 17

Slide 17 text

Control goes down, and Risk goes up.

Slide 18

Slide 18 text

Device Security
bbc.com/news/technology-48743043

Slide 19

Slide 19 text

Challenge: remain in control

Slide 20

Slide 20 text

Suggested Solutions

limit: limit access credentials on devices to absolute bare minimum
rotate: automatically rotate secrets, and expire rotated secrets
audit: audit access logs early and often, use data to make informed choices
seal: physically seal and disconnect ports you don't actively use
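The "rotate" suggestion has two halves: issuing a new secret and making sure the old one stops working. A minimal Python sketch of that idea follows. `RotatingSecret` and its grace period are hypothetical and not taken from the talk; in practice a secrets manager would normally handle this rather than hand-written code.

```python
import hmac
import secrets
import time


class RotatingSecret:
    """Holds one current secret and lets the previous one live for a
    short grace period after rotation, then rejects it."""

    def __init__(self, grace_seconds=60.0, clock=time.monotonic):
        self._clock = clock
        self._grace = grace_seconds
        self._current = secrets.token_urlsafe(32)
        self._previous = None
        self._previous_expires_at = 0.0

    @property
    def current(self) -> str:
        return self._current

    def rotate(self) -> str:
        """Issue a new secret; the old one is valid only until the grace
        period ends."""
        self._previous = self._current
        self._previous_expires_at = self._clock() + self._grace
        self._current = secrets.token_urlsafe(32)
        return self._current

    def is_valid(self, candidate: str) -> bool:
        """Constant-time check against the current secret and, during the
        grace period only, the previous one."""
        data = candidate.encode()
        if hmac.compare_digest(data, self._current.encode()):
            return True
        if self._previous is not None and self._clock() < self._previous_expires_at:
            return hmac.compare_digest(data, self._previous.encode())
        return False
```

The grace period lets devices with flaky connectivity pick up the new secret before the old one is rejected. Setting it to zero expires rotated secrets immediately.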

Slide 21

Slide 21 text

Rotating Credentials (baedge.nomad.hcl)

job "baedge" {
  group "server" {
    identity {
      name = "baedge"
      aud  = [
        "oidc.baedge.local",
      ]
      file = true
      ttl  = "30m"
    }
  }
}
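With `file = true` and a short `ttl`, the workload is expected to pick its identity token up from a file that gets rewritten as the token is renewed. The Python sketch below shows how a task might consume such a token. It is an assumption-laden illustration, not the talk's code: the token path is passed in by the caller (check the Nomad documentation for where your version writes it), the helper names are hypothetical, and the JWT payload is decoded without signature verification, which is only suitable for reading the expiry and not for trusting the token.

```python
import base64
import json
import time
from pathlib import Path
from typing import Optional


def read_token(path) -> str:
    """Read the token from disk on every call instead of caching it,
    because the file may have been rewritten since the last read."""
    return Path(path).read_text().strip()


def seconds_until_expiry(token: str, now: Optional[float] = None) -> float:
    """Return exp - now from the JWT payload.

    The payload is decoded WITHOUT verifying the signature.
    """
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped padding
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    current = time.time() if now is None else now
    return claims["exp"] - current


def fresh_token(path, min_remaining: float = 60.0, now: Optional[float] = None) -> str:
    """Return the token from disk, or raise if it is about to expire."""
    token = read_token(path)
    if seconds_until_expiry(token, now) < min_remaining:
        raise RuntimeError("identity token is expired or about to expire")
    return token
```

Calling `fresh_token` right before each outbound request keeps the task from holding on to a token that has already been rotated out.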

Slide 22

Slide 22 text

This talk wasn't about conference badges.

Slide 23

Slide 23 text

This talk was about fault tolerance.

Slide 24

Slide 24 text

"Nomad" screen "Baedge" screen ConFoo Attendee Allocation: 1a2b3c4d Address: 192.168.0.23 Version: 1.7.5 Nomad Runtime ConFoo Attendee Model: 2in7b Revision: v2 {Ba,E}dge Hardware

Slide 25

Slide 25 text

Demo Code

Slide 26

Slide 26 text

Thank you
speakerdeck.com/ksatirli