Online Meetup #3 - Solo.io, Tidepool, Weaveworks, Buoyant

Copyright © 2019
Online Meetup Oct 31, 2019

Welcome
Betty Junod, Derrick Burns, Stefan Prodan, Rick Ducott, Thomas Rampelberg
Introduction
Tidepool Case Study
Q&A and Panel Discussion

Backend Modernization with Open Source at Tidepool.org
Derrick Burns, Cloud Architect

Tidepool is a nonprofit organization dedicated to making diabetes data
more accessible, actionable, and meaningful for people with diabetes, their care teams, and researchers.

Solid Infrastructure is ___, if it ___
Secure: Protects user data from unauthorized access
Scalable: Supports any increase in demand without slowing down
Highly Available: Works when you use it
Auditable: Tells you who did what and when
Observable: Provides insight into what is happening now
Agile: Lets you try out changes to your applications easily and quickly
Responsive: Responds to your requests without noticeable delay

Our Legacy Infrastructure Works At ~20K MAU
Secure: Inter-service communication is encrypted with a single, shared, multi-year-old secret.
Scalable: Compute scaling is manual and infrequent.
Highly Available: Single points of failure.
Auditable: Log messages are inconsistent and don’t enable auditing.
Observable: Service metrics are largely unavailable.
Agile: Cycle time for changes is days/weeks, not seconds/minutes.
Responsive: Response times often >500 ms (goal: <200 ms).

So What’s Changed? Our Anticipated Needs
• Present
  ◦ Our current demand is small: <20K monthly active users (MAU).
  ◦ We are largely reactive: our customers report problems to us. Support bears the brunt.
• Future
  ◦ Tidepool Loop will bring increased demand on our backend.
  ◦ Moreover, to meet our mission, we need to increase MAU by orders of magnitude!
  ◦ If we are going to scale, our infrastructure must be ready.
  ◦ We need to be proactive.
    ▪ Our customers cannot be relied upon to report failures to us.
    ▪ We have the opportunity and the obligation to show users what a reliable service looks like.
• The bar is moving higher!

Our Legacy Infrastructure At >200K MAU
Secure: Inter-service communication is encrypted with a single, shared, multi-year-old secret.
Scalable: Compute scaling is manual and infrequent.
Highly Available: Single points of failure.
Auditable: Log messages are inconsistent and don’t enable auditing.
Observable: Service metrics are largely unavailable.
Agile: Cycle time for changes is days/weeks, not seconds/minutes.
Responsive: Response times often >500 ms (goal: <200 ms).

We Can’t Do It Alone
We Need to Leverage the Latest Inventions

CNCF* Tools Help Make Infrastructure ...
Secure: Linkerd* encrypts data in flight for HIPAA compliance.
Scalable: K8s* allocates more CPU/Network bandwidth on demand.
Highly Available: K8s* deploys replicas of pods for redundancy.
Auditable: Jaeger* traces execution paths.
Observable: Prometheus* collects metrics. Grafana* visualizes them.
Agile: Flux* deploys new versions of software as it becomes available. Flagger deploys new software progressively. Gloo/Envoy* routes traffic and supports retries and timeouts.
Responsive: K8s* allocates service replicas on demand.
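To make "retries and timeouts" concrete, here is a minimal Python sketch of what a routing layer does on the caller's behalf: give each attempt a time budget and try again a bounded number of times. This is not Gloo or Envoy configuration or API; `UpstreamTimeout`, `call_with_retries`, `make_flaky_upstream`, and all parameter values are hypothetical names chosen for illustration.

```python
import time


class UpstreamTimeout(Exception):
    """Raised by an upstream call that did not answer within its budget."""


def call_with_retries(request_fn, max_attempts=3, per_try_timeout_s=0.2,
                      backoff_s=0.0):
    """Call request_fn(timeout_s) until it succeeds or attempts run out.

    request_fn is assumed to raise UpstreamTimeout when it exceeds the
    per-try timeout. Returns (response, attempts_used).
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return request_fn(per_try_timeout_s), attempt
        except UpstreamTimeout as err:
            last_error = err
            if attempt < max_attempts:
                # Linear backoff between attempts (0 by default).
                time.sleep(backoff_s * attempt)
    raise last_error


def make_flaky_upstream(failures_before_success):
    """Build a fake upstream that times out a fixed number of times."""
    state = {"calls": 0}

    def upstream(timeout_s):
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise UpstreamTimeout(f"no response within {timeout_s}s")
        return "200 OK"

    return upstream


if __name__ == "__main__":
    response, attempts = call_with_retries(make_flaky_upstream(2))
    print(response, "after", attempts, "attempts")
```

Placing this policy in the proxy rather than in each service means every service gets the same behavior without code changes, which is the appeal of handling it at the routing layer.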

Status: New Tools Deployed!
Secure: Linkerd service mesh is deployed.
Scalable: K8s cluster-autoscaling is deployed.
Highly Available: K8s multiple service replicas are deployed.
Auditable: Jaeger is coming soon. Logs are aggregated to SumoLogic.
Observable: Prometheus is coming soon. Grafana is coming soon. Fluxcloud is deployed.
Agile: Flux and Gloo/Envoy are deployed. Tilt is deployed. Flagger is coming soon. Spotlight documentation with auto-generated bindings is coming soon.
Responsive: K8s horizontal pod autoscaling is deployed.
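Horizontal pod autoscaling picks a replica count from the ratio of an observed metric to its target. The Python sketch below follows the formula given in the Kubernetes documentation, desired = ceil(current replicas × current metric / target metric), with a simplified tolerance band and min/max clamping. The function name, the default bounds, and the sample numbers are assumptions for illustration, not Tidepool's settings.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Return the replica count an HPA-style controller would ask for.

    Simplified: ignores pod readiness, missing metrics, and
    stabilization windows that the real controller also considers.
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count alone.
        wanted = current_replicas
    else:
        wanted = math.ceil(current_replicas * ratio)
    # Never go below the floor or above the ceiling.
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    # 2 replicas at 90% average CPU with a 50% target.
    print(desired_replicas(2, 90, 50))
    # 4 replicas at 20% average CPU with a 50% target.
    print(desired_replicas(4, 20, 50))
```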

Status: Benefits Realized To Date
Secure: TLS certs are auto-renewed. Secrets are encrypted.
Scalable: CPU and networking are auto-scaled on demand.
Highly Available: Each service can be replicated simply by setting a number.
Auditable: Logs are aggregated and persisted. Access logs are collected.
Observable: Deployment notifications are auto-published to Slack.
Agile: Cycle time to deploy to AWS: 10-30 min. => 1-3 min. Cycle time to deploy locally: 1-3 min. => 2-5 seconds.
Responsive: New replicas are auto-deployed under load.
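The cycle-time figures above are ranges, so the implied speedup is a range too. As a rough worked calculation in Python (the helper name `speedup_range` is hypothetical), the most conservative bound pairs the fastest old time with the slowest new time, and the most optimistic bound pairs the slowest old time with the fastest new time.

```python
def speedup_range(before_s, after_s):
    """Return (min_speedup, max_speedup) for two (low, high) ranges in seconds."""
    before_lo, before_hi = before_s
    after_lo, after_hi = after_s
    # Worst case: it used to be fast and is now at its slowest.
    # Best case: it used to be slow and is now at its fastest.
    return before_lo / after_hi, before_hi / after_lo


if __name__ == "__main__":
    # Deploy to AWS: 10-30 min before, 1-3 min after.
    aws = speedup_range((10 * 60, 30 * 60), (1 * 60, 3 * 60))
    # Deploy locally: 1-3 min before, 2-5 seconds after.
    local = speedup_range((1 * 60, 3 * 60), (2, 5))
    print(f"AWS:   {aws[0]:.1f}x to {aws[1]:.1f}x faster")
    print(f"Local: {local[0]:.1f}x to {local[1]:.1f}x faster")
```

Under these bounds the AWS deploy is somewhere between about 3x and 30x faster, and the local deploy between 12x and 90x faster.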

Status: Unresolved Legacy Issues
Secure: (none listed)
Scalable:
• Services not designed for concurrency (e.g. message-api)
• Sloooooow database queries
Highly Available: (none listed)
Auditable:
• Inconsistent logging
• Excessive logging
Observable:
• Per-service metrics are unavailable
• Execution traces are unavailable
Agile:
• Need automated testing for progressive deployment of backend.
Responsive:
• Sloooooow database queries
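The need for automated testing in progressive deployment can be illustrated with a small Python sketch: traffic shifts to the new version in steps, an automated check runs at each step, and the rollout is abandoned on the first failure. This is only the general idea, not Flagger's implementation or configuration; `progressive_rollout`, `check_health`, and the step sizes are hypothetical.

```python
def progressive_rollout(check_health, step_percent=10, max_percent=50):
    """Shift traffic to a new version in steps, checking health each time.

    check_health(weight) is an assumed callback that returns True when
    the new version looks healthy while receiving `weight` percent of
    traffic. Returns (outcome, weights_tried).
    """
    weights_tried = []
    weight = 0
    while weight < max_percent:
        weight = min(weight + step_percent, max_percent)
        weights_tried.append(weight)
        if not check_health(weight):
            # First failed check: stop and send all traffic back.
            return "rolled_back", weights_tried
    # Every step passed: the new version can take over.
    return "promoted", weights_tried


if __name__ == "__main__":
    # A new version that stays healthy at every traffic level.
    print(progressive_rollout(lambda weight: True))
    # A new version that starts failing once it gets 30% of traffic.
    print(progressive_rollout(lambda weight: weight < 30))
```

Without an automated `check_health`, each step would need a person to decide whether to continue, which is why the tests are a prerequisite for this style of deployment.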

Questions, Answers and Discussion

SOLO.IO: solo.io/gloo, link.medium.com/qDUTbgu810
TIDEPOOL: tidepool.org