Constructing Open Source SDKs for Ops Teams with REST and GraphQL

Chris Wahl
October 30, 2019

Operations teams are the front-line teams that install, configure, manage, and maintain a complex set of infrastructure and services. Getting a solid, well-documented SDK into their hands can turn manual, painful processes into simple automation tasks. However, as many APIs shift from REST to GraphQL, new approaches to tooling are often required. In this session, you’ll hear how a small team of engineers builds and maintains a collection of open source SDKs focused on operations teams, and how they added support for GraphQL endpoints.


Transcript

  1. 2.

    Chris Wahl
    ❖ Chief Technologist @ Rubrik
    ❖ Author of Networking for VMware Administrators
    ❖ Open Source Enabler at Rubrik Build
    ❖ he/him
    Twitter: @ChrisWahl
    GitHub: chriswahl
    LinkedIn: /wahlchris
    Blog: Wahl Network
  2. 4.

    This is a story about toil
    And a lot of learning through triumph and mistakes
    @ChrisWahl | #DevWeek2019
  3. 5.

    “The kind of work tied to running a production service that tends to be manual, repetitive, automatable, tactical, devoid of enduring value, and that scales linearly as a service grows” - Toil
  4. 7.

    Life of an operator
    • At the end of the release cycle
    • “Here’s a thing, make it work, keep it working”
    • Myriad of systems to understand and maintain while being short staffed
  5. 8.

    “I need a one-liner or script to accomplish this task so I can copy and paste it into my environment, solve my problem, and get back to putting out a hundred other fires” - Systems Administrators
  6. 9.

    Abuse from Crude Tools
    Tools like AutoIt
    • Script GUI actions using a DSL
    • The ultimate “sad panda”
  7. 11.

    Initial Research
    • Our audience preferred Microsoft PowerShell
    • Auto generation of SDK was ugly
    • Our Swagger specification was non-standard
    • Decided to craft a bespoke SDK
  8. 12.

    The Mission
    • Give operators a familiar tool to manage our product and remove toil
    • Use my background as an operator to control the UX
    • Selfishly: Learn how to build an SDK
  9. 13.

    Project Plan
    • Everything in GitHub as an open source project
    • MIT licensing (Legal)
    • One project per repository
    • Official product support for projects
    • Unit tests for new features
    • External CI: AppVeyor, Azure Pipelines
    • Internal CI: CircleCI
    • Integration of Jira and GitHub via Zapier
  10. 15.

    Our API’s Original Purpose
    • Distributed systems to chat with each other
    • Supply the GUI with an interface
  11. 16.

    This created friction
    • There were no API versions
    • Breaking changes were normal
    • Standards for model, params, enums, etc. did not exist
    • The product surface area was rapidly expanding
  12. 19.

    We Made Versions!
    • Internal
      • meant for testing and developing new features and for providing command and control endpoints for the software itself.
    • Versioned (Vn)
      • meant for public consumption with a declaration on versioning, deprecation, and when breaking changes would be introduced.
  13. 20.

    “API versioning does not prevent breaking changes. It just helps control when, where, and how the break occurs. Someone must still update their code.” - Me
  14. 21.

    More Cleanup
    • Placed major integrations at the parent (root) level
    • Leveraged HTTP methods to simplify workflows
      Ugly: POST to “/add_node” and “/remove_node/{id}”
      Pretty: POST to “/node” and DELETE to “/node/{id}”
    • Used Boolean field naming conventions
      Start with ‘has’, ‘is’ or ‘should’ to make it clear that it is a Boolean field
      Examples: ‘hasRootAccess’, ‘isAdmin’ and ‘shouldDoSomething’
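
The two cleanup conventions on this slide can be sketched in a few lines of Python. This is only an illustration, not the product's routing code: the `ROUTES` table, the action names, and both helper functions are hypothetical, and only the paths and field names come from the slide.

```python
import re

# Hypothetical route table in the "pretty" style: the HTTP method carries
# the verb, so the path only has to name the resource.
ROUTES = {
    ("POST", "/node"): "add_node",
    ("DELETE", "/node/{id}"): "remove_node",
}

# Prefixes named on the slide for Boolean fields.
BOOLEAN_PREFIXES = ("has", "is", "should")


def resolve(method: str, path: str) -> str:
    """Return the action for a method/path pair, treating {id} as one path segment."""
    for (route_method, template), action in ROUTES.items():
        pattern = "^" + re.sub(r"\{[^/]+\}", "[^/]+", template) + "$"
        if method.upper() == route_method and re.match(pattern, path):
            return action
    raise LookupError(f"no route for {method} {path}")


def follows_boolean_convention(field_name: str) -> bool:
    """True when the name starts with has/is/should followed by a capital letter."""
    return any(
        field_name.startswith(prefix)
        and field_name[len(prefix):len(prefix) + 1].isupper()
        for prefix in BOOLEAN_PREFIXES
    )
```

With this table, `resolve("DELETE", "/node/42")` maps to the same action the old “/remove_node/{id}” path expressed, and a name such as `island` is rejected because no capital letter follows the `is` prefix.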
  15. 22.

    “The sooner you start to code, the longer the program will take.” - Roy Carlson
  16. 23.

    Internal Became the Hypnotoad
    • No incentives for versioning
    • Over 95% of the API resided in Internal
  17. 25.

    Too Much Complexity
    • Each function within the SDK was a closed loop
    • The community found it too difficult to contribute
    • A new architecture was needed
  18. 26.

    SDK Design Goal
    API File
    • Gather information for each supported endpoint
    • Supply the SDK with methods, params, status codes, etc.
    • Version the data for backwards compatibility
    Generic Functions
    • Functions look at the API File to understand their purpose
    • Functions can alter their state based on the target product version
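
One way to picture the API File and generic function split is the Python sketch below. The talk's audience preferred PowerShell, so this is not the SDK's actual code: `API_FILE`, the function name `get_node`, the URIs, and the version numbers are all made up for illustration.

```python
# Hypothetical API File: for each SDK function, endpoint data keyed by the
# first product version that uses it. URIs and versions are illustrative.
API_FILE = {
    "get_node": {
        "4.0": {"method": "GET", "uri": "/api/internal/node", "success": 200},
        "5.0": {"method": "GET", "uri": "/api/v1/node", "success": 200},
    },
}


def parse_version(text):
    """Turn '5.1.2' into (5, 1, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def lookup(function_name, product_version):
    """Pick the newest API File entry that is not newer than the target version."""
    entries = API_FILE[function_name]
    target = parse_version(product_version)
    eligible = [v for v in entries if parse_version(v) <= target]
    if not eligible:
        raise LookupError(f"{function_name} is not supported on {product_version}")
    return entries[max(eligible, key=parse_version)]


def build_request(function_name, product_version, server):
    """Generic function: everything endpoint-specific comes from the API File."""
    entry = lookup(function_name, product_version)
    return {
        "method": entry["method"],
        "url": f"https://{server}{entry['uri']}",
        "expect": entry["success"],
    }
```

Under these assumptions, supporting a new product release means adding one entry to the data file, while the generic function stays untouched. That separation is what would make community contributions easier.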
  19. 28.
  20. 29.
  21. 30.

    Enablement and Communication
    Too focused on the technology
    Not enough focus on the hygiene
    Lots of questions from our customers
    General fear of GitHub and coding
    More was needed
  22. 31.
  23. 34.

    Communication Efforts
    The rules of versioning and deprecation.
    Future deprecation of endpoints / resources.
    New or updated endpoints / resources.
  24. 37.

    “You haven't mastered a tool until you understand when it should not be used.” - Kelsey Hightower
  25. 38.

    Initial Research in 2017
    • Dramatic speed improvements for the GUI
    • As more objects are added, REST continues to fall behind
    • Simple to query all objects and use cursor / pagination
    • More flexibility with our returned values
    Stress tested load times
    95th percentile load times with GraphQL: 3.256 seconds
    95th percentile load times with REST: 6.619 seconds
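
The "query all objects and use cursor / pagination" point can be sketched as a loop that follows cursors until the server reports no further pages. The schema here is an assumption: the `nodes` field and the Relay-style `edges` / `pageInfo` shape are a common GraphQL convention, not something the slides confirm. `fake_execute` is an in-memory stand-in so the example runs without a real endpoint.

```python
# Assumed Relay-style connection query; field names are hypothetical.
QUERY = """
query Nodes($first: Int!, $after: String) {
  nodes(first: $first, after: $after) {
    edges { node { id } }
    pageInfo { hasNextPage endCursor }
  }
}
"""


def fetch_all(execute, page_size=2):
    """Follow endCursor until hasNextPage is false.

    `execute(query, variables)` must return the response's data dict.
    """
    items, cursor = [], None
    while True:
        data = execute(QUERY, {"first": page_size, "after": cursor})
        connection = data["nodes"]
        items.extend(edge["node"] for edge in connection["edges"])
        if not connection["pageInfo"]["hasNextPage"]:
            return items
        cursor = connection["pageInfo"]["endCursor"]


def fake_execute(query, variables):
    """In-memory stand-in for a GraphQL endpoint, using list offsets as cursors."""
    ids = ["a", "b", "c", "d", "e"]
    start = int(variables["after"] or 0)
    end = start + variables["first"]
    return {
        "nodes": {
            "edges": [{"node": {"id": i}} for i in ids[start:end]],
            "pageInfo": {"hasNextPage": end < len(ids), "endCursor": str(end)},
        }
    }
```

Because the query names only the fields it needs (`id` in this sketch), the same loop also illustrates the slide's point about flexibility in returned values.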
  26. 39.

    Since Then
    • Added GraphQL to our on-premises product.
      • Reporting
      • Dashboards
      • Various other components
    • Constructed a SaaS platform with GraphQL as the standard API
      • Started from scratch
      • Using what we learned
      • Lots of tweaking
  27. 40.

    Challenges
    • Schema is in flux
    • There are no versions
    • Documentation holy wars
    • We’re all still learning GraphQL
    • Graph-Que-What?
  28. 41.

    Current State
    • Schema tools (Voyager, GraphiQL) for visualization
    • Internal construction of new SDKs
    • Existing auth methods (e.g. tokens) are valid globally
    Base platform will continue with REST and GraphQL
    SaaS platform will remain entirely GraphQL
    Using GitHub private repos for development
  29. 42.
  30. 43.

    SDK Development
    Let use cases drive stack-ranking
    Mimic a near-identical UX
    Educate and enable in parallel
    Invite early-adopters and give them checklists
  31. 45.

    If we could do it all over again
    • Increased collaboration with engineering and support
    • Create incentives to document and polish the API
    • Make documentation a top priority
    • Educate internal stakeholders on API usage
    • Bring operators into your SDK build process
      Use cases, UX, testing, feedback