Microservices lifecycle management

Video: https://www.youtube.com/watch?v=sNi2UOG2c2k

As more organizations transition from a monolith to a micro-services architecture, they are running into significant challenges around governance and lifecycle management of micro-services.
For example, how often have you (as a developer, in ops, or in leadership) asked one or more of the following questions?

1. What does it take to create and manage a new micro service? (Metadata Management, governance)
2. How do we identify a micro service canonically across infrastructure/platform services? (Identity)
3. How do we allocate resources for a micro service? (Resource provisioning)
4. What does it take to operate a micro service? (Deploy pipelines, orchestration, monitoring)
5. How do we measure resource utilization and cost of operating a micro service? (Metering and Chargeback)

These questions persist independent of an organization's container strategy or public/private cloud strategy.

In this talk, I will dive deeper into the above challenges and their impact, and explain the need for a governance system that manages the lifecycle of micro-services. The talk will focus on the following areas:

1. Metadata Management (project info, team ownership info, operational info such as dashboards, alerts)
2. Identity Management (canonical service identifiers, secrets provisioning, distribution and management)
3. Resource Management (provisioning of primitive resources such as CPU, MEM or abstract resources such as RPS)
4. Metering and Chargeback
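
As a rough illustration of the ownership and identity areas listed above, the following is a minimal sketch of an ownership directory in which each job carries one identifier per resource type. The class names, the `provision` helper, and the `find_owner` function are hypothetical and not taken from the talk; the sample names (`INFRASTRUCTURE`, `CORE-SERVICE`, `PinAndBoard`, `pin_writer_service`) are borrowed from the slide transcript.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple
import uuid


@dataclass(frozen=True)
class ScopedIdentifier:
    scope: str       # resource type, e.g. "compute" or "blob"
    identifier: str  # canonical identifier within that scope


@dataclass
class Job:
    name: str
    identifiers: List[ScopedIdentifier] = field(default_factory=list)

    def provision(self, scope: str) -> ScopedIdentifier:
        """Mint an opaque identifier for one resource type (hypothetical helper)."""
        ident = ScopedIdentifier(scope, str(uuid.uuid4()))
        self.identifiers.append(ident)
        return ident


@dataclass
class Project:
    name: str
    jobs: Dict[str, Job] = field(default_factory=dict)


@dataclass
class Team:
    name: str
    projects: Dict[str, Project] = field(default_factory=dict)


@dataclass
class BusinessOwner:
    name: str
    teams: Dict[str, Team] = field(default_factory=dict)


def find_owner(
    owners: List[BusinessOwner], scope: str, identifier: str
) -> Optional[Tuple[str, str, str, str]]:
    """Walk owner -> team -> project -> job and return the names that own an identifier."""
    wanted = ScopedIdentifier(scope, identifier)
    for owner in owners:
        for team in owner.teams.values():
            for project in team.projects.values():
                for job in project.jobs.values():
                    if wanted in job.identifiers:
                        return (owner.name, team.name, project.name, job.name)
    return None


# Build a one-job directory using names from the slides.
job = Job("pin_writer_service")
compute_id = job.provision("compute")
blob_id = job.provision("blob")
project = Project("PinAndBoard", {job.name: job})
team = Team("CORE-SERVICE", {project.name: project})
directory = [BusinessOwner("INFRASTRUCTURE", {team.name: team})]

print(find_owner(directory, "compute", compute_id.identifier))
```

A linear walk is enough for a sketch; a real directory would presumably index identifiers for direct lookup.
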

At the end of the talk, I'll share case studies from Twitter and Pinterest on how they implemented portions of these systems and the impact those systems had.

About Micheal Benedict
Micheal Benedict leads Product Management for Pinterest's Infrastructure Platform Teams. Previously he led products for Twitter Cloud Platform, building the next-generation compute infrastructure that spans internal and public cloud. He and his team built Kite, a service lifecycle manager, and an Infrastructure Metering & Chargeback system. Prior to that he was an engineer building systems that power Twitter's Observability and Monitoring stack. Micheal has a Master's degree in Computer Science from SUNY at Buffalo.

Transcript

  1. Agenda: History (Microservices at Twitter and Pinterest) | Lifecycle of a job | What is Governance? Challenges & Solution | Future
  2. MICROSERVICES: The obvious benefits. FENCING & OWNERSHIP: Clear isolation of services & their ownership. RELIABILITY: Failure isolation and graceful degradation. SCALABILITY & EFFICIENCY: Scale independently, ensuring efficient use of infrastructure. DEVELOPER PRODUCTIVITY: Make it simple for engineers to build and launch services quickly and easily.
  3. A job can be… ‣ long running service ‣ batch job ‣ map reduce ‣ model training ‣ experiment
  4. RELEASE | TEST & BUILD | PACKAGE | MONITOR | LOGS, METRICS & TRACE | GRAPH & ALERTS | ONCALL | DEPLOY (CANARY/PROD) | CREATE | DEPRECATE
  5. RELEASE | TEST & BUILD | PACKAGE | MONITOR | LOGS, METRICS & TRACE | GRAPH & ALERTS | ONCALL | DEPLOY (CANARY/PROD) | MANAGE | CREATE | DEPRECATE
  6. RELEASE | TEST & BUILD | PACKAGE | MONITOR | LOGS, METRICS & TRACE | GRAPH & ALERTS | ONCALL | DEPLOY (CANARY/PROD) | MANAGE | IDENTITY & CREDENTIAL | METADATA | RESOURCE & CAPACITY | CREATE | DEPRECATE | BUDGET & SPEND | OWNERSHIP
  7. Deploy package `pin_write_service` | vCPU: 8.0 | Memory: 12G | Instances: 10 | Service Discovery: pinwriter | COMPUTE: _cluster=pin_write_cluster, _namespace=pin_write
  8. BLOB STORAGE: _prefix=pin_media_pictures, _prefix=pin_media_videos | KEY/VAL STORAGE: _namespace=pin_write | COMPUTE: _cluster=pin_write_cluster, _namespace=pin_write | Deploy package `pin_write_service`, vCPU: 8.0, Memory: 12G, Instances: 10, Service Discovery: pinwriter | GB: 20GB, RPS: 100K, WPS: 10K | GB: 2TB, GETs: 500K, PUTs: 50K
  9. Logical grouping of identifiers tied to the business. The dictionary. JOB OWNERSHIP DIRECTORY | OWNERSHIP: BUSINESS OWNER 1:N TEAM 1:N PROJECT 1:N JOB NAME | IDENTITY: JOB NAME 1:N <SCOPE, IDENTIFIERS> (Depends on Identity & Credential Manager)
  10. OWNERSHIP: BUSINESS OWNER 1:N TEAM 1:N PROJECT 1:N JOB NAME | IDENTITY: JOB NAME 1:N <SCOPE, IDENTIFIERS> (Depends on Identity & Credential Manager) | Example: INFRASTRUCTURE 1:N CORE-SERVICE 1:N PinAndBoard 1:N pin_writer_service 1:N <compute, pin_write_cluster>, <blob, pin_media_pictures>, <blob, pin_media_videos>
  11. BLOB STORAGE: _prefix=pin_media_pictures, _prefix=pin_media_videos | KEY/VAL STORAGE: _namespace=pin_write | COMPUTE: _cluster=pin_write_cluster, _namespace=pin_write | Deploy package `pin_write_service`, vCPU: 8.0, Memory: 12G, Instances: 10, Service Discovery: pinwriter | GB: 20GB, RPS: 100K, WPS: 10K | GB: 2TB, GETs: 500K, PUTs: 50K
  12. pin_write_service | BLOB STORAGE: _prefix=<UUID>, _prefix=<UUID> | COMPUTE: _cluster=<UUID> | KEY/VAL STORAGE: _namespace=<UUID> | JOB NAME 1:N <SCOPE, IDENTIFIERS> | IDENTIFIER PER RESOURCE TYPE | CANONICAL JOB IDENTIFIER
  13. IDENTITY MANAGER: Canonical identifiers for a job. Identifying a job across platform/infrastructure services. foo_service | IDENTITY PROVISIONING SERVICE | COMPUTE: _cluster=<UUID> | KEY/VAL STORAGE: _namespace=<UUID> | BLOB STORAGE: _prefix=<UUID>
  14. CREDENTIAL MANAGER: A consistent (role-based) method to generate credentials for access control & auditability. foo_service | IDENTITY PROVISIONING SERVICE | CREDENTIAL PROVISIONING SERVICE: generate service account with privileges based on identifiers | IAM Keys and Secrets | COMPUTE, BLOCK STORAGE, RDBMS | _cluster=foo_cluster, _database=foodb, _prefix=fooStore, _prefix=barStore
  15. METADATA MANAGER: Source of truth for Job Metadata. Key/Val pairs tied to Jobs & Projects following a hierarchical order. KEY/VAL | KEY/VAL | OWNERSHIP: BUSINESS OWNER 1:N TEAM 1:N PROJECT 1:N JOB NAME | IDENTITY: JOB NAME 1:N <SCOPE, IDENTIFIERS> (Depends on Identity & Credential Manager)
  16. RESOURCE MANAGER: So, what resources can I use? Inventorying and provisioning of resources across platform/infrastructure services. Define resources to offer: - Online Compute - Storage - Batch Compute. Abstract resource provisioning by providing a workflow to provision resources: - Allows policies (ex: < 100 vCPU free to launch) - Tie to identity system
  17. RESOURCE MANAGER: So, what resources can I use? Inventorying and provisioning of resources across platform/infrastructure services. foo_service | RESOURCE PROVISIONING SERVICE | CLOUD PROVISIONING | IDENTITY PROVISIONING SERVICE | COMPUTE: CPU, MEMORY, DISK | BLOB STORAGE: STORAGE IN GB, GETS, PUTS | KEY/VAL STORAGE: STORAGE IN GB, WPS, RPS
  18. METER & CHARGEBACK: How much am I using? $$ Ability to meter allocation and utilization of resources per service, per engineering team and charge them accordingly. Enables Visibility & Accountability. Metering across Infrastructure requires standard `schema`: - ts (timestamp) - identifier - infrastructure - resource - utilization. Leverage internal visibility/observability stack. Unit price definition per resource can be difficult.
  19. DASHBOARD (SINGLE PANE OF GLASS) | METADATA | RESOURCE & CAPACITY | BUDGET, METERING & CHARGEBACK | IDENTITY & CREDENTIAL | PROVIDER APIS & ADAPTERS | REPORTING | WORKFLOWS | INFRASTRUCTURE AND PLATFORM SERVICES | DATACENTER / PUBLIC CLOUD | INTERNAL APIS | OWNERSHIP
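
The metering schema named on slide 18 (ts, identifier, infrastructure, resource, utilization) can be sketched as a record type plus a simple chargeback roll-up. This is only an illustration under stated assumptions: the `MeterRecord` and `chargeback` names, the resource unit names, the sample utilization figures, and the unit prices are all made up for the example and are not from the talk.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class MeterRecord:
    ts: int              # timestamp (epoch seconds assumed)
    identifier: str      # canonical job identifier
    infrastructure: str  # e.g. "compute" or "blob"
    resource: str        # e.g. "vcpu_hours" or "gb" (assumed unit names)
    utilization: float   # amount consumed, in the resource's unit


# Hypothetical unit prices keyed by (infrastructure, resource).
UNIT_PRICES: Dict[Tuple[str, str], float] = {
    ("compute", "vcpu_hours"): 0.5,
    ("blob", "gb"): 0.25,
}


def chargeback(
    records: Iterable[MeterRecord],
    prices: Dict[Tuple[str, str], float],
) -> Dict[str, float]:
    """Sum utilization * unit price for each job identifier."""
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        price = prices[(record.infrastructure, record.resource)]
        totals[record.identifier] += record.utilization * price
    return dict(totals)


records = [
    MeterRecord(1_700_000_000, "pin_write_service", "compute", "vcpu_hours", 80.0),
    MeterRecord(1_700_000_000, "pin_write_service", "blob", "gb", 20.0),
    MeterRecord(1_700_000_000, "foo_service", "compute", "vcpu_hours", 10.0),
]
bill = chargeback(records, UNIT_PRICES)
print(bill)
```

Keeping every infrastructure on the same record shape is what lets one roll-up function cover compute and storage alike; choosing the unit prices is the part the slide flags as difficult.
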