Carrier grade API Store

Presentation given at the first API Days event in Amsterdam.

In this presentation we share our experiences in building the KPN API Store platform.

Albert W. Alberts

October 16, 2018

Transcript

  1. Carrier grade API Store: why, what and how …

    Albert W. Alberts
    Amsterdam, October 16th & 17th, 2018
  2. Who is Albert Alberts

    At KPN since January 1999:
    - Previous functions: developer, designer, software architect
    - Currently architect for CloudNL VMware, API Store
    - Worked on several KPN patents
    Meetup organizer of:
    - devNetNoord, a developer community (491 members)
    - domoticaGrunn, a home automation community (199 members)
    Private: swimming, water polo, cycling
    [email protected]
  3. API Store at KPN: what to expect?

    Why we build it, what we build, how we build it …
  4. API Store: the “why”

    • Expose internal Telco services on the internet.
    • Not just our own services, but open to third-party APIs and partners.
    • On-premises instead of the cloud:
      • Security guidelines (KPN-CISO/kpn-security-policy) require a multi-layered approach.
      • Own data centers, own internal infra provider.
    • As a Telco we would like to have it carrier grade.
  5. Carrier grade, what does that mean?

    Wikipedia: In telecommunication, a "carrier grade" or "carrier class" refers to a system, or a hardware or software component that is extremely reliable, well tested and proven in its capabilities. Carrier grade systems are tested and engineered to meet or exceed "five nines" high availability standards, and provide very fast fault recovery through redundancy.
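To make "five nines" concrete, the sketch below converts a number of nines into a yearly downtime budget. It is only an illustration of the arithmetic: the function name `allowed_downtime_minutes` is made up for this example, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years are ignored (assumption)


def allowed_downtime_minutes(nines: int) -> float:
    """Yearly downtime budget for an availability with `nines` nines (5 means 99.999%)."""
    unavailability = 10 ** -nines  # e.g. 5 nines leaves a fraction of 0.00001 unavailable
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in range(2, 6):
        print(f"{n} nines: {allowed_downtime_minutes(n):.2f} minutes of downtime per year")
```

Under these assumptions, five nines leaves a little over five minutes of downtime per year, which is why fast fault recovery matters as much as avoiding faults.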
  6. Availability, the math …

    front-end availability = (API Store availability) × (back-end availability)

    Diagram labels: API Store availability, back-end service, front-end service.
    As a pass-through system we can only reduce the overall availability. Strive for lots of “nines”.
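The multiplication on this slide can be tried out with a few lines of Python. This is a minimal sketch: the function name and the sample figures (99.99% for the API Store, 99.9% for the back end) are assumptions chosen for illustration, not KPN measurements.

```python
def front_end_availability(api_store: float, back_end: float) -> float:
    """Availability seen by the API consumer when every call passes through
    the API Store to the back-end service (components in series)."""
    return api_store * back_end


if __name__ == "__main__":
    api_store = 0.9999  # hypothetical API Store availability
    back_end = 0.999    # hypothetical back-end availability
    total = front_end_availability(api_store, back_end)
    # The product is never higher than the weakest component in the chain.
    print(f"front-end availability: {total:.6f}")
```

Because both factors are at most 1, the result can never exceed the back-end availability, which is the point of the slide: the pass-through layer can only subtract.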
  7. Redundancy …

    From the definition: fast fault recovery, fail-over …
    Solutions at every layer of the stack:
    • Application: fault or error handling
    • Network: load balancer, DNS, floating IP addresses
    • Platform: clustering
    • Infrastructure: fail-over of virtual machines (vMotion)
    • Storage: SAN
    • Network: fail-over of network hardware
    • Hardware: dual NIC, dual PSU, UPS
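Why redundancy raises availability can be shown with the standard formula for components in parallel. The sketch below assumes that the redundant copies fail independently and that fail-over is instantaneous, which real systems only approximate; the function name and the 99% sample figure are illustrative.

```python
def redundant_availability(single: float, copies: int) -> float:
    """Availability of `copies` redundant instances, where the service is up
    as long as at least one instance is up (independent failures assumed)."""
    if copies < 1:
        raise ValueError("at least one instance is required")
    all_down = (1 - single) ** copies  # probability that every copy is down at once
    return 1 - all_down


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} instance(s) at 99%: {redundant_availability(0.99, n):.6f}")
```

Under these assumptions each extra copy of a 99% component adds roughly two nines, which is the motivation for applying redundancy at every layer rather than hardening a single one.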
  8. Platform choice: why Apigee?

    • Market leader
    • Offered on-premises solution
    • Local contacts
    • Good enough to be bought by
  9. Apigee platform functional blocks

    Apigee environment:
    - Edge for Private Cloud
    - Edge Analytics
    - Developer Services portal or Portal
    Diagram labels: Edge API, Analytics, Portal.
  10. Apigee components

    - Edge: Router, Zookeeper, Cassandra, OpenLDAP, Management Server, Edge UI, Message Processor
    - Analytics: Qpid Server, Postgres Server
    - Developer Portal: Drupal, Postgres Server
  11. Apigee installation topologies

    3 functional blocks, 11 different components and … 6 recommended installation topologies:
    - 1-node, All-in-One
    - 2-node
    - 5-node
    - 9-node
    - 13-node
    - 12-node (dual datacenter, DR/HA)
    Diagram: the component overview (Edge: Router, Zookeeper, Cassandra, OpenLDAP, Management Server, Edge UI, Message Processor; Analytics: Qpid Server, Postgres Server; Developer Portal: Drupal, Postgres Server).
  12. API Store at KPN: production platform

    Edge:
    - Node 1: Zookeeper, Cassandra, OpenLDAP, Management Server, Edge UI
    - Node 2: Router
    - Node 3: Router
    - Node 4: Zookeeper, Cassandra, Message Processor
    - Node 5: Zookeeper, Cassandra, Message Processor
    Analytics:
    - Node 6: Qpid Server, Postgres Server
    - Node 7: Qpid Server, Postgres Server
    Portal:
    - Developer Portal: Drupal, Postgres Server

    “Carrier grade systems are tested and engineered to meet or exceed "five nines" high availability standards, and provide very fast fault recovery through redundancy.”
  13. API Store at KPN: redundancy

    • Router and Message Processor separated
    • Dual Router setup with load balancer
    • Dual Message Processor setup
    • Zookeeper and Cassandra clustered
    • Analytics servers clustered

    Diagram (same node layout as the production platform):
    - Node 1: Zookeeper, Cassandra, OpenLDAP, Management Server, Edge UI
    - Node 2: Router
    - Node 3: Router
    - Node 4: Zookeeper, Cassandra, Message Processor
    - Node 5: Zookeeper, Cassandra, Message Processor
    - Node 6: Qpid Server, Postgres Server
    - Node 7: Qpid Server, Postgres Server
    - Developer Portal: Drupal, Postgres Server
  14. API Store at KPN: observability

    - Logging: access, errors, system use
    - Tracing: specialized logging of API execution
    - Analytics: how many calls are made
    - Monitoring: system health, CPU load, memory, disk space
    - Alerting: notify when an action is required (example shown: disk at 80%)
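The "disk at 80%" alert from this slide can be sketched with the Python standard library. This is an assumption-laden illustration, not the monitoring stack used at KPN: the function names, the default path and the 0.80 threshold as a parameter are all chosen for this example.

```python
import shutil


def disk_usage_fraction(path: str = ".") -> float:
    """Return the used fraction (0.0 to 1.0) of the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple with total, used and free bytes
    return usage.used / usage.total


def needs_alert(used_fraction: float, threshold: float = 0.80) -> bool:
    """True when usage has reached the threshold and an action is required."""
    return used_fraction >= threshold


if __name__ == "__main__":
    used = disk_usage_fraction(".")
    status = "ALERT" if needs_alert(used) else "ok"
    print(f"disk usage {used:.0%}: {status}")
```

Keeping the measurement and the threshold check in separate functions makes the alert rule testable without touching a real disk, and lets the same rule serve CPU or memory metrics.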
  15. Apigee at KPN: development & staging platform

    Two three-node environments, development and staging, drawn with the same layout.
    Diagram labels for each environment: Node 2, Router, Message Processor, Node 1, Zookeeper, Cassandra, OpenLDAP, Management Server, Edge UI, Zookeeper, Cassandra, Node 3, Qpid Server, Postgres Server, Developer Portal, Drupal, Postgres Server, Router, Message Processor.
  16. Current API deployment process

    Environments: DEV, STG, PRD. Actors: API-Developers, Content editors.
    Diagram labels: ✏ API, API push (twice), content push (twice), content.
  17. Preferred API deployment process

    Environments: DEV, STG, PRD. Actors: API-Developers, Content editors.
    Diagram labels: ✏ API, API push (twice), content push (twice), content.
  18. API Store at KPN: roadmap

    Near future:
    - Implement the preferred API deployment flow
    Long term:
    - Dual datacenter setup => DR/HA
    - Hybrid setup => on-premises combined with cloud for the external APIs
  19. Carrier grade & API Lifecycle Management: takeaways

    Availability:
    • Implement the redundancy mechanisms.
    • Use load balancers.
    • Separate production from development.
    Deployment:
    • Separate production from development.
    • Define the processes.
    • Automate (not only your workflow).