The talk outlines the evolution of a Node.js application from a proof-of-concept implementation to a mature, prospering product that earns revenue and scales to millions of customers.
What is this talk about?
● The stories and lessons learned from building a greenfield project in an enterprise setting
● An overview of the best practices and tools that we have adopted over the past years
● This talk is for you if:
  ○ you'd like to be more familiar with building and evolving Node.js services - both from a technical and an organizational point of view,
  ○ you'd like to build greenfield projects in the near future using Node.js.
What is this talk about?
"A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over, beginning with a working simple system."
John Gall
What do we build?
● A personalization engine:
  ○ exposing a REST API,
  ○ shipping an admin interface for (mostly) POs,
  ○ all built on top of a machine learning stack.
Goals
● Build a personalization platform that can be used by other teams
● For the initial implementation:
  ○ Increase user engagement
    ■ measured by the click-through rate of calls to action
[Timeline: April 2016, April 2017, April 2018]
Proof of concept (1/2)
● a disposable piece of code to prove a point
  ○ most probably won't take into account
    ■ scalability,
    ■ security,
    ■ error handling
Proof of concept (2/2)
● it may easily get miscommunicated
  ○ it may set false expectations / tell the wrong story on the status of the project
● as a consequence, it may get incorporated into production systems
● providing the functionality is not enough - it must be a viable application
Tracer code - where does the term come from?
Tracer ammunition (tracers) are bullets that are built with a small pyrotechnic charge in their base. Ignited by the burning powder, the pyrotechnic composition burns very brightly, making the projectile trajectory visible to the naked eye during daylight, and very bright during nighttime. This enables the shooter to make aiming corrections without observing the impact of the rounds fired and without using the sights of the weapon.
Tracer code - how does it apply to software?
● the main components built in the early stages of a project will be used in the production environment
  ○ this gives you an opportunity to test the architecture, and an idea of how difficult adding new functionality will be
● later on, the architecture can be evolved, just as the trajectory of tracer rounds can be corrected
● the tracer code approach requires user stories to be in place; it does not allow for exploration
First look at the service we built - with a single downstream dependency
[Architecture diagram: Card Catalog; JSON {} (contains Card data); Domains API]
Graceful shutdowns
● The process of freeing up the resources used by the application before termination, such as
  ○ database connections,
  ○ or file descriptors
● Terminus
  ○ Adds graceful shutdown and Kubernetes readiness / liveness checks for any HTTP application
https://www.npmjs.com/package/@godaddy/terminus
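
To make the idea concrete, here is a minimal hand-rolled sketch of a graceful shutdown using only Node.js built-ins. It does not use the Terminus API; the helper name `createGracefulShutdown`, the `cleanupTasks` list and the 10-second timeout are assumptions made for illustration.

```javascript
'use strict';

// Hypothetical helper: returns a function that shuts the service down once.
// `cleanupTasks` are async functions that release resources (DB pools, files).
function createGracefulShutdown(server, cleanupTasks, { timeoutMs = 10000 } = {}) {
  let shuttingDown = false;

  return async function shutdown(signal) {
    if (shuttingDown) return false; // ignore repeated signals
    shuttingDown = true;

    // Safety net: force the exit if cleanup hangs for too long.
    const timer = setTimeout(() => process.exit(1), timeoutMs);
    timer.unref();

    // Stop accepting new connections and wait for in-flight requests.
    await new Promise((resolve) => server.close(resolve));

    // Release the remaining resources one by one.
    for (const task of cleanupTasks) {
      await task();
    }

    clearTimeout(timer);
    return true;
  };
}

// Possible wiring (left as a comment so the sketch has no side effects):
//
//   const http = require('node:http');
//   const server = http.createServer((req, res) => res.end('ok')).listen(3000);
//   const stop = createGracefulShutdown(server, [() => db.end()]); // `db` is hypothetical
//   for (const signal of ['SIGTERM', 'SIGINT']) {
//     process.once(signal, () => stop(signal).then(() => process.exit(0)));
//   }
```

Kubernetes sends SIGTERM before killing a pod, which is why the wiring listens for that signal; a library such as Terminus packages this pattern together with health checks.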
Service architecture with Card Inventory
[Architecture diagram: Card Catalog; JSON {}; Domains API; Card Schemas (in Ceph), generated on demand; Card Inventory (GDocs)]
Better, but
● error-prone
  ○ easy to edit the wrong row
● no visual feedback on the card being edited
● card schemas are generated on demand, not on every change
Card Authoring Tool
● The need emerged for a CMS-like solution to manage cards
● Requirements
  ○ Version Card Schemas
  ○ Browse all the Card Definitions in the system
  ○ Track Card Definition state: has unpublished changes, versus content identical to the published content
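
One way to track the "has unpublished changes" state is to compare a content hash of the draft with the hash stored at publish time. The sketch below assumes Card Definitions are plain JSON objects; the function names and the state labels `in-sync` and `has-unpublished-changes` are hypothetical, not the tool's actual vocabulary.

```javascript
'use strict';
const { createHash } = require('node:crypto');

// Serialise with sorted keys so that key order does not change the hash.
function canonicalize(value) {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalize).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const entries = Object.keys(value)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalize(value[key])}`);
    return `{${entries.join(',')}}`;
  }
  return JSON.stringify(value);
}

// SHA-256 fingerprint of a Card Definition's content.
function contentHash(definition) {
  return createHash('sha256').update(canonicalize(definition)).digest('hex');
}

// Hypothetical state labels; `publishedHash` is the hash saved on publish.
function definitionState(draft, publishedHash) {
  return contentHash(draft) === publishedHash
    ? 'in-sync'
    : 'has-unpublished-changes';
}
```

Storing only the hash of the published version keeps the comparison cheap, at the cost of not being able to show what changed.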
What are PDRs?
● Stands for Pre-Development Review
● They are first and foremost discussions - pull requests where project members collaborate
● We use them to
  ○ investigate new technologies
  ○ compare solutions
  ○ in short: to learn.
Service architecture with Card Authoring Tool
[Architecture diagram: Card Catalog; JSON {}; Domains API; Card Schemas (in Ceph), generated on each change; Card Definitions (in GHE); Card Authoring Tool]
[Architecture diagram: Card Catalog; JSON {}; Card Schemas (in Ceph), generated on each change; Card Definitions (in GHE); Card Authoring Tool; Domains API (shown five times)]
Introducing a unified API for partner teams
● Getting the same type of data from different downstream services soon became an issue
  ○ We had to implement custom logic for each new partner team to collect information (sometimes calling multiple endpoints)
● Standardized Entity API
  ○ Each partner team has to expose an endpoint that returns data in the same format
  ○ Adding a new partner team became a single-line code change
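
The "single-line code change" can be pictured as a registry of partner endpoints that all return the same shape. This is a sketch under assumptions: the registry name, the example.com URLs, the `customerId` query parameter and the injected `fetchJson` function are all hypothetical.

```javascript
'use strict';

// Hypothetical registry: partner name -> Entity endpoint.
// Onboarding a new partner team means adding one line here.
const PARTNER_ENDPOINTS = {
  domains: 'https://domains.example.com/entities',
  hosting: 'https://hosting.example.com/entities',
};

// `fetchJson(url)` is injected so the sketch needs no HTTP client;
// it is assumed to resolve to an array of entities in the shared format.
async function fetchEntities(customerId, fetchJson, endpoints = PARTNER_ENDPOINTS) {
  const results = await Promise.allSettled(
    Object.entries(endpoints).map(async ([partner, baseUrl]) => {
      const url = `${baseUrl}?customerId=${encodeURIComponent(customerId)}`;
      const entities = await fetchJson(url);
      // Tag each entity with the partner it came from.
      return entities.map((entity) => ({ partner, ...entity }));
    })
  );

  // A failing partner should not take down the whole response.
  return results
    .filter((result) => result.status === 'fulfilled')
    .flatMap((result) => result.value);
}
```

Because every partner returns the same format, no per-partner mapping code is needed; the only partner-specific piece left is the URL.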
[Architecture diagram: Card Catalog; JSON {}; Entity Service; Card Schemas (in Ceph), generated on each change; Card Definitions (in GHE); Card Authoring Tool; all partner APIs]
Canary deployments
● A canary deployment is a technique to reduce the risk of introducing a new version in production by slowly rolling out the change
  ○ Capture and monitor metrics, and roll back if needed
  ○ Reduces the user impact of changes
https://martinfowler.com/bliki/CanaryRelease.html
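
The two halves of a canary rollout, sending a small share of traffic to the new version and deciding when to roll back, can be sketched as two small functions. The names, the error-rate metric and the thresholds below are assumptions for illustration; in practice this logic usually lives in the load balancer or deployment tooling.

```javascript
'use strict';

// Route a request to the canary with probability `canaryWeight` (0..1).
// `random` is injectable to keep the function deterministic in tests.
function pickVersion(canaryWeight, random = Math.random) {
  return random() < canaryWeight ? 'canary' : 'stable';
}

// Decide whether to roll back by comparing error rates.
// `stable` and `canary` are { requests, errors } counters (hypothetical shape).
function shouldRollBack(stable, canary, { maxErrorRateDelta = 0.01, minRequests = 100 } = {}) {
  // Not enough canary traffic yet to draw a conclusion.
  if (canary.requests < minRequests) return false;

  const stableRate = stable.requests === 0 ? 0 : stable.errors / stable.requests;
  const canaryRate = canary.errors / canary.requests;
  return canaryRate - stableRate > maxErrorRateDelta;
}
```

Raising `canaryWeight` step by step while `shouldRollBack` stays false is the "slow rollout" part of the technique.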
npm ci
● similar to npm install, except it's meant to be used in automated environments
  ○ npm ci can only install entire projects at a time: individual dependencies cannot be added with this command,
  ○ npm ci will never write to package.json or any of the package-locks: installs are essentially frozen
● as a result, performance is improved: it can be 40% faster than npm install or yarn
https://blog.npmjs.org/post/171556855892/introducing-npm-ci-for-faster-more-reliable
Challenges of moving to AWS
● Some partner APIs are not exposed on the public network
  ○ We need a way to "talk home"
● Dedicated infrastructure team vs. "you build it, you run it"
Talking home
● Two clients running in OpenStack
● The OpenVPN servers running in Amazon ECS
● CloudWatch is used to monitor whether the OpenVPN server is up and running
  ○ If it is not, it triggers a Lambda function which replaces the OpenVPN primary with the secondary
  ○ The old primary is recovered by ECS and put back into rotation
[Architecture diagram: Card Catalog; JSON {}; Entity Service; Card Schemas (in S3), generated on each change; Card Definitions (in GHE); Card Authoring Tool; all partner APIs, through the OpenVPN bridge]
You build it, you run it
● Encourages ownership and accountability, which leads to more independent and responsible teammates
● Leads to operational excellence
Ark by Heptio
● Ark gives you tools to back up and restore your Kubernetes cluster resources and persistent volumes:
  ○ Take backups of your cluster and restore in case of loss
  ○ Copy cluster resources across cloud providers
  ○ Replicate your production environment for development and testing environments
https://github.com/heptio/ark
Skaffold
● Skaffold is a command line tool that facilitates continuous development for Kubernetes applications
  ○ Detect changes in your source code and automatically build/push/deploy
  ○ Support for multiple application components: build and deploy only the pieces of your stack that have changed
  ○ Deploy regularly when saving files, or run one-off deployments using the same configuration
https://github.com/GoogleContainerTools/skaffold
The future
● Moving to AWS Direct Connect
● Moving from a pull-based architecture to a push-based one for Entity Service
  ○ Partner teams need to push data whenever there is a change / new record
    ■ Improved latency
    ■ Simplified data flow
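
The latency gain of a push-based flow comes from serving reads out of local state instead of calling partners on every request. A minimal in-memory sketch follows, assuming a hypothetical record shape of `{ partner, customerId, id, ...data }`; a real Entity Service would use a persistent store and authenticate the pushes.

```javascript
'use strict';

// Hypothetical local store that partner teams push into.
class EntityStore {
  constructor() {
    // customerId -> Map of "partner:id" -> entity
    this.byCustomer = new Map();
  }

  // Called whenever a partner pushes a new or changed record.
  upsert({ partner, customerId, id, ...data }) {
    if (!this.byCustomer.has(customerId)) {
      this.byCustomer.set(customerId, new Map());
    }
    this.byCustomer.get(customerId).set(`${partner}:${id}`, { partner, id, ...data });
  }

  // Request path: answered from local state, with no downstream call.
  entitiesFor(customerId) {
    const entities = this.byCustomer.get(customerId);
    return entities ? [...entities.values()] : [];
  }
}
```

With this shape, a pushed update for an existing record simply overwrites the previous version, so readers always see the latest data the partner sent.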