dark out and you’re trying to wrap up for the day so you can begin your long commute home. That’s when you hear it. At first it’s faint, but with each half-second the thumping begins to sound like the pace of somebody in a hurry. Not just your “oh, I forgot to send that last email” kind of hurry, either, and that worries you, because it’s coming in your direction.
fun at your expense, the brisk pace stops at your desk. You had your head down, hoping the fury would simply pass you by. But alas, the entity now hovering above your workspace, trying to catch its breath, isn’t moving. It has found its target. And it is you.
upwards. It’s a project manager. Not just any project manager, but the one managing the data integration project you and your team have been working on for the last 6 months. Now there’s usually no fear and trepidation in your daily interactions with your manager, but this time, something _feels_ different. You can’t quite put your finger on it, but you’ve experienced this sensation before. And like a bolt of lightning from Zeus himself, it hits you…
software. But it isn’t until recently that I had a profound realization that reduced all I’ve been doing for the last 20 years to simply moving data from point A to point B. Whether or not you realize it, many of you are in the same boat.
you dreaded were coming: “we overlooked something.” And for the next hour or so, the two of you unpack the problem, come up with a design, and hope you have enough time to deliver a solution by the following Thursday.
the problem to its essence, it is a simple matter of having two organizations exchange some information. But you know simple is not easy. The devil’s always in the details, so you ask for them.
our internal data with a business partner as close to real time as possible. They only have processes in place to pull data from an object store, but can do so on demand.
>
> Love, The Business

It is said that if you want good answers, you should ask good questions, and that often the answer lies within the question itself. You look at the problem statement again, this time keeping an eye out for the goal and the constraints.
our internal data with a business partner as close to real time as possible. They only have processes in place to pull data from an object store, but can do so on demand.

You manage to distill the requirements to:

- needing to share some data
- needing to determine what data is important to share
- needing to work with existing data, however it is stored today
- needing to make changes available quickly
- and finally, needing to trigger the retrieval of this important data
draw up a plan. Your organization uses AWS for all the things, so you see this as an opportunity to leverage Lambda to solve a real business problem. After a few design rounds, you end up with this.
you’ve got with your team.

“We’ll start with our data source!”, you exclaim. It is an RDS MySQL cluster with a primary and some read replicas. We’re dealing with a relational system, so chances are high that the information we need to relay will not all be in one table. Furthermore, for a number of reasons, we can’t hook into application logic at the code level to find out when interesting things happen to the information we care about. That forces us to look at the storage layer and keep an eye out for interesting changes happening there.

After some research, we find out that the primary keeps a binlog that is shared with the replicas so that they may be kept up to date. This binlog contains row-level changes happening across all tables in the database. If we also consume this binlog, we’ll know when interesting changes take place. So we’ll create a service (a binlog streamer) that continuously consumes the binlog and sends change data to a Kinesis stream.

As data is poured into the stream, we’ll have Lambda poll it and invoke a function whose job is to determine whether a change is something we care about according to our original goal. When it is, we’ll use Step Functions to orchestrate a state machine with a Lambda function for each step:

1. Go back to our original data source for additional information when necessary.
2. Upon obtaining all of the pieces of information we need to relay, put them in the object store where the business partner will eventually claim them.
3. For internal purposes, keep an audit log so that we know what was communicated to the partner. This step happens in parallel with the previous step…
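The filtering function described above can be sketched as a short Lambda handler. This is a minimal sketch, not the project’s actual code: the shape of the change payload (`table`, `type`, `row`) and the table names are assumptions about what a hypothetical binlog streamer would emit, and the Step Functions call is injected so the logic can be exercised without AWS. What is real is the Kinesis trigger shape: Lambda receives a batch under `event["Records"]`, with each payload base64-encoded in `record["kinesis"]["data"]`.

```python
import base64
import json

# Tables whose changes are worth relaying; names are illustrative assumptions.
INTERESTING_TABLES = {"orders", "order_items"}


def is_interesting(change):
    """Decide whether a single binlog change record matters to our goal."""
    return (
        change.get("table") in INTERESTING_TABLES
        and change.get("type") in {"insert", "update"}
    )


def handler(event, context, start_execution=None):
    """Entry point for a Kinesis-triggered Lambda invocation.

    `start_execution` is injected so the filter can be tested offline;
    in production it would wrap
    boto3.client("stepfunctions").start_execution for the state machine.
    """
    started = 0
    for record in event["Records"]:
        # Kinesis delivers each payload base64-encoded.
        payload = base64.b64decode(record["kinesis"]["data"])
        change = json.loads(payload)
        if is_interesting(change):
            if start_execution is not None:
                start_execution(json.dumps(change))
            started += 1
    return {"started": started}
```

Keeping `is_interesting` as a separate pure function makes the “do we care?” rule easy to unit test, which matters once local debugging gets hard.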
the team struggles not only with the paradigm shift of not having servers, or even containers, sticking around, but also with the lack of visibility into what the pieces of the architecture are doing at any given time.
introduced Step Functions to do the orchestration rather than doing it yourself with functions calling or triggering other functions. And you enabled X-Ray for traceability. These tools helped make the system more observable.
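To make the orchestration concrete, here is a sketch of what a three-step state machine like the one described earlier could look like in Amazon States Language, built as a Python dict and serialized to JSON. The state names and Lambda ARNs are placeholders, not the project’s real resources; the `Parallel` state mirrors the design decision to write the audit log alongside the delivery to the object store.

```python
import json

# Placeholder ARN template; a real deployment would use its own account/region.
LAMBDA_ARN = "arn:aws:lambda:us-east-1:123456789012:function:{}"

definition = {
    "Comment": "Relay interesting changes to the business partner.",
    "StartAt": "EnrichRecord",
    "States": {
        # Step 1: go back to the data source for any missing pieces.
        "EnrichRecord": {
            "Type": "Task",
            "Resource": LAMBDA_ARN.format("enrich-record"),
            "Next": "DeliverAndAudit",
        },
        # Steps 2 and 3 run in parallel, as the design calls for.
        "DeliverAndAudit": {
            "Type": "Parallel",
            "End": True,
            "Branches": [
                {
                    "StartAt": "PutToObjectStore",
                    "States": {
                        "PutToObjectStore": {
                            "Type": "Task",
                            "Resource": LAMBDA_ARN.format("put-to-object-store"),
                            "End": True,
                        }
                    },
                },
                {
                    "StartAt": "WriteAuditLog",
                    "States": {
                        "WriteAuditLog": {
                            "Type": "Task",
                            "Resource": LAMBDA_ARN.format("write-audit-log"),
                            "End": True,
                        }
                    },
                },
            ],
        },
    },
}

print(json.dumps(definition, indent=2))
```

Having the whole flow declared in one definition is exactly what makes it visible in the Step Functions console, instead of being buried in functions triggering other functions.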
you learned. It is the first time you’ve really leveraged “Function as a Service” to this extent. You marvel at the efficiency of it all as you look back on your accomplishments, but you also remember some of the painful parts. In particular…
lock-in, jump at the tooling.

- Debugging remote functions with a short lifespan requires a shift in mindset.
- You’ll never love metrics, logging, and tracing as much as you do when doing serverless work. These are new concerns for the traditional application developer.
- Testing locally is hard. Tools to do so are improving, however.
- Many fear “lock-in,” but you always have lock-in somewhere, and trying to use abstractions on top of vendor-specific APIs is doable but time-consuming.
- Use all the supporting services and tooling that your vendor of choice makes available to ship more quickly.
Photo by Anthony Metcalfe on Unsplash