You are in a room full of overlapping cron jobs. You can hear the screams of a dying MySQL server. An Oracle vendor is here. To the West, a door is marked “Map/Reduce”. To the East, a door is marked “Stream Processing”.
@pyr Publish & Subscribe
● Records are produced on topics
● Topics have a predefined number of partitions
● Records have a key which determines their partition
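As a rough illustration of how a key can determine a partition, here is a minimal in-memory sketch. It assumes a stable hash of the key taken modulo the partition count; `zlib.crc32` is only a stand-in for whatever hash a real broker client uses, and `partition_for`, `produce`, and the list-of-lists topic are hypothetical names invented for this example.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition index using a stable hash.

    crc32 is an assumption made for this sketch, not the hash of any
    particular broker.
    """
    return zlib.crc32(key) % num_partitions


def produce(topic: list, key: bytes, value: str) -> int:
    """Append a record to the partition chosen by its key.

    `topic` is modelled as a list of partitions, each a list of records.
    Returns the index of the partition that received the record.
    """
    index = partition_for(key, len(topic))
    topic[index].append((key, value))
    return index


# A topic with a predefined number of partitions (4 is arbitrary here).
usage_topic = [[] for _ in range(4)]

first = produce(usage_topic, b"account-42", "cpu=3")
second = produce(usage_topic, b"account-42", "cpu=5")
# Both records share a key, so they land in the same partition, in order.
```

Because the partition count is part of the computation, changing it would move existing keys to different partitions, which is one reason the count is fixed up front.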
Avoiding overbilling
● Reconciler acts as logical clock
● When supplying usage, attach a unique transaction ID
● Reject multiple transaction attempts on a single ID
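The idea of rejecting repeated attempts on a single transaction ID can be sketched as follows. This is an in-memory illustration only: the `Ledger` class, the `transaction_id` helper, and the choice to build the ID from an account name plus a reconciler tick are all assumptions made for the example, not the system described in the talk.

```python
def transaction_id(account: str, tick: int) -> str:
    """Build a unique ID from the account and the reconciler's tick.

    Assumption for this sketch: the reconciler hands out a monotonically
    increasing tick (a logical clock), so each (account, tick) pair is
    billed at most once.
    """
    return f"{account}:{tick}"


class Ledger:
    """Hypothetical billing ledger that applies each transaction ID once."""

    def __init__(self) -> None:
        self._seen = set()  # transaction IDs already applied
        self.totals = {}    # account -> accumulated usage

    def record_usage(self, txid: str, account: str, amount: int) -> bool:
        """Apply usage unless txid was seen before.

        Returns True when the usage was applied and False when the
        attempt was rejected as a duplicate.
        """
        if txid in self._seen:
            return False
        self._seen.add(txid)
        self.totals[account] = self.totals.get(account, 0) + amount
        return True


ledger = Ledger()
tx = transaction_id("acct-1", 7)
ledger.record_usage(tx, "acct-1", 10)  # applied
ledger.record_usage(tx, "acct-1", 10)  # replayed message, rejected
```

With this shape, a stream processor that replays records after a failure can resubmit the same usage safely, since the second attempt leaves the total unchanged. A real deployment would need the seen-ID check and the update to be atomic in durable storage.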
What about batch?
● Streaming doesn’t work for everything
● Sometimes throughput matters more than latency
● Building models in batch, applying with stream processing