If you are a data scientist or a platform engineer, you can probably relate to the pains of working with the current explosive growth of data/ML technologies and tooling. With many overlapping options and a steep learning curve for each, it’s increasingly challenging for data science teams to stay productive. Many platform teams have started building an abstracted ML platform layer to support generalized ML use cases, but there are many complexities involved, especially when dealing with real-time or near-real-time ML.
In this talk, we’ll discuss why ML platforms can benefit from a simple and “invisible” abstraction. We’ll offer some evidence for why you should consider leveraging streaming technologies despite the hard challenges you may have heard about. We’ll share lessons learned (from both ML and infrastructure perspectives) about the complexities involved in building such simple abstractions, the design principles behind them, and some counterintuitive decisions you may come across along the way.
By the end of the talk, I hope data scientists will walk away with tips for being more productive with ML workflows, and platform engineers will pick up a few architectural and design tricks to help future-proof their organizations.