Dask is a Python based flexible parallel computing library. It enables interactive data analysis on large datasets, and scales from your laptop to a cluster. This parallelization can speed up your analysis, but if your compute nodes are sat idle you can end up burning a lot of money.
Dask has an interface through which it can ask for more / less resources. We've built an example system, using cluster management tools like Kubernetes and public cloud infrastructure, that allows you to maximize the amount of compute you get when you need it while also minimizing cost.