• Setup and operation of data pipeline components
• Dealing with back-pressure: elasticity (static vs. dynamic partitioning)
• Efficient usage of resources (utilization/TCO)
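To make the back-pressure point concrete, here is a minimal sketch, assuming a single producer and a slower consumer connected by a bounded in-memory buffer. The function name `run_pipeline` and its parameters are hypothetical and not part of any tool named in these slides. When the buffer is full, the producer blocks instead of piling up work. An elastic system would react to the same signal by adding consumers.

```python
import queue
import threading
import time


def run_pipeline(n_items, max_buffer):
    """Push n_items through a bounded buffer.

    Returns the consumed items and the deepest buffer level observed.
    """
    buffer = queue.Queue(maxsize=max_buffer)
    consumed = []
    peak_depth = 0

    def producer():
        for i in range(n_items):
            # put() blocks while the buffer is full: this is the back-pressure.
            buffer.put(i)
        buffer.put(None)  # sentinel: no more items

    def consumer():
        nonlocal peak_depth
        while True:
            peak_depth = max(peak_depth, buffer.qsize())
            item = buffer.get()
            if item is None:
                break
            time.sleep(0.001)  # simulate a consumer slower than the producer
            consumed.append(item)

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed, peak_depth
```

Calling `run_pipeline(50, 4)` delivers all 50 items in order while the buffer never holds more than 4 of them. The fast producer is throttled to the consumer's pace rather than exhausting memory.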
• One cluster for
  • stateless services such as web servers and app servers
  • stateful services such as PostgreSQL, MemSQL, Kafka, Cassandra, etc.
  • elastic data processing via Spark, Storm/Heron, Akka, etc.
  • CI/CD, for example Jenkins/Marathon
• Dynamic partitioning of your cluster, depending on your needs
• Increased utilization (10% → 80%+)
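The utilization argument above can be sketched with a small calculation. This is an illustration under stated assumptions, not a measurement: the workload names and per-slot core demands are invented, and the two functions are hypothetical helpers. Static partitioning is modeled as one silo per workload sized for that workload's own peak. Dynamic partitioning is modeled as one shared cluster sized for the peak of the combined demand.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Workload:
    name: str
    demand: List[int]  # cores needed in each time slot (made-up numbers)


def static_utilization(workloads):
    """Each workload owns a silo sized for its own peak demand."""
    slots = len(workloads[0].demand)
    capacity = sum(max(w.demand) for w in workloads)
    used = sum(sum(w.demand) for w in workloads)
    return used / (capacity * slots)


def dynamic_utilization(workloads):
    """All workloads share one cluster sized for the combined peak."""
    slots = len(workloads[0].demand)
    combined = [sum(w.demand[t] for w in workloads) for t in range(slots)]
    capacity = max(combined)
    return sum(combined) / (capacity * slots)


# Hypothetical workloads whose peaks fall in different time slots.
workloads = [
    Workload("web", [8, 2, 2, 2]),
    Workload("batch", [1, 8, 1, 1]),
    Workload("ci", [1, 1, 8, 1]),
]

if __name__ == "__main__":
    print("static :", static_utilization(workloads))
    print("dynamic:", dynamic_utilization(workloads))
```

With these invented numbers the static layout needs 24 cores and runs at 37.5% utilization, while the shared cluster needs 11 cores and runs at roughly 82%. The gain comes entirely from the peaks not coinciding. Workloads that peak at the same time would see little benefit.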
• Try to have short feedback loops
• Containers and 'The Cloud' make deployment easy; leverage them!
• Technology is the simple part of the solution: big data technologies won't fix your broken culture