Users leave thousands of traces per second on a successful ecommerce site. It makes practical sense to analyse this stream of trace events and react to it in real time. This is called clickstream analysis. In this talk I’ll present a software architecture based on Apache Spark that can process thousands of clickstream events per second. A product based on this architecture has been in production since mid-2015 and is still performing well. Besides Spark itself, the building blocks of the architecture are Kafka to handle the inbound event stream, Spark Streaming for initial stream processing, and Parquet as the serialization format. I’ll explain why we chose these technologies and what we learned while developing, launching and operating the product.