The implicit data parallelism in collective operations on aggregate data structures constitutes an attractive parallel programming model for functional languages. Beginning with our work on integrating nested data parallelism into Haskell, we explored a variety of approaches to array-centric data parallel programming in Haskell, experimented with a range of code generation and optimisation strategies, and targeted both multicore CPUs and GPUs. In addition to practical tools for parallel programming, the outcomes of this research programme include more widely applicable concepts, such as Haskell’s type families and stream fusion. In this talk, I will contrast the different approaches to data parallel programming that we explored. I will discuss their strengths and weaknesses and review what we have learnt in the course of exploring the various options. This includes our experience of implementing these approaches in the Glasgow Haskell Compiler as well as the experimental results that we have gathered so far. Finally, I will outline the remaining open challenges and our plans for the future.
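As a small illustration of this programming model (a sketch for this abstract, not an example taken from the talk), the code below expresses a dot product in the Accelerate embedding purely through collective array operations, zipWith and fold, so that the parallelism is implicit in the whole-array combinators rather than in explicit loops or threads; it assumes the Data.Array.Accelerate API and its reference interpreter backend.

import qualified Data.Array.Accelerate             as A
import qualified Data.Array.Accelerate.Interpreter as I  -- reference backend

-- Dot product written entirely with collective array operations:
-- the element-wise multiplication and the reduction are whole-array
-- combinators, so the parallelism is implicit in the operations
-- rather than expressed with explicit loops or threads.
dotp :: A.Acc (A.Vector Float) -> A.Acc (A.Vector Float) -> A.Acc (A.Scalar Float)
dotp xs ys = A.fold (+) 0 (A.zipWith (*) xs ys)

main :: IO ()
main = do
  let n  = 10 :: Int
      xs = A.fromList (A.Z A.:. n) [1 ..]    :: A.Vector Float
      ys = A.fromList (A.Z A.:. n) [2, 2 ..] :: A.Vector Float
  -- The reference interpreter evaluates the program sequentially;
  -- CPU and GPU backends execute the same collective program in parallel.
  print (I.run (dotp (A.use xs) (A.use ys)))

The same collective program can, in principle, be handed to different backends, which is one way the array-centric approaches separate what is computed from how it is parallelised on a particular target.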