"Talk Data to Me - the art of storytelling" by Diana Pholo

PyCon ZA
October 11, 2019

Collecting data is now seen as a very important aspect of business. Many companies therefore invest in solutions such as Business Intelligence tools, spreadsheets and dashboards in an attempt to extract useful information from their data. However, these tools still fail to reveal what is hidden in the data because they do not bring out its underlying significance. This is where data storytelling comes in. It helps to visualise information, understand the narrative behind it and share it.

This talk is aimed at anyone who wishes to get started with storytelling using data. What do you look for? How do you present your findings to stakeholders? Together, we will have a look at common visualisation tools provided by Matplotlib, Pandas and Seaborn (accompanied by code snippets and sample outputs). We will then look at the story behind the Harambee youth employment journey data. (Data provided by Harambee Youth Employment Accelerator.)
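As a flavour of the kind of snippet the abstract refers to, here is a minimal sketch; it is not code taken from the slides. It uses pandas' plotting helpers, which draw with Matplotlib, on a small made-up table. The column names `gender`, `age` and `score` and all values are hypothetical and are not the Harambee data.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data for illustration only; NOT the Harambee dataset.
df = pd.DataFrame({
    "gender": ["F", "M", "F", "M", "F", "M", "F", "M"],
    "age": [21, 24, 23, 29, 26, 22, 30, 27],
    "score": [55, 61, 70, 48, 66, 73, 59, 64],
})

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

# Feature distribution: histogram of a single numeric column.
df["age"].plot.hist(bins=5, ax=axes[0], title="Age distribution")

# Feature-feature relationship: a numeric column split by a category.
df.boxplot(column="score", by="gender", ax=axes[1])

# Feature-feature relationship: two numeric columns against each other.
df.plot.scatter(x="age", y="score", ax=axes[2], title="Score vs age")

# Pairwise correlation between the numeric columns (a 2x2 table here).
corr = df[["age", "score"]].corr()
print(corr)

fig.tight_layout()
```

In a notebook, dropping the `matplotlib.use("Agg")` line lets the figure display inline; `fig.savefig(...)` writes it to disk instead. The correlation table could also be drawn as a heatmap, for instance with Seaborn's `heatmap` function.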




  1. Talk Data to Me: The Art of Storytelling • Diana Pholo • Predictive Insights

  2. Introduction • Data is just data if we do not effectively visualize and communicate it.

  3. Story components • Situation • Problem • Insights • Next

  4. The Background

  5. Background • harambee.co.za

  6. The Problem

  7. Problem Definition •

  8. Finding the story

  9. Exploring the data

  10. Exploring the data (cont.) • Nice tool: pandas ◦ head() ◦ tail() ◦ describe() ◦ isnull() ◦ etc.

  11. Exploring the data (cont.) •

  12. Exploring the data (cont.) •

  13. Exploring the data (cont.) •

  14. Exploring the data (cont.) •

  15. Data Wrangling • Examples: ◦ Convert data types ◦ Dealing with NAs ◦ Engineer features

  16. Data Wrangling (cont.) • E.g.: Type conversion

  17. Data Wrangling (cont.) • E.g.: Feature Engineering

  18. Feature Distributions • Example: boxplot with Pandas

  19. Feature Distributions (cont.) • Example: Histograms

  20. Feature Distributions (cont.) • Histogram with Pandas

  21. Feature-Feature Relationships •

  22. Feature-Feature Relationships (cont.) • Example: boxplot with Pandas

  23. Feature-Feature Relationships • Correlation

  24. Feature-Feature Relationships • Heatmaps

  25. Feature-Feature Relationships • Barplot

  26. Feature-Feature Relationships (cont.) • Scatter plots

  27. Real-life example

  28. Background • South African labour market more favourable to men • Harambee takes in more women than men

  29. Problem • Percentage of women finding employment is still lower

  30. Insights • Example: Women generally have more responsibilities

  31. What’s next? • Provide child care • Promote inclusive workplace culture • Teach shared responsibilities

  32. In Conclusion

  33. Recap •

  34. Best practices • Create interesting presentation • Avoid unnecessary details • Define & Understand Audience • Humanize data

  35. Don’t • Determine narrative beforehand • Spend more time on presentation than questioning results validity • Analyse data without big picture • Use storytelling as a shortcut
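
The pandas helpers named on slide 10 and the wrangling steps listed on slides 15 to 17 can be sketched roughly as follows. This is an illustrative example, not the code shown in the talk: the table is made up, and every column name (`candidate_id`, `birth_year`, `signup_date`, `placed_date` and so on) is a hypothetical stand-in rather than Harambee's actual schema.

```python
import pandas as pd

# Hypothetical toy records for illustration only; NOT the Harambee dataset.
df = pd.DataFrame({
    "candidate_id": [1, 2, 3, 4, 5],
    "gender": ["F", "M", "F", None, "M"],
    "birth_year": ["1995", "1992", "1998", "1990", "1996"],
    "signup_date": ["2018-01-15", "2018-03-02", "2018-03-20",
                    "2018-06-11", "2018-07-30"],
    "placed_date": ["2018-04-01", None, "2018-05-05", "2018-09-01", None],
})

# Exploring the data: first rows, last rows, summary, missing values.
print(df.head(3))
print(df.tail(2))
print(df.describe(include="all"))
missing_per_column = df.isnull().sum()
print(missing_per_column)

# Type conversion: strings to integers and to datetimes.
df["birth_year"] = df["birth_year"].astype(int)
df["signup_date"] = pd.to_datetime(df["signup_date"])
df["placed_date"] = pd.to_datetime(df["placed_date"])  # None becomes NaT

# Dealing with NAs: label a missing category explicitly.
df["gender"] = df["gender"].fillna("Unknown")

# Feature engineering: derive new columns from existing ones.
df["was_placed"] = df["placed_date"].notna()
df["days_to_placement"] = (df["placed_date"] - df["signup_date"]).dt.days
df["age_at_signup"] = df["signup_date"].dt.year - df["birth_year"]

print(df[["candidate_id", "was_placed", "days_to_placement", "age_at_signup"]])
```

Filling missing categories with an explicit label, rather than dropping the rows, is only one possible choice; which treatment is appropriate depends on why the values are missing.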