
What the HACK happened to the data?

March 07, 2020


Presentation by Tina & Nico at the #energyhack2020 event which I co-organized.




  1. What the HACK happened to the data? by Tina & Nico

    Client Analysis 2.0 and the power of open data
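  2. Search for ‘openly available’ customer segmentation

    • Buildings with residential use (Gebäude mit Wohnnutzung)
    • Key figures for Basel's residential districts and rural municipalities (Kennzahlen der Basler Wohnviertel und Landgemeinde)
    • Further examples of Zurich and Lucerne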
  3. Segmentation of regions in Basel

    Data: https://data.bs.ch/explore/dataset/100011/table/ (19 factors, 21 regions)
    Assumption: The factors in this dataset are all related to the energy needs and consumption behaviour of a region.
    Limitations: There are not enough regions, the information is too high-level (individual households would be much more interesting), and many characteristics, possibly the most relevant ones, are missing (e.g. traffic, population density, industry, connections between the variables).
  4. Number of segments?

    Based on all regions and factors in the dataset, how many segments might be useful?
    E.g. elbow plot (k-means clustering, plotting SSE for possible number of segments)
    https://towardsdatascience.com/customer-segmentation-using-k-means-clustering-d33964f238c3
    3 groups
  5. Location of segments based on the example factors

    • Buildings’ mean age
    • Populations’ age ratio
    • % of persons living in single households
  6. Potential next steps

    • Defining the characteristics of a segment (e.g. regions with a high number of one-person households and newer buildings -> possibly a high-potential region? But what about the red dots?)
    • Finding proxies for energy consumption in each region and seeing how they relate to our segmentation (could be the number of devices; check data from Swisscom)
    • Example question which could be answered: Is the assumed high-potential region consuming less or more power?
    Example limitation: Since we do not have many regions and are missing characteristics, an outlier (e.g. a region with a great number of unemployed persons) may end up in an inappropriate segment. For example, such persons may live in regions with otherwise low potential and therefore receive no offer, although they would be more willing to spend time selecting a better provider and have the greatest need to benefit.
  7. Libraries for visualising the data on a map

    • Worldwide view: rnaturalearth is an R package to hold and facilitate interaction with Natural Earth vector map data.
    • National view: geofacet provides geofaceting functionality for 'ggplot2'. Geofaceting arranges a sequence of plots of data for different geographical entities into a grid that preserves some of the geographical orientation.
  8. “Technology now allows people to connect anytime, anywhere, to anyone in the world, from almost any device. This is dramatically changing the way people work, facilitating 24/7 collaboration with colleagues who are dispersed across time zones, countries, and continents.”

    Michael Dell, Chairman and CEO of Dell
  9. Recommendations

    For a virtual setup, use tools and practices like:
    • Online code-sharing platforms, like Google Colab
    • Frequent hangout calls
    • Frequent exchange with the challenge owner
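
The elbow approach mentioned on slide 4 (k-means clustering, then comparing the SSE across candidate numbers of segments) can be sketched roughly as follows. This is a minimal illustration and not the presenters' code: the 21 regions and their three factors are synthetic stand-ins generated here, the helper names (`standardize`, `kmeans`, `best_sse`) are hypothetical, and the z-scoring step is an assumption added because k-means is sensitive to the scale of each factor.

```python
import random

def standardize(rows):
    """Z-score each column so that no single factor dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Population standard deviation; fall back to 1.0 for a constant column.
    stds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, seed=0, iters=100):
    """Plain Lloyd's algorithm; returns (labels, centroids, sse)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, sse

def best_sse(points, k, restarts=20):
    """Lowest SSE over several random starts, to reduce bad local optima."""
    return min(kmeans(points, k, seed=s)[2] for s in range(restarts))

# Synthetic stand-in data (ASSUMPTION): 21 regions, 3 illustrative factors
# (mean building age, age ratio, % single households), three loose groups.
rng = random.Random(42)
centers = [(30.0, 0.20, 55.0), (60.0, 0.35, 40.0), (90.0, 0.50, 25.0)]
noise = (5.0, 0.03, 4.0)
regions = [[rng.gauss(c, s) for c, s in zip(center, noise)]
           for center in centers for _ in range(7)]
points = standardize(regions)

# SSE per candidate number of segments; the "elbow" is where the drop flattens.
sse_by_k = {k: best_sse(points, k) for k in range(1, 7)}
for k, sse in sse_by_k.items():
    print(f"k={k}: SSE={sse:.2f}")
```

Plotting `sse_by_k` against k gives the elbow plot. The centroids returned by `kmeans` are the per-segment averages of the standardized factors, which may serve as a starting point for the "characteristics of a segment" step on slide 6.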