Slide 1

Slide 1 text

What the HACK happened to the data?
Client Analysis 2.0 and the power of open data
by Tina & Nico

Slide 2

Slide 2 text

Research approach
● How the BFS (Swiss Federal Statistical Office) calculates energy consumption at the national level
● Segmentation

Slide 3

Slide 3 text

Example of Lucerne

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

So let’s do it for fun!

Slide 6

Slide 6 text

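Search for ‘openly available’ customer segmentation
● Gebäude mit Wohnnutzung (buildings with residential use)
● Further examples from Zurich and Lucerne
● Kennzahlen der Basler Wohnviertel und Landgemeinde (key figures for Basel's residential districts and rural municipalities)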

Slide 7

Slide 7 text

Segmentation of regions in Basel
Data: https://data.bs.ch/explore/dataset/100011/table/ (19 factors, 21 regions)
Assumption: All factors in this dataset are related to the energy needs and consumption behaviour of a region.
Limitations: There are too few regions, the information is too high-level (individual households would be much more interesting), and many characteristics are missing, possibly the most relevant ones (e.g. traffic, population density, industry, connections between the variables).
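The factors in such a dataset are measured in different units (years, ratios, percentages), so a distance-based segmentation would be dominated by whichever factor has the largest numeric range. A minimal sketch of z-score standardisation follows. It assumes three hypothetical regions and three hypothetical factor columns; none of the names or numbers are taken from the Basel dataset.

```python
from statistics import mean, pstdev

# Hypothetical values for illustration only (not taken from the Basel dataset).
# Columns: mean building age (years), age ratio, % single-person households.
regions = {
    "Region A": [85.0, 0.30, 55.0],
    "Region B": [40.0, 0.45, 35.0],
    "Region C": [60.0, 0.25, 48.0],
}


def standardise(rows):
    """Return z-scores per column so every factor has mean 0 and std 1."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return [
        # A constant column (std 0) carries no information; map it to 0.
        [(value - m) / s if s > 0 else 0.0 for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


scaled = dict(zip(regions, standardise(list(regions.values()))))
for name, values in scaled.items():
    print(name, [round(v, 2) for v in values])
```

After this step every factor contributes on the same scale, which is usually what a clustering step expects. Whether all 19 factors should be weighted equally is a separate modelling decision.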

Slide 8

Slide 8 text

Examples of potential factors
● Buildings’ mean age
● Population’s age ratio
● % of persons living in single-person households

Slide 9

Slide 9 text

Number of segments?
Based on all regions and factors in the dataset, how many segments might be useful?
E.g. elbow plot (k-means clustering, plotting the SSE for each possible number of segments)
https://towardsdatascience.com/customer-segmentation-using-k-means-clustering-d33964f238c3
Result: 3 groups
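The elbow idea can be sketched in a few lines: run k-means for several values of k, record the sum of squared errors (SSE), and look for the k after which the SSE stops dropping sharply. The sketch below is an illustration, not the code used for the slide. It assumes toy data (21 made-up points in three groups) and uses a plain Lloyd's algorithm with a deterministic farthest-first initialisation so that it is reproducible without extra libraries. In practice a library implementation such as scikit-learn's `KMeans` (which exposes the SSE as `inertia_`) would be the more usual choice.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic initialisation: start at the first point, then repeatedly
    add the point farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    return centroids


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j])) for p in points]


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (centroids, labels, sse)."""
    centroids = farthest_first(points, k)
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centroids[j])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, sse


# Toy data for illustration only: 21 points in three well-separated groups.
centres = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
offsets = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (-1, -1)]
points = [[cx + ox, cy + oy] for cx, cy in centres for ox, oy in offsets]

sse_by_k = {k: kmeans(points, k)[2] for k in range(1, 7)}
for k, sse in sse_by_k.items():
    print(f"k={k}: SSE={sse:.2f}")
```

On this toy data the SSE falls steeply up to k = 3 and only marginally afterwards, which is the "elbow". Real data rarely shows such a clean bend, so the choice of k usually remains a judgment call.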

Slide 10

Slide 10 text

Location of segments based on the example factors
● Buildings’ mean age
● Population’s age ratio
● % of persons living in single-person households

Slide 11

Slide 11 text

Potential next steps
● Define the characteristics of each segment (e.g. regions with a high number of one-person households and newer buildings -> possibly a high-potential region? But what about the red dots?)
● Find proxies for energy consumption in each region and see how they relate to our segmentation (this could be the number of devices; check data from Swisscom)
● Example question that could be answered: does the assumed high-potential region consume less or more power?
Example limitation: Since we have few regions and are missing characteristics, an outlier (e.g. a region with a large number of unemployed persons) may end up in an inappropriate segment. For example, such persons may live in regions with otherwise low potential and therefore receive no offer, although they might be more willing to spend time selecting a better provider and are also the ones who would benefit most.
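A first pass at the next steps above could be a per-segment average of each factor, with the energy proxy averaged alongside for comparison. The sketch below is an assumption-laden illustration: the region names, the segment labels, the field names and all numbers are hypothetical, and `proxy` stands in for whatever consumption proxy is eventually chosen.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records for illustration only; none of these numbers come from
# the Basel dataset or from Swisscom. "segment" would be the cluster label.
regions = [
    {"name": "Region A", "segment": 0, "single_households_pct": 58.0, "building_age": 35.0, "proxy": 120.0},
    {"name": "Region B", "segment": 0, "single_households_pct": 62.0, "building_age": 25.0, "proxy": 100.0},
    {"name": "Region C", "segment": 1, "single_households_pct": 40.0, "building_age": 70.0, "proxy": 150.0},
    {"name": "Region D", "segment": 1, "single_households_pct": 44.0, "building_age": 80.0, "proxy": 170.0},
    {"name": "Region E", "segment": 2, "single_households_pct": 30.0, "building_age": 55.0, "proxy": 90.0},
]


def segment_profiles(rows, fields):
    """Average each field per segment: {segment: {field: mean}}."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["segment"]].append(row)
    return {
        segment: {field: mean(r[field] for r in members) for field in fields}
        for segment, members in sorted(grouped.items())
    }


profiles = segment_profiles(regions, ["single_households_pct", "building_age", "proxy"])
for segment, profile in profiles.items():
    print(segment, profile)
```

Averages hide exactly the outliers mentioned in the limitation, so segments with very few regions (such as the single-region segment here) deserve a look at the individual rows as well.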

Slide 12

Slide 12 text

Libraries for visualising the data on a map
● Worldwide view: rnaturalearth is an R package to hold and facilitate interaction with Natural Earth vector map data.
● National view: geofacet is an R package that provides geofaceting functionality for 'ggplot2'. Geofaceting arranges a sequence of plots of data for different geographical entities into a grid that preserves some of the geographical orientation.

Slide 13

Slide 13 text

“Technology now allows people to connect anytime, anywhere, to anyone in the world, from almost any device. This is dramatically changing the way people work, facilitating 24/7 collaboration with colleagues who are dispersed across time zones, countries, and continents.” Michael Dell, Chairman and CEO of Dell

Slide 14

Slide 14 text

Recommendations
For a virtual setup, use tools and practices such as:
● Online code-sharing platforms, like Google Colab
● Frequent Hangouts calls
● Frequent exchange with the challenge owner