Reducing customer churn

Controlling and Finance Audit
Reducing customer churn
Andrey Sereda, Data Analytics practice
Restricted © Siemens AG 2016

Restricted © Siemens AG 2016 08.09.2016 Page 2 Andrey Sereda / CF A DA

Siemens generates lots of data through IoT. Our divisions use it to maintain installed devices and develop better products …
[Figure: Siemens productive IoT applications (examples)]

… another big chunk of data is generated by all business activities of the company. This data is stored centrally in a data lake
[Figure: Siemens business data lake. Labels: central data lake powered by SAP HANA; 100+ local SAP ERP systems; custom connectors]

Working on data-driven solutions for our customers, we always follow a clearly defined and aligned approach: the Smart-Data-Cycle
1. Understand the problem, scope and build hypotheses: ask the right questions
2. Measure elements of the business case / hypotheses: extract and use the right data, and only this data
3. Analyze data: apply various appropriate algorithms (pluralistic modeling) and derive the best solution
4. Test continuously: establish ongoing monitoring of the solution quality and optimization measures
5. Translate analysis results into business impact: find the best way to implement the analysis results to create tangible impact
Then: measure, reflect, ask new questions

Our view of customer churn changed during the setup phase. A formal churn definition is a cornerstone of the modeling
Before we initiated the project…
• A customer churns when either
  • the contract is not renewed, or
  • we do not sell to the customer for XX months
• When we decide internally that the customer has been lost, we go out and do whatever it takes to win the customer back
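The rule-based definition on this slide can be written down as a small predicate. This is only a sketch: the `CustomerStatus` fields are hypothetical names, and because the slide leaves the inactivity threshold open ("XX months"), the function takes it as a required argument instead of assuming a value.

```python
from dataclasses import dataclass


@dataclass
class CustomerStatus:
    """Hypothetical per-customer snapshot; field names are illustrative."""
    contract_renewed: bool
    months_since_last_sale: int


def is_churned(status: CustomerStatus, inactivity_months: int) -> bool:
    """Churn if the contract was not renewed, or if nothing was sold
    to the customer for at least `inactivity_months` months.

    The slide does not state the threshold, so the caller supplies it.
    """
    if not status.contract_renewed:
        return True
    return status.months_since_last_sale >= inactivity_months


# The threshold of 12 months below is an arbitrary example value,
# not a number taken from the talk.
print(is_churned(CustomerStatus(True, 3), inactivity_months=12))
print(is_churned(CustomerStatus(False, 3), inactivity_months=12))
```

A single global rule like this is exactly what the talk later moves away from, since one threshold rarely fits every business.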

…and after we carefully thought it through
• Siemens is HUGE; we will never have a uniform churn definition
• Instead: we forecast order placement for each customer in the short term (3 to 6 months)
• Whenever a particular customer deviates from the predicted pattern, we do something about it
• And it is OK if we are not accurate all the time; we shall get the majority right

Requirements for the end product are defined by the users and their demands in day-to-day business tasks
Management
• Big-picture view (depending on the position, from hundreds to tens of thousands of customers)
• Reports on customer dynamics in the past and future, tailored to the Siemens internal reporting periods
Sales force
• Detailed view (usually working with ten to a hundred customers)
• List of customers to be approached today
End goal for the app
• Provide both perspectives, combining a retrospective view (BI) with insights into the short- and mid-term future (BA)
• Allow real-time filtering and aggregation of details
• BUT: no need for real-time forecasts or data analysis

With the problem defined, we could start looking into the data collected in the HANA data lake
Data insights / data management
Tools:
• HANA cluster
Insights:
• Transactional variables carry a lot of information for the order forecast
• Customer master data is of only limited value
• An automated data management pipeline is crucial already at this step
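To illustrate what "transactional variables" could look like, here is a minimal sketch that aggregates raw order records into per-customer features. The input layout and the feature names (`n_orders`, `total_amount`, `days_since_last_order`) are assumptions made for this example; the talk does not list the features that were actually used.

```python
from collections import defaultdict
from datetime import date


def transactional_features(orders, as_of):
    """Aggregate raw orders into per-customer features.

    orders: iterable of (customer_id, order_date, amount) tuples.
    as_of:  reference date; orders after it are ignored.
    Returns {customer_id: {"n_orders", "total_amount", "days_since_last_order"}}.
    """
    grouped = defaultdict(list)
    for customer_id, order_date, amount in orders:
        if order_date <= as_of:
            grouped[customer_id].append((order_date, amount))

    features = {}
    for customer_id, rows in grouped.items():
        last_order = max(d for d, _ in rows)
        features[customer_id] = {
            "n_orders": len(rows),
            "total_amount": sum(a for _, a in rows),
            "days_since_last_order": (as_of - last_order).days,
        }
    return features


# Made-up sample data for illustration only.
orders = [
    ("A", date(2016, 1, 10), 500.0),
    ("A", date(2016, 3, 5), 250.0),
    ("B", date(2016, 2, 20), 1000.0),
]
feats = transactional_features(orders, as_of=date(2016, 3, 31))
print(feats["A"])
```

In the setup described in the talk this kind of aggregation runs inside HANA; the Python version only shows the shape of the computation.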

With SAP HANA we had a great solution for data processing, but not for the machine learning part of the envisioned application
• At Siemens, we use an SAP HANA based solution for the data lake with all business-related data
• SAP offers native integration of standard R (version 2.15) as a separate server with single-threaded processing only

We could have scaled the R server, but this does not address the single-thread problem
Option 1: Scale vertically
• Install the server on better hardware with more RAM
• Very limited scalability, no parallel computations

A cluster of R instances / servers allows for parallel processing; however, special skills are required
Option 2: Scale horizontally
• Use a cluster of R servers
• Better scalability, but special skills and packages needed

We decided in favor of the scale-out option: use R as an interface to work with a distributed framework from H2O
Option 1: Scale vertically
• Install the server on better hardware with more RAM
• Very limited scalability, no parallel computations
Option 2: Scale horizontally
• Use a cluster of R servers
• Better scalability, but special skills and packages needed
Option 3: Scale out
• Use R as an interface, calculate in other software (we use the H2O framework to scale and distribute)
• Very good scalability, no special skills needed

With a clear idea of the productive infrastructure in mind, we had all the instruments to proceed with data mining
Modeling / proof-of-concept
Tools:
• Local instance of Microsoft R and H2O
Insights:
• Random hyperparameter search rocks
• Use a metric tailored to the needs of the end user to decide on the best model
• GBM provides the best results, followed by RF and deep neural networks
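The two modeling insights, random hyperparameter search and a user-oriented metric, can be sketched in plain Python. This is not the H2O setup from the talk. `precision_at_k` is one assumed example of a metric that fits a sales user who works through a short list of customers, and `toy_evaluate` is a stand-in for "train a model with these parameters and score it on validation data". The parameter names in `space` are illustrative.

```python
import random


def precision_at_k(scored, k):
    """Share of customers who really ordered among the k highest-scored ones.

    scored: list of (predicted_probability, placed_order) pairs.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    return sum(1 for _, placed in top if placed) / k


def random_search(param_space, evaluate, n_trials, seed=0):
    """Sample `n_trials` random parameter combinations and keep the best.

    param_space: {name: list of candidate values}.
    evaluate:    callable(params) -> score, higher is better.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.choice(values) for name, values in param_space.items()}
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Illustrative search space; names and values are assumptions.
space = {
    "max_depth": [3, 5, 7, 9],
    "learn_rate": [0.01, 0.05, 0.1],
    "ntrees": [50, 100, 200],
}


def toy_evaluate(params):
    # Stand-in objective so the sketch runs without a model:
    # it simply prefers max_depth near 5 and learn_rate near 0.05.
    return -abs(params["max_depth"] - 5) - abs(params["learn_rate"] - 0.05)


best, score = random_search(space, toy_evaluate, n_trials=20)
print(best, score)
```

In a real run, `evaluate` would fit the model and return something like `precision_at_k` on held-out customers, so that the winning model is the one that produces the most useful call list rather than the best generic accuracy.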

In two months, we went from a vision to an end-user-ready pilot product
Production
Tools:
• Fully automated bundle of HANA, R and H2O in the Siemens data center
Insights:
• Simplicity is king: no ensembles if possible, models should be simple
• Scalability is queen: parallel processing and full automation to achieve the highest speed possible

And here is what it looks like in production
Pipeline (diagram labels, in slide order):
• Raw data from SAP systems
• Data preparation procedures
• Generated features
• Modeling
• Per-customer order score
• Chart data
• Final model as R / POJO object
QV report
• Current solution to deliver the app to the end customer
• Can be replaced with SAP UI5, thus hosting the complete application inside one platform
• All analytics delivered inside the SAP HANA platform

Dry results of data analyses are translated into business language, because the business user usually does not speak the language of statistics
[Figure: matrix with axes "Order placement probability, as of current period" and "Order placement probability, predicted two periods ago"]
Step 1: Estimate the order placement probability for all customers at two points in time: the current period, and the current period minus the forecast horizon
Step 2: Capture the dynamics of the order probability and compare them to the actual customer behavior
Step 3: Assign all customers to one of the quadrants of the matrix and decide on the measures

We split all customers into four groups based on the predicted probability to place an order and their actual purchasing behavior
[Figure: matrix with axes "Order placement probability, as of current period" and "Order placement probability, predicted two periods ago"]
• “Hidden chances”: a rising probability of order placement shows that the customer is about to place an order
• “Threats”: a decreasing probability of order placement, coupled with the actual buying behavior (no order in the last two months), points to a deviation from the order placement pattern
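The quadrant assignment can be sketched as a small function over the two probabilities and the recent buying behavior. Several things here are assumptions: the 0.5 cut-off, the function and argument names, and every label other than "hidden chance" and "threat", since the slides name only those two groups.

```python
def assign_quadrant(p_past, p_now, ordered_recently, threshold=0.5):
    """Place a customer in the probability matrix.

    p_past:           order probability predicted two periods ago.
    p_now:            order probability as of the current period.
    ordered_recently: True if the customer ordered in the last two months.
    threshold:        assumed cut-off between "low" and "high" probability.
    """
    past_high = p_past >= threshold
    now_high = p_now >= threshold

    if not past_high and now_high:
        # Rising probability: the customer may be about to order.
        return "hidden chance"
    if past_high and not now_high:
        # Falling probability counts as a threat only when the actual
        # behavior confirms it (no recent order).
        return "threat" if not ordered_recently else "falling, still ordering"
    # Placeholder names for the two quadrants the slides do not label.
    return "stable high" if now_high else "stable low"


print(assign_quadrant(0.2, 0.8, ordered_recently=False))  # hidden chance
print(assign_quadrant(0.8, 0.2, ordered_recently=False))  # threat
```

Keeping the behavior check separate from the probability comparison mirrors the slide: the model output alone does not raise a threat, it has to coincide with a missing order.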

Recap: working on data-driven solutions for our customers, we always follow a clearly defined and aligned approach, the Smart-Data-Cycle
1. Understand the problem, scope and build hypotheses
2. Measure elements of the business case / hypotheses
3. Analyze data
4. Test continuously
5. Translate analysis results into business impact
Then: measure, reflect, ask new questions

Q&A: Any questions?


MunichDataGeeks
