Slide 1

Slide 1 text

Machine Learning at Zopa
7 December 2017
Vlasios Vasileiou, Head of Data and Data Science

Slide 2

Slide 2 text

Simple loans. Smart investments.
[Diagram labels: Investors, Invests, Loans, Borrowers, Repayments, Interest + capital]

Slide 3

Slide 3 text

A pioneering financial services company
• World’s 1st peer-to-peer lending platform in 2004
• £2.8 billion lent to date. Strong annual growth >50%
• 246,000 people have taken a Zopa loan
• 59,000 actively invest through Zopa

Slide 4

Slide 4 text

Strong, consistent growth
[Chart: amount lent and forecast by year, 2005 to 2020; vertical axis marked at £1bn and £2bn]

Slide 5

Slide 5 text

ML & Advanced Analytics have been a strong supporter of our growth
✓ Credit Risk Assessment
✓ Onboarding Funnel Simulation
✓ Pricing Optimization
✓ Fraud Identification
✓ Document Forensics
✓ Optimized Prospect and Existing Customer Marketing
✓ Income and Rent Estimation for Affordability Evaluation

Slide 7

Slide 7 text

ML & Advanced Analytics have been a strong supporter of our growth
How did we get here? What were our challenges and learnings?
✓ Credit Risk Assessment
✓ Onboarding Funnel Simulation
✓ Pricing Optimization
✓ Fraud Identification
✓ Document Forensics
✓ Optimized Prospect and Existing Customer Marketing
✓ Income and Rent Estimation for Affordability Evaluation

Slide 8

Slide 8 text

Analytical infancy (2004 – 2013)
• SQL
• Excel
• Some Python
• Externally produced credit models

Slide 9

Slide 9 text

Analytical renaissance (2014 – 2015)
2014, £15m investment
Board recognized need for data-driven growth
✓ Creation of Data Science function
✓ Investment in Data Analytics

Slide 10

Slide 10 text

Analytical renaissance (2014 – 2015)
Let’s start building some models!!
Created by Alekksall - Freepik.com

Slide 11

Slide 11 text

Systematizing Machine Learning at Zopa (2014)
Wanted to be able to produce ML models that were:
• Rapidly generated
• Easily vettable
• Highly predictive
• Easily deployable
Several considerations:
• Common codebase or personal choice of tools?
• Buy or build?
• Which language? Which package?
Created by Alekksall - Freepik.com

Slide 12

Slide 12 text

Systematizing Machine Learning at Zopa (2014)
Wanted to be able to produce ML models that were:
• Rapidly generated
• Easily vettable
• Highly predictive
• Easily deployable
Several considerations:
• Common codebase or personal choice of tools? → Common codebase
• Buy or build?
• Which language? Which package?

Slide 13

Slide 13 text

Systematizing Machine Learning at Zopa (2014)
Wanted to be able to produce ML models that were:
• Rapidly generated
• Easily vettable
• Highly predictive
• Easily deployable
Several considerations:
• Common codebase or personal choice of tools? → Common codebase
• Buy or build? → Built in-house
• Which language? Which package?

Slide 14

Slide 14 text

Systematizing Machine Learning at Zopa (2014)
Wanted to be able to produce ML models that were:
• Rapidly generated
• Easily vettable
• Highly predictive
• Easily deployable
Several considerations:
• Common codebase or personal choice of tools? → Common codebase
• Buy or build? → Built in-house
• Python

Slide 15

Slide 15 text

Predictor – Zopa’s ML Toolkit (2014)
• Streamlined, automated ML application
• Implements all stages of producing an ML model, requiring minimal user input
• Leverages the PyData ecosystem
• 9k lines

Slide 16

Slide 16 text

How is Predictor Used?
• Exactly the same code is used by data scientists and in production
  ✓ No “model translation” overhead
  ✓ No restrictions on which ML techniques can be used
• Training
  • Needs unprocessed data and a simple config file
  • Driven via the CLI, by interacting with the Python codebase, or through a Jupyter GUI
• Querying mode
  • Driven by all of the above, plus
  • a REST API (Flask-based microservice) in production, <1-2 s/call
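The "same code in training and in production" idea can be illustrated with a minimal sketch. Everything below is hypothetical and is not Predictor's actual code: the config keys, the toy data, and the names `train`, `predict` and `handle_query` are assumptions made for illustration. It uses only the standard library; the model is a tiny logistic regression, and `handle_query` stands in for the body of a service endpoint (JSON in, JSON out) rather than a real Flask route.

```python
import json
import math
import pickle

# Hypothetical config: these keys are illustrative, not Predictor's schema.
CONFIG_TEXT = """
{"target": "defaulted",
 "features": ["income", "debt"],
 "epochs": 500,
 "learning_rate": 0.1}
"""


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _scale(model, row):
    # Preprocessing is stored with the model, so callers always pass raw rows.
    return [(row[f] - model["mean"][f]) / model["span"][f]
            for f in model["features"]]


def predict(model, row):
    """Score one raw row. Used unchanged offline and by the service."""
    x = _scale(model, row)
    return _sigmoid(model["bias"] + sum(w * xi for w, xi in zip(model["weights"], x)))


def train(rows, config):
    """Fit a small logistic regression on raw rows, driven by the config."""
    feats = config["features"]
    model = {
        "features": feats,
        "mean": {f: sum(r[f] for r in rows) / len(rows) for f in feats},
        "span": {f: (max(r[f] for r in rows) - min(r[f] for r in rows)) or 1.0
                 for f in feats},
        "weights": [0.0] * len(feats),
        "bias": 0.0,
    }
    lr = config["learning_rate"]
    for _ in range(config["epochs"]):
        for r in rows:
            x = _scale(model, r)
            err = predict(model, r) - r[config["target"]]
            # Stochastic gradient step on the log loss.
            model["bias"] -= lr * err
            model["weights"] = [w - lr * err * xi
                                for w, xi in zip(model["weights"], x)]
    return model


def handle_query(artifact, body):
    """What a service endpoint could call: JSON request in, JSON reply out."""
    model = pickle.loads(artifact)
    return json.dumps({"score": predict(model, json.loads(body))})


# Made-up training rows (income and debt in arbitrary units).
rows = [
    {"income": 20, "debt": 15, "defaulted": 1},
    {"income": 25, "debt": 12, "defaulted": 1},
    {"income": 60, "debt": 5, "defaulted": 0},
    {"income": 80, "debt": 2, "defaulted": 0},
]

config = json.loads(CONFIG_TEXT)
model = train(rows, config)
artifact = pickle.dumps(model)  # the single artifact handed to production

offline_score = predict(model, {"income": 22, "debt": 14})
service_reply = json.loads(handle_query(artifact, '{"income": 22, "debt": 14}'))
```

Because the service path deserializes the same artifact and calls the same `predict`, the offline score and the service reply agree, which is the point of avoiding a separate "model translation" step.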

Slide 17

Slide 17 text

Learnings?
• We have full control of how we do ML
  ✓ Methodology
  ✓ Deployment
  ✓ Features
  ✓ Competitive advantage, selling point to data scientists
  ✓ One day we will open source → brand boost
➢ Need to maintain at least 2 people familiar with the codebase
➢ Need to keep improving it, otherwise we’ll miss the latest developments

Slide 18

Slide 18 text

How have we used Predictor?
• Used to credit-assess more than £2 billion worth of loans since 2015
• Production: credit and fraud risk assessment, affordability evaluation
• Offline: pricing and marketing optimization
• Future: operational efficiency for Underwriting, Collections, Customer Services
• Techniques we have been using
  • Neural networks
  • Bagged logistic regressions and Multivariate Adaptive Regression Splines
  • Gradient boosted trees and random forests
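As an illustration of one listed technique, here is a minimal sketch of bagged logistic regressions: several logistic models, each fitted on a bootstrap resample of the training data, with their predicted probabilities averaged. It is written in plain Python so it runs without dependencies; the data, function names and hyperparameters are made up for the example and say nothing about how Predictor implements bagging. In practice a PyData library would normally be used instead.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, epochs=300, lr=0.5):
    """Fit weights and bias by stochastic gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            b -= lr * err
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
    return w, b


def fit_bagged(X, y, n_models=15, seed=0):
    """Train each member on a bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_logistic([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict_bagged(models, x):
    """Average the member probabilities."""
    probs = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))
             for w, b in models]
    return sum(probs) / len(probs)


# Made-up data: two features scaled to 0..1, label 1 = default.
X = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.25, 0.95],
     [0.7, 0.2], [0.8, 0.1], [0.9, 0.3], [0.75, 0.15]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

models = fit_bagged(X, y)
risky = predict_bagged(models, [0.15, 0.85])
safe = predict_bagged(models, [0.85, 0.15])
```

Averaging over resampled fits reduces the variance of any single model, which is the usual motivation for bagging an otherwise simple and easily vetted learner.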

Slide 19

Slide 19 text

Thank you!
Further reading
• blog.zopa.com/2016/10/21/the-birth-of-predictor/
• blog.zopa.com/2016/12/02/data-democratization/
Come work with us! Hiring for Data Scientists, Analysts, and Engineers
https://jobs.lever.co/zopa
