
Approaching AI in Banking


PyData DC 2018


Hussain Sultan

November 18, 2018


Transcript

  1. Introduction: HUSSAIN SULTAN, WASHINGTON DC

     Computational Python development and data science enablement at Amazon and Capital One. Consulting clients: leading Fintech lenders and mega-regional banks.
  2. Why AI now?

     • Explosion of Data: 90% of today's data was created in the last two years¹
     • Cloud Computing: $219.6 billion was spent globally on public cloud services in 2016, predicted to reach $411 billion by 2020²
     • Advanced Analytics / AI: low-cost compute and storage have enabled machine learning and AI techniques that were previously untenable
     • Software Development: the line between software development and sustainable analysis is blurring
     • Open Source: the hive-mind of open source clearly has a place in modern analytics as enterprise solutions build on top of and around it
     • Infrastructure Automation: by the end of 2018, spending on IT-as-a-Service for data centers, software, and services will be just under $550 billion worldwide³

     ¹IBM 10 Key Marketing Trends for 2017 - https://ibm.co/2y0r7Ee
     ²Gartner Press Release - http://gtnr.it/2Fw5LmJ
     ³Deloitte Technology, Media, and Telecommunications Predictions 2017 - http://bit.ly/2jMYdwm
  3. AI use-cases in Banking

     Categories: Customer Experience, Compliance / Regulatory, Marketing / Decisioning.
     Use-cases (spanning nonproprietary to proprietary): Chat Bots, Voice Assistants, Fraud Detection, KYC/AML, Stress Testing / CCAR, CECL Readiness, Model Management (OCC-2011), Monitoring, Personalized Offers, Cross-Product Recommendation, Credit Risk, Biometrics.
  4. The most profitable companies have a healthy data flywheel, which

     fuels growth and more valuable data creation.

     [Banking data flywheel diagram: Data → Analytics → Decisions → Growth → More Data. Inputs span nonproprietary (bureau data, off-the-shelf models) to proprietary (product segmentation, product-level and customer-level historical performance, improved models); the cycle drives targeting, treatments, reduced average fixed costs, improved strategy, improved testing, and customer experience.]
  5. Analytics comes together around unit economics in consumer lending

     [Same banking data flywheel diagram, with Unit Economics linking Data, Analytics, Decisions, Growth, and More Data.]
  6. Banking comes with unique constraints

     • Regulation: automation and decisioning need to comply with federal laws: the Fair Housing Act, the Equal Credit Opportunity Act, Unfair and Deceptive Acts and Practices, and the Community Reinvestment Act
     • Delayed Outcomes: business economics rely on measuring customers' performance over long periods of time, oftentimes years, not just until checkout or conversion
     • Governance: models and decision making are subject to internal and external audits by largely non-technical examiners
  7. How do I evaluate different customers for different product

     configurations? The dimensions are cost to acquire, value of account, and product treatments (APR, loan amount). Unit economics enable us to differentiate customer segments based on long-term assumptions grounded in historical data.

     [Three NPV waterfall charts, one per segment (X, Y, Z), each building NPV from Rev. A, Rev. B, Exp. A, Exp. B, PV, and Cost to Acquire. Annotations: most impactful to bottom line; improves marketing and operational scale; avoid marketing or approving.]
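The unit-economics idea behind these waterfalls can be sketched in a few lines of Python; the segment names echo the slide, but every revenue, expense, cost-to-acquire, and discount-rate figure is made up for illustration:

```python
def npv(cash_flows, rate):
    """Discount a sequence of per-period cash flows to present value."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

def segment_npv(revenues, expenses, cost_to_acquire, rate=0.10):
    """Unit economics for one customer segment: PV of revenues minus
    expenses, less the upfront cost to acquire."""
    pv = npv([r - e for r, e in zip(revenues, expenses)], rate)
    return pv - cost_to_acquire

# Hypothetical segments with multi-year revenue/expense assumptions:
# (revenues per year, expenses per year, cost to acquire)
segments = {
    "X": ([40, 35, 30], [15, 12, 10], 20),
    "Y": ([30, 30, 30], [10, 10, 10], 15),
    "Z": ([20, 15, 10], [18, 16, 14], 25),
}
for name, (rev, exp, cta) in segments.items():
    print(name, round(segment_npv(rev, exp, cta), 2))
```

A segment whose discounted margin never covers its acquisition cost (like the toy "Z" above) is exactly the "avoid marketing or approving" case in the waterfall.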
  8. With a process that looks like this

     Finance + business stakeholders → inputs from various sources and business teams → iterative financial forecast and planning process → final decision uploaded to the system of record. Iteration and scenario evaluation lead to cumbersome, error-prone Excel files and versioning challenges.
  9. Is doing this

     [Driver-tree diagram rolling empirical drivers, assumptions, and intermediate calculations up to Total Revenues and Total Expenses.]

     Note:
     • Per-open is a metric divided by accounts open at that point in time
     • Per-original is a metric divided by accounts at the time of CLI eligibility determination
  10. We define a system which is informed by our models

     as the ground truth.

     [Same driver-tree diagram: empirical drivers and intermediate calculations rolling up to Total Revenues and Total Expenses.]
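The driver tree on these slides (empirical drivers feeding intermediate calculations that roll up to Total Revenues and Total Expenses) can be sketched as a tiny dependency graph; every node name and value below is hypothetical, not from the talk:

```python
# Empirical drivers are plain values; calculations are functions of
# other nodes. Evaluation walks the tree recursively from the root.
drivers = {
    "accounts_open": 1000,
    "avg_balance": 500.0,
    "apr": 0.18,
    "charge_off_rate": 0.03,
}

calcs = {
    "finance_charges": lambda n: n("avg_balance") * n("apr") * n("accounts_open"),
    "charge_offs": lambda n: n("avg_balance") * n("charge_off_rate") * n("accounts_open"),
    "total_revenues": lambda n: n("finance_charges"),
    "total_expenses": lambda n: n("charge_offs"),
    "net_income": lambda n: n("total_revenues") - n("total_expenses"),
}

def node(name):
    """Resolve a node: empirical driver if present, else a calculation."""
    if name in drivers:
        return drivers[name]
    return calcs[name](node)

print(node("net_income"))  # → 75000.0
```

Swapping a constant driver for a model's prediction is what makes the model-informed system the "ground truth" for downstream financials.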
  11. Unique coordination of models, optimization tools and scenario planning

     • Inputs: historic data; operational systems implementing policies
     • Assumptions: complex models and/or simple assumptions
     • Unit Economics Calculation: equations + models + assumptions
     • Model Sensitivity Gaming: e.g. NPV with a 30% risk increase
     • Optimization Criteria: e.g. NPV max, return-rate target
     • Foundational Test: varied treatments for like customers
     • Outputs: decision policies (who do we act on?); new forward-looking financial expectations
     • Integrated Framework users: business users get common answers via an NPV Tool web app; power users and model maintainers do review and ad-hoc analysis in Python / R / SQL notebooks
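One element of the framework, model sensitivity gaming (e.g. NPV with a 30% risk increase), amounts to re-running the unit-economics calculation under shocked assumptions. A minimal sketch, with an invented toy NPV function and made-up inputs:

```python
def unit_npv(assumptions):
    """Toy unit-economics calculation: 3-year NPV per account under a
    loss-rate assumption (illustrative only, not the talk's model)."""
    rate = assumptions["discount_rate"]
    flows = [assumptions["revenue"] * (1 - assumptions["loss_rate"])
             for _ in range(3)]
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(flows, start=1))

def shock(assumptions, key, pct):
    """Return a copy of the assumptions with one input shocked by pct."""
    shocked = dict(assumptions)
    shocked[key] *= (1 + pct)
    return shocked

base = {"revenue": 100.0, "loss_rate": 0.05, "discount_rate": 0.10}

base_npv = unit_npv(base)
stressed_npv = unit_npv(shock(base, "loss_rate", 0.30))  # +30% risk
print(round(base_npv, 2), round(stressed_npv, 2))
```

Keeping shocks as pure transformations of an assumptions dict is what lets business users compare scenarios without touching the underlying models.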
  12. [Banking data flywheel diagram repeated: Data → Analytics → Decisions → Growth → More Data, from nonproprietary inputs (bureau data, off-the-shelf models) to proprietary ones (product segmentation, product- and customer-level historical performance, improved models).]
  13. Credit Modeling with Dask: complex task graphs in the real

    world http://matthewrocklin.com/blog/work/2018/02/09/credit-models-with-dask
  14. Technology

     • Cluster Compute: massively scaled storage, memory, and compute
     • Initial Raw Dataset(s): a single source of truth is the starting point for data science
     • Experimentation: interactive analytics for building and experimenting with modeling techniques
     • Collaboration: human in the loop with multiple stakeholders
     • Deployment: deploy and monitor ongoing health
  15. Build interactive analyses and share them with others with nteract/JupyterHub:

     building blocks for easily extending Jupyter's interactive computing capabilities
  16. Technology

     • Hadoop: massively scaled storage, memory, and compute
     • Initial Raw Dataset(s): a single source of truth is the starting point for data science
     • Experimentation: interactive analytics for building and experimenting with modeling techniques
     • Collaboration: human in the loop with multiple stakeholders
     • Deployment: deploy and monitor ongoing health
  17. Final thoughts and questions?

     • Dask + Numba for efficient in-memory model scoring
     • Credit Modeling with Dask
     • nteract: building on top of Jupyter
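The Dask + Numba point refers to compiling a tight scoring loop with `numba.njit` and mapping it over data partitions with Dask. A minimal sketch of such a kernel, using an invented logistic model; the Numba decorator is commented out so the example runs without Numba installed:

```python
import math

# @numba.njit  # with Numba installed, this compiles the loop to machine code
def logistic_score(coefs, intercept, rows):
    """Score feature rows with a logistic model in one tight loop,
    the kind of numeric kernel Numba accelerates well."""
    out = []
    for row in rows:
        z = intercept
        for c, x in zip(coefs, row):
            z += c * x
        out.append(1.0 / (1.0 + math.exp(-z)))
    return out

# Two hypothetical accounts with two features each
scores = logistic_score([0.5, -0.25], -1.0, [[2.0, 1.0], [0.0, 4.0]])
print([round(s, 3) for s in scores])
```

In the combined setup, Dask would call a kernel like this once per in-memory partition, so the Python-loop overhead is paid only at the partition level rather than per row.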