Predict late loan

sambaiga
August 11, 2019

Build a machine learning model to predict late loans.

Transcript

  1. Introduction

     Credit risk assessment is crucial in the banking industry.

     • Evaluate whether a customer is likely to default ⇒ grant the loan or not.
     • Minimize possible losses ⇒ increase the volume of credit.
  2. Dataset

     Publicly available data from LendingClub, a US peer-to-peer lending company.

     • Contains complete loan data for all loans issued from 2007 through 2015.
     • A matrix of 2260668 rows × 145 columns.
  3. Data pre-processing: Missing values

     • Drop features with more than 70% missing values.
     • Fill missing values with the median → numerical data.
     • Fill missing values with the mode → categorical data.

     [Figure: Missing values distribution; x-axis: Percent of missing values, y-axis: Density]
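The three rules on this slide can be sketched with pandas as follows. This is a minimal illustration, not the author's code: the function name `clean_missing`, the `max_missing` parameter, and the toy column names are assumptions made for the example.

```python
import numpy as np
import pandas as pd


def clean_missing(df: pd.DataFrame, max_missing: float = 0.70) -> pd.DataFrame:
    """Drop sparse columns, then impute the rest (median or mode)."""
    # Fraction of missing values per column.
    missing_frac = df.isna().mean()
    # Keep only the columns that are at or below the threshold.
    kept = df.loc[:, missing_frac <= max_missing].copy()

    for col in kept.columns:
        if kept[col].isna().sum() == 0:
            continue  # nothing to fill
        if pd.api.types.is_numeric_dtype(kept[col]):
            # Numerical column: fill with the median of the observed values.
            kept[col] = kept[col].fillna(kept[col].median())
        else:
            # Categorical column: fill with the most frequent value.
            kept[col] = kept[col].fillna(kept[col].mode().iloc[0])
    return kept


# Toy frame with hypothetical column names (not the real LendingClub schema).
toy = pd.DataFrame({
    "loan_amnt": [1000.0, np.nan, 3000.0, 5000.0],
    "grade": ["A", "B", None, "B"],
    "mostly_empty": [np.nan, np.nan, np.nan, 1.0],
})
cleaned = clean_missing(toy)
print(cleaned)
```

In the toy frame, `mostly_empty` is 75% missing and is dropped, while the gaps in the other two columns are filled in place.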
  4. Data pre-processing: Target distribution

     [Figure: number of loans per loan status: Default; "… credit policy. Status:Charged Off"; "…he credit policy. Status:Fully Paid"; Late (16-30 days); In Grace Period; Late (31-120 days); Charged Off; Current; Fully Paid]
     [Figure: number of loans per target class: Risk, Good]
  5. Data pre-processing: Feature processing

     • Normalize → numerical data.
     • One-hot encoding → categorical data.
     • Train/test ratio 80:20.

     [Figure: Number of columns distributed by Data Types; x-axis: Number of columns, y-axis: Data type (int64, object, float64)]
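A rough sketch of these feature-processing steps with pandas and NumPy is shown below. The slide does not say which normalization was used or how the split was drawn, so z-score scaling fitted on the training rows and a seeded random shuffle are assumptions here, as are the function name `prepare_features` and the toy column names.

```python
import numpy as np
import pandas as pd


def prepare_features(df: pd.DataFrame, target: str,
                     test_size: float = 0.20, seed: int = 0):
    """One-hot encode, split 80:20, and standardize numerical columns."""
    y = df[target]
    X = df.drop(columns=[target])
    num_cols = X.select_dtypes(include="number").columns.tolist()

    # One-hot encode every non-numerical column.
    cat_cols = [c for c in X.columns if c not in num_cols]
    X = pd.get_dummies(X, columns=cat_cols, dtype=float)

    # Shuffle row positions, then hold out a `test_size` fraction.
    order = np.random.default_rng(seed).permutation(len(X))
    n_test = int(round(len(X) * test_size))
    train_idx, test_idx = order[n_test:], order[:n_test]
    X_train, X_test = X.iloc[train_idx].copy(), X.iloc[test_idx].copy()

    # Assumed z-score scaling, using training-row statistics only so that
    # no information from the test rows leaks into the transformation.
    mean = X_train[num_cols].mean()
    std = X_train[num_cols].std(ddof=0).replace(0, 1.0)
    X_train[num_cols] = (X_train[num_cols] - mean) / std
    X_test[num_cols] = (X_test[num_cols] - mean) / std
    return X_train, X_test, y.iloc[train_idx], y.iloc[test_idx]


# Toy data with hypothetical column names (not the real LendingClub schema).
toy = pd.DataFrame({
    "loan_amnt": [1000.0, 2000.0, 3000.0, 4000.0, 5000.0,
                  6000.0, 7000.0, 8000.0, 9000.0, 10000.0],
    "grade": ["A", "B", "A", "C", "B", "A", "C", "B", "A", "C"],
    "is_bad": [0, 0, 0, 1, 0, 0, 1, 0, 0, 1],
})
X_train, X_test, y_train, y_test = prepare_features(toy, target="is_bad")
print(X_train.shape, X_test.shape)
```

Fitting the scaling statistics on the training rows alone is a design choice of this sketch; the deck itself only states that numerical data is normalized.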
  6. Results

     Confusion matrix, train set (rows: true label, columns: predicted label)

                          Predicted good loan   Predicted bad loan
     True good loan                    392478                  249
     True bad loan                       5306                54101
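The slide gives only raw counts. As a worked example, the usual summary metrics can be derived from them. The sketch assumes the four numbers are read row by row, with true labels as rows and predicted labels as columns, and treats "bad loan" as the positive class. The variable names are illustrative.

```python
# Counts from the training-set confusion matrix on the slide, read row by row
# (assumption: rows are true labels, columns are predicted labels).
good_as_good, good_as_bad = 392478, 249
bad_as_good, bad_as_bad = 5306, 54101

total = good_as_good + good_as_bad + bad_as_good + bad_as_bad

# Share of all loans that were classified correctly.
accuracy = (good_as_good + bad_as_bad) / total

# "Bad loan" as the positive class:
# precision = predicted-bad loans that are truly bad,
# recall    = truly bad loans that were caught.
precision_bad = bad_as_bad / (bad_as_bad + good_as_bad)
recall_bad = bad_as_bad / (bad_as_bad + bad_as_good)
f1_bad = 2 * precision_bad * recall_bad / (precision_bad + recall_bad)

print(f"accuracy={accuracy:.4f} precision={precision_bad:.4f} "
      f"recall={recall_bad:.4f} f1={f1_bad:.4f}")
```

Because good loans far outnumber bad ones in these counts, recall on the bad-loan class is a more telling figure than overall accuracy. These are training-set numbers, so they say little about performance on unseen loans.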