
How I Learned to Stop Worrying and Love the Risk


Keynote for MIDAS workshop @ECMLPKDD 2021

Gianmarco De Francisci Morales

September 17, 2021


Transcript

  1. How I Learned to Stop Worrying and
    Love the Risk
    Gianmarco De Francisci Morales
    Senior Researcher
    ISI Foundation


  2. (image-only slide)

  3. Risk
    "Possibility of something bad happening" -- Cambridge Dictionary
    Effect of uncertainty on objectives
    Financial Risk (e.g., liquidity, systemic)
    Credit: possibility of default on loan
    Market: volatility of equity, currency, interest rate
    Project Risk: possibility of an event with negative outcome on the project


  4. A bit of context
    Collaboration between ISI Foundation and Intesa Sanpaolo
    ISI Foundation = Private, non-profit, fundamental research institute
    Intesa Sanpaolo (ISP) = Largest bank in Italy by capitalization
    Applied research projects of approx. 9 months
    Mixed team of researchers and domain experts


  5. Predicting Corporate Credit Risk:
    Network Contagion via Trade Credit
    PLOS ONE 2020


  6. Problem Setting
    Banks provide credit to companies in various forms (loans, cash advances)
    Credit worthiness (rating) of borrowers
    Banks use credit risk models to assess credit rating
    Affects credit conditions and possible interventions
    Changes with time and is affected by context


  7. Trade Credit
    Pay provider in 90 days = loan
    Default on trade credit is more common than default
    towards banks
    Can act as a buffer for distress periods
    Network perspective
    Trade network = risk propagation
    Can trigger chain reactions to default events
    Can it be used to improve credit risk models?
    [Figure: money flows from the customer firm to the provider firm;
    risk propagates from a default firm to adjacent firms]


  8. Goal
    Integrate network effects into prediction of default probability P(d)
    At time t, predict whether a given firm will default within a short-term horizon (3 months)
    Online prediction task: prequential setting (predict, reveal, advance)
    Find risky firms in advance to enact proactive measures to avoid the default
    Limited resources → ranking task: act on the top-k firms
    Probabilistic classifier to model the class of interest (default)
    Few examples, high imbalance of class labels → main metric: Recall@K
    (K depends on bank resources, here 5%)



    From the paper: our target variable Y_i^t is a logical 'or' of lagged versions
    of the default indicator:
        Y_i^t = D_i^{t+1} ∨ D_i^{t+2} ∨ D_i^{t+3}.
    Data description: data is drawn from a proprietary dataset belonging to Intesa
    Sanpaolo, a leading Italian commercial bank. The dataset is highly
    representative. (The definition of default was introduced by Directive
    2006/48/EC, known as the Capital Requirements Directive, later replaced by the
    Capital Requirements Regulation; following the financial crisis, the European
    Banking Authority revised standards around the definition of default to achieve
    greater alignment.)

  9. Data Model
    Longitudinal data (firms over time)
    Two perimeters (Target and Extended)
    Features from a plethora of different sources (financial statements, central
    bank registry, overdrafts, regulatory risk parameters, credit risk alerts, etc.)
    Challenge:
    Incomplete view (avg. 16% of transactions)
    Network enrichment via record linkage


  10. Network Enrichment
    Bank transfer involving external IBAN and firm name:
    does the firm have an ISP account?
    Match firm name linked to external IBAN with firm name in
    the ISP database
    Training data: variability of spellings of single firm inside ISP firm
    registry (pairs of names referring to same firm)
    For each pair, compute standard string distance metrics as
    features
    Application strategy for the model (multiple-bank phenomenon):
    if a client holds accounts with different banks, they are likely to
    transfer money between them
    Only test pairs of firms that are linked by a bank transfer
    Increases the amount of traced transactions by 450% and coverage by
    200%, going from 281k to 826k links per month
    Table 1. Performance of the record-linkage model on the test set.
    Precision    99.98%
    Recall       73.03%
    F1 measure   84.45%
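
    A sketch of how string-distance features for a candidate pair of firm names
    could be computed; the slide does not list the exact metrics used, so these
    are illustrative stand-ins:

        import os
        from difflib import SequenceMatcher

        def string_features(name_a, name_b):
            # Illustrative similarity features for a pair of firm names.
            a, b = name_a.lower().strip(), name_b.lower().strip()
            ta, tb = set(a.split()), set(b.split())
            return {
                "seq_ratio": SequenceMatcher(None, a, b).ratio(),
                "prefix": len(os.path.commonprefix([a, b])) / max(1, len(a), len(b)),
                "token_jaccard": len(ta & tb) / max(1, len(ta | tb)),
            }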


  11. Direct Contagion
    Percentage of adjacent nodes of insolvent firms at time t that
    experienced a default δ months later, for δ ∈ {3, 6, 9, 12}


  12. Models
    Model fragility of firms to network spillovers
    Capture network spillover effects
    from supply chain on P(d) of each firm
    Sequential modeling approach: output of first single-firm model used in subsequent network model
    First model captures effect of single-firm’s features
    Predicts P(d) of each firm in isolation
    Second model captures network spillovers
    Leverages output of first model, together with network structure and position of firm in the supply chain
    Determines influence of neighborhood of each firm onto the P(d) of the firm
    [Diagram: Firm Features → Single-Firm Model → Single-Firm P(d);
    Single-Firm P(d) + Network Features (from the network) → Network Model →
    Network Firm P(d)]
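
    A compact sketch of this two-stage approach with scikit-learn, on synthetic
    placeholder data (feature shapes and names are assumptions, not the project's
    actual schema):

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(0)
        n = 1000
        X_firm = rng.normal(size=(n, 10))   # single-firm features (placeholder)
        y = rng.integers(0, 2, size=n)      # default labels (placeholder)

        # Stage 1: single-firm model, trained on firm-level features only.
        single_firm = RandomForestClassifier(n_estimators=500).fit(X_firm, y)
        pd_single = single_firm.predict_proba(X_firm)[:, 1]

        # Stage 2: network spillover model; its features (PPR, fragilities)
        # are derived from the network and the stage-1 P(d) of the neighbors.
        X_net = rng.normal(size=(n, 4))     # PPR, FRG_c, FRG_s, FRG_c*FRG_s
        w = rng.uniform(size=n)             # instance weights W(i)
        network = LogisticRegression().fit(X_net, y, sample_weight=w)
        pd_network = network.predict_proba(X_net)[:, 1]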


  13. Single-Firm Model
    Prequential validation
    Trained on Extended Perimeter
    Random Forest classifier
    500 trees
    Avg. R@K ~ 64%
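
    The prequential loop might look like this sketch, assuming a monthly-indexed
    pandas frame with a month column, feature columns FEATURES, and label y (all
    hypothetical names):

        import numpy as np
        import pandas as pd
        from sklearn.ensemble import RandomForestClassifier

        def prequential_rk(data: pd.DataFrame, FEATURES, t_start, t_end, k_frac=0.05):
            scores = []
            for t in range(t_start, t_end):
                train = data[data.month < t]    # everything observed so far
                test = data[data.month == t]    # predict the next snapshot
                model = RandomForestClassifier(n_estimators=500)
                model.fit(train[FEATURES], train["y"])
                p = model.predict_proba(test[FEATURES])[:, 1]
                k = max(1, int(len(p) * k_frac))
                top = np.argsort(p)[::-1][:k]
                y = test["y"].to_numpy()
                scores.append(y[top].sum() / max(1, y.sum()))  # R@K for month t
            return float(np.mean(scores))       # slide reports avg ≈ 64%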


  14. Feature Importance
    Local: P(d) predicted by a model based on
    financial features (amount borrowed by the
    firm)
    Rating: P(d) from the officially regulated
    rating model of the specific firm; it has a
    longer time horizon (one year) and uses
    features from the balance sheet of the firm.
    Overdraft: the number of days of overdraft
    in the last three months
    Hist: This boolean indicator is 1 if the firm
    has been in default at any point in its past.


  15. Network Spillover Model
    Logistic Regression trained on Target Perimeter
    Features:
    Only network based (no single-firm features)
    Fragility (client and supplier)
    Normalized PPR (Effective Importance)
    Instance weighting by how much we know
    of each firm's transaction network


  16. Personalized PageRank
    How close is the firm to other firms which have had a default?
    Assume risk spreads as a random walk
    Restart from nodes Q, uniform over firms in default at t′ < t, with α = 0.25
    Temporal discounting (weighting by Δt = t′ − t) does not work better
    Normalize for the in-degree of node i

    From the paper: we model that firms closer to a defaulting customer are more
    likely to be at risk. Therefore, we impose a restart probability: with
    probability α the random walker follows the transaction network, with
    probability 1 − α it restarts its random walk from its origin. To model that
    being closer to multiple defaulting customers is likely to affect a firm, we
    allow the random walk to (re)start from a set of nodes, represented by a
    distribution Q over the nodes of the transaction network. For our application,
    Q has non-zero values only for firms which have been in default (more on the
    choice of Q later); Q is also called a restart vector. Finally, we compute the
    feature as the stationary distribution of the random walker. This distribution
    exists and is unique, and can be easily computed by the PPR algorithm, as the
    solution to the recurrent equation
        PPR_α = α · PPR_α · M + (1 − α) · Q,
    where M is the row-stochastic adjacency matrix of the transaction network, Q is
    the restart vector distribution, and α is a damping parameter in (0, 1). We
    normalize the PPR_α value obtained by the algorithm to reduce its bias towards
    high-degree nodes:
        PPR(i) = PPR_α(i) / |N−(i)|,
    where N−(i) is the set of in-neighbors of node i.
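
    A minimal power-iteration sketch of the normalized PPR feature, assuming a
    dense row-stochastic matrix M (a real deployment would use sparse matrices):

        import numpy as np

        def normalized_ppr(M, defaulted, alpha=0.25, iters=100):
            # Solves PPR_a = a * PPR_a @ M + (1 - a) * Q by power iteration,
            # with Q uniform over the firms in default before time t.
            n = M.shape[0]
            Q = np.zeros(n)
            Q[defaulted] = 1.0 / len(defaulted)
            ppr = Q.copy()
            for _ in range(iters):
                ppr = alpha * ppr @ M + (1 - alpha) * Q
            in_degree = np.maximum((M > 0).sum(axis=0), 1)
            return ppr / in_degree  # reduce bias towards high-degree nodes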


  17. Fragility
    Exposure to risk from the network

    From the paper: a similar reasoning can be applied in the opposite direction,
    that is, how default risk spreads from suppliers to customers. In this case,
    the economic interpretation is more oriented to the market power of the
    customer within the supply chain. Larger customers, in terms of purchases,
    have greater market power, reflected in the ability to obtain deferred
    payments and other support measures from suppliers in the event of a liquidity
    shortage. Moreover, the higher the trade credit of the customer i owned by the
    supplier j, the higher the implicit stake of the supplier in the customer's
    business. In other words, the higher the customer's trade debt to its
    suppliers, the higher its sensitivity to the suppliers' financial soundness.
    The FRG_s coefficient is expected to be positive.
    The final formulas for computing the fragilities are:
        FRG_c(i) = (AR_i / S_i) × logit( Σ_{j ∈ N−(i)} w_ji · P(d)_j ),
        FRG_s(i) = (AP_i / P_i) × logit( Σ_{j ∈ N+(i)} w_ij · P(d)_j ),
    where AR and AP are accounts receivable and accounts payable, S and P are
    sales and purchases, N−(i) and N+(i) are the in-neighbors and out-neighbors of
    i in the transaction network, w_ij is the normalized weight of the edge
    between i and j, and P(d)_j is the probability of default of j as computed by
    the single-firm model.
    Account Receivables = amount of revenue in credit to customers
    Sales = revenue from trading
    Weight = normalized transaction weight of the link from j to i
    P(d) = output of the single-firm model
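
    The fragility formulas translate directly into code; a sketch assuming
    W[j, i] holds the normalized weight of the transaction edge j → i, with
    clipping (my addition, not in the paper) so the logit stays finite:

        import numpy as np
        from scipy.special import logit

        def fragilities(AR, S, AP, P, W, pd_single, eps=1e-6):
            # FRG_c(i) = (AR_i / S_i) * logit(sum over in-neighbors of w_ji P(d)_j)
            # FRG_s(i) = (AP_i / P_i) * logit(sum over out-neighbors of w_ij P(d)_j)
            in_exposure = W.T @ pd_single
            out_exposure = W @ pd_single
            frg_c = (AR / S) * logit(np.clip(in_exposure, eps, 1 - eps))
            frg_s = (AP / P) * logit(np.clip(out_exposure, eps, 1 - eps))
            return frg_c, frg_s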


  18. Network Model Performance
    Instance weighting
    incoming amount over sales
    outgoing amount over purchases
    R@K ≈ 50% of the single-firm model's, without
    any local information about the firm itself
    A testament to the power of the network

    From the paper: the more complete our knowledge of the supply chain, the more
    reliable the estimate of the influence of the network on the risk of the
    company will be. This is definitely the case for the fragility features, which
    explicitly rely on how much of the firm's financial position the network
    captures, but it is also true indirectly for the PPR feature, as the presence
    or absence of a link (and its weight) clearly affects the random walks which
    the feature is based on. For these reasons, we employ an instance weighting
    scheme, so that the model can focus on the data points which are more
    reliable. For each firm i, we define an instance weight for the machine
    learning model as:
        W(i) = (1/2) · ( Σ_{j ∈ N−(i)} w_ji / S_i + Σ_{j ∈ N+(i)} w_ij / P_i ).
    The weight is therefore the average of the in-coverage and the out-coverage of
    the network with respect to the balance sheet data (sales S and purchases P).
    More in detail, the first term is the ratio of the sum of in-weights of the
    network to the sales of the firm, while the second term is the ratio of the
    sum of out-weights of the network to the purchases of the firm. Therefore, for
    a well-mapped firm this weight will be close to 1, and it will be close to 0
    for firms which the network has little information on.
    The overall combined network spillover model is:
        Y = f(PPR, FRG_c, FRG_s, FRG_c · FRG_s),
    where PPR is the personalized PageRank and FRG_c and FRG_s are the two
    fragility features.
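
    The instance weight, under the same assumption on W:

        def instance_weight(W, S, P):
            # Average of in-coverage (traced incoming amounts over sales) and
            # out-coverage (traced outgoing amounts over purchases); close to 1
            # for well-mapped firms, close to 0 for poorly observed ones.
            return 0.5 * (W.sum(axis=0) / S + W.sum(axis=1) / P)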


  19. Feature Importance
    Most important features
    PPR
    Fragility of clients
    Stable over time


  20. Hybrid Model
    XGBoost with single-firm and
    network features
    Feature engineering
    Systematic feature selection
    Deployment in pre-production
    environment
    Fig 11. Recall@K for the XGBoost model with mixed single-firm and network
    features as a function of time in the prequential setting, compared to a
    logistic regression model on single-firm features (baseline). The average R@K
    is 68.1% and the AUC is 90.5%.

    Table 5. Performance of the XGBoost model with respect to the baseline on 3
    out-of-time snapshots.

             AUC                  P@K                  R@K
    Month    Baseline  XGBoost    Baseline  XGBoost    Baseline  XGBoost
    2018-12  68.0      91.4       3.9       6.5        40.8      68.4
    2019-03  86.3      91.9       7.6       9.9        54.3      70.0
    2019-06  85.6      89.8       7.2       9.6        47.6      63.6


  21. Summary
    Network-based model for short-term default forecasting
    Incorporates trade credit information in credit risk model
    by looking at transaction network
    Network features based on data mining and domain expertise
    Network model alone achieves 50% of recall of single-firm model
    Hybrid model improves over baseline by almost 20 percentage points


  22. Continuous-Action Reinforcement Learning for
    Portfolio Allocation of a Life Insurance Company
    ECMLPKDD 2021


  23. Problem Setting
    Insurance company
    Premia are pooled into a common investment portfolio
    Clients acquire right to compensation in case of accident (e.g., death)
    Assets and Liabilities are inter-dependent
    More complex than traditional portfolio optimization
    Long time horizon (30y) and sporadic rebalancing


  24. Liabilities
    Derive from the insurance contracts with the clients and from portfolio performance
    Compensation in case of adverse events
    Annual returns of the common financial portfolio
    Withdrawals might increase whenever these returns are too low
    An annual guaranteed minimum return requires the company to make up the
    difference


  25. Goal
    Optimize risk-adjusted returns of the investment portfolio
    Ensure future liabilities are covered despite market fluctuations
    Liabilities are stochastic and correlated to assets
    Match investment portfolio with due dates of liabilities


  26. Baseline: Modern Portfolio Theory
    Markowitz's mean-variance optimization (1952)
    Efficient frontier: maximum expected return for a given variance level
    Problems:
    Does not consider liabilities and negative cash flows
    Single decision point (rebalancing), no path dependency
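
    For reference, a sketch of the standard single-period mean-variance baseline
    (long-only, fully invested), solved with scipy; this is the textbook
    formulation, not the project's code:

        import numpy as np
        from scipy.optimize import minimize

        def mean_variance_weights(mu, Sigma, lam):
            # argmax_w  w @ mu - lam * (w @ Sigma @ w),  w >= 0, sum(w) = 1.
            # A single decision point and no liabilities: exactly the
            # limitations that the MDP formulation below removes.
            k = len(mu)
            res = minimize(
                lambda w: -(w @ mu - lam * w @ Sigma @ w),
                np.full(k, 1.0 / k),
                bounds=[(0.0, 1.0)] * k,
                constraints=({"type": "eq", "fun": lambda w: w.sum() - 1.0},),
            )
            return res.x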


  27. Solution
    Model system as a Markov Decision Process (Markov Chain with decisions)
    MDP = ⟨states, actions, transition probabilities, rewards⟩
    State = current portfolio, future liabilities (continuous)
    Action = portfolio allocation (point on a k-1 simplex, k available assets, continuous)
    Solve MDP = find optimal policy: (stochastic) mapping of states to actions that
    maximizes the expected reward
    Use Reinforcement Learning to solve MDP


  28. Contributions
    Realistic stochastic model of asset-liability management of insurance
    Definition of risk-adjusted optimization problem, over a pre-determined time
    horizon
    Implementation of a custom solution based on Deep Deterministic Policy
    Gradient (DDPG), compatible with standard Python libraries for RL


  29. (image-only slide)

  30. Optimization Problem
    Given time horizon T and initial portfolio P(0)
    Find asset allocation for every t ∈ [1, T] that
    Maximizes the overall risk-adjusted returns of the portfolio
    Taking into account volatility (standard deviation of the annual returns)
    Respecting financial constraints

        μ = (1/T) Σ_{t=1}^{T} g_R(t)
        σ = sqrt( (1/T) Σ_{t=1}^{T} (g_R(t) − μ)² )
        argmax_{X̄_0, …, X̄_{T−1}}  E_ε[ μ − λσ ]

    μ = average return within the same realization
    σ = risk measure (volatility)
    X_i = asset allocation at the i-th time unit
    λ = risk-aversion weight of the volatility
    ε = economic scenario
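
    Per realization the objective is straightforward to compute; a sketch where
    annual_returns is a hypothetical array of g_R(t) values for one economic
    scenario ε:

        import numpy as np

        def risk_adjusted_return(annual_returns, lam):
            # mu - lam * sigma for a single realization; the optimization
            # maximizes the expectation of this quantity across scenarios.
            mu = annual_returns.mean()
            sigma = annual_returns.std()
            return mu - lam * sigma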


  31. RL Implementation
    Reinforcement Learning: learn how an agent should interact with an environment to maximize
    the expected (across stochastic realizations) cumulative reward
    Actor-Critic schema: agent composed of two modules
    Critic, learns to approximate the reward of an action on a given state
    (approximation of the environment)
    Actor, given a state, learns to produce actions that maximize the value estimated by the critic
    Deep Deterministic Policy Gradient algorithm to produce continuous actions
    Customized extension in order to ensure compliance with financial constraints
    Figure 1: the agent-environment interaction in reinforcement learning.

    From the paper: in Reinforcement Learning, the conceptual framing of the
    problem is based on the interaction of an agent with an environment. The agent
    can perform actions that modify the environment, and in turn can receive
    updated perceptions of the evolution of the environment. This diagram is
    depicted in Figure 1. Besides perceptions, at every iteration the agent
    receives a reward, a scalar value measuring the desirability of the current
    situation. The reward is used by the agent in order to assess how good a
    sequence of actions (often called a strategy) is. The starting point is a
    blank slate, with the agent's internal parameters randomly initialized to
    close-to-zero values; in the RL setting this corresponds to an agent behaving
    randomly, all actions being initially equivalently likely. Each time the agent
    performs an action and receives an updated description of the world, together
    with a reward value, these bits of knowledge are used to improve the agent's
    internal model of actions and their effects. Despite this single learning step
    being fundamentally supervised, it is worth mentioning that many RL problems
    present credit assignment problems: the positive outcome of a strategy is
    often evident (and modeled by a reward) only after several actions are
    executed.
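
    A skeletal actor-critic pair in PyTorch, with a softmax head enforcing the
    structural constraint (positive allocation summing to one); a sketch of the
    shape of the solution, not the authors' implementation:

        import torch
        import torch.nn as nn

        class Actor(nn.Module):
            # State (portfolio + liabilities) -> allocation on the (k-1)-simplex.
            def __init__(self, state_dim, n_assets, hidden=64):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Linear(state_dim, hidden), nn.ReLU(),
                    nn.Linear(hidden, n_assets), nn.Softmax(dim=-1))

            def forward(self, s):
                return self.net(s)

        class Critic(nn.Module):
            # (state, action) -> estimated value; the DDPG-style actor update
            # is then: actor_loss = -critic(s, actor(s)).mean()
            def __init__(self, state_dim, n_assets, hidden=64):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Linear(state_dim + n_assets, hidden), nn.ReLU(),
                    nn.Linear(hidden, 1))

            def forward(self, s, a):
                return self.net(torch.cat([s, a], dim=-1))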


  32. Financial Constraints
    1. Structural, keep problem mathematically sound, e.g., allocation should be positive and sum to
    one (implemented via softmax architecture)
    2. Parametric, restrict the allocation exposure to desirable ranges, e.g., equity below 14%, and the sum
    of all bonds between 20% and 80% (implemented via regularization; a sketch follows below)
    3. State-dependent, depend on the current state of the simulation, e.g., portfolio turnover limited to
    10% of the current portfolio value (implemented via optimization and projection)
    Additional regulatory constraints considered explicitly
    Keep the current discounted value of future liabilities and the market value of the assets close
    Capital injection/ejection to keep the constraint satisfied (injection equivalent to borrowing cash)
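
    The parametric constraints could enter the training loss as soft penalties;
    an illustrative sketch using the bounds quoted above (asset indices are
    placeholders):

        def parametric_penalty(w, equity_idx=1, bond_idx=slice(2, None)):
            # Penalize: equity above 14%, sum of bonds outside [20%, 80%].
            pen = max(0.0, float(w[equity_idx]) - 0.14)
            bonds = float(w[bond_idx].sum())
            pen += max(0.0, 0.20 - bonds) + max(0.0, bonds - 0.80)
            return pen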


  33. Test Scenarios
    Simplifying assumptions
    Single-decision bandit
    Asset allocation chosen only at t=0
    No rebalancing
    Assets are sold to replenish cash
    at t>0 whenever it becomes
    negative
    Two scenarios
    3 assets: optimal solution
    known with 1% precision
    6 assets: optimal solution
    unfeasible with exhaustive
    search
    Warm-up strategy (pre-training) for
    the critic network


  34. Scenario 1
    Three assets: Cash, Equity, Bond
    Parametric constraint sets 0.17 as upper bound for equity
    Ground truth via set of simulations with 0.01 grid step (5151 actions)
    Extract coarser results by increasing grid step size
    Use coarsest grid (step = 0.20) for warm-up phase of the Critic
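
    The grid sizes in the ground-truth table below follow from enumerating
    allocations on the simplex; a sketch (for 3 assets, step 0.20 yields 21
    actions and step 0.01 yields 5151, matching the table):

        from itertools import product

        def simplex_grid(k=3, step=0.20):
            # All k-asset allocations with components on a grid of the given step.
            n = round(1 / step)
            return [tuple(c / n for c in comb)
                    for comb in product(range(n + 1), repeat=k)
                    if sum(comb) == n]

        assert len(simplex_grid(3, 0.20)) == 21
        assert len(simplex_grid(3, 0.01)) == 5151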


  35. Reward Surface


  36. Scenario 1: Ground Truth
    Best action found with each grid step, with the corresponding reward estimated
    on the 500 fixed realizations of the economic scenario.

    Step   # actions   Best action          Best reward
    0.20   21          [0.00, 0.20, 0.80]   2.552
    0.10   66          [0.00, 0.10, 0.90]   2.707
    0.05   231         [0.00, 0.15, 0.85]   2.790
    0.02   1326        [0.00, 0.16, 0.84]   2.799
    0.01   5151        [0.00, 0.17, 0.83]   2.811


  37. Scenario 1: Results


  38. Scenario 2
    Assets = cash, and Italian Bonds with 3, 5, 10, 20, 30 years tenors
    No parametric constraints: any part of action space might contain optimum
    λ set to a high value of 4 to avoid the trivial optimal solution of allocating
    only to the most profitable and most risky asset (the 30y bond)
    Negative cash flows concentrated at 5 and 10 years


  39. Scenario 2: Results


  40. Summary
    RL for asset-liability management of insurance company
    Minimal assumptions: generally applicable to any asset, liability, and
    economic scenario
    Improves on mean-variance optimization via grid Monte-Carlo simulations
    Designed for complex multi-period optimization (testing is work in progress)
    Risk-adjusted optimization problem could be integrated in MDP formulation


  41. What Could Go Wrong?
    Some war stories


  42. It's a Relationship


  43. It's a Relationship
    Communicate effectively
    Common language


  44. It's a Relationship
    Communicate effectively
    Common language
    Show that you care
    Understand the problem
    Acquire domain knowledge


  45. It's a Relationship
    Communicate effectively
    Common language
    Show that you care
    Understand the problem
    Acquire domain knowledge
    Build trust
    Deliver on your promises


  46. It's Research


  47. It's Research
    "Works on Paper"
    10x more effort to run on realistic
    scenarios


  48. It's Research
    "Works on Paper"
    10x more effort to run on realistic
    scenarios
    Scope creep
    "appetite comes with eating"


  49. It's Research
    "Works on Paper"
    10x more effort to run on realistic
    scenarios
    Scope creep
    "appetite comes with eating"
    No ground truths
    Hard to generate trust


  50. The Right Level of Abstraction
    Computer scientists like crisp,
    abstract problem definitions
    Neither applicable nor useful
    Domain experts want a problem
    definition as realistic as possible
    Unfeasible or too complex
    Getting it right is 50% of the job
    © Christoph Niemann
    https://www.christophniemann.com


  51. Interpretability
    Regulatory reasons
    Humans want to know
    Causal reasoning
    Trust
    [Spectrum: simple ↔ complex]


  52. Acknowledgements
    Claudia Berloco
    Daniele Frassineti
    Greta Greco
    Hashani Kumarasinghe
    Marco Lamieri
    Emanuele Massaro
    Arianna Miola
    Shuyi Yang
    Carlo Abrate
    Alessio Angius
    Alan Perotti
    Stefano Cozzini
    Francesca Iadanza
    Laura Li Puma
    Simone Pavanelli
    Stefano Pignataro
    Silvia Ronchiadin


  53. [email protected]
    Thanks!
    @gdfm7
