
How I Learned to Stop Worrying and Love the Risk


Keynote for MIDAS workshop @ECMLPKDD 2021

Gianmarco De Francisci Morales

September 17, 2021


Transcript

  1. How I Learned to Stop Worrying and Love the Risk

    Gianmarco De Francisci Morales Senior Researcher ISI Foundation
  2. Risk "Possibility of something bad happening" -- Cambridge Dictionary Effect

    of uncertainty on objectives Financial Risk (e.g., liquidity, systemic) Credit: possibility of default on loan Market: volatility of equity, currency, interest rate Project Risk: possibility of an event with negative outcome on the project
  3. A bit of context Collaboration between ISI Foundation and Intesa

    Sanpaolo ISI Foundation = Private, non-profit, fundamental research institute Intesa Sanpaolo (ISP) = Largest bank in Italy by capitalization Applied research projects of approx. 9 months Mixed team of researchers and domain experts
  4. Problem Setting Banks provide credit to companies in various forms

    (loans, cash advance) Credit worthiness (rating) of borrowers Banks use credit risk models to assess credit rating Affects credit conditions and possible interventions Changes with time and affected by context
  5. Pay provider in 90 days = loan Default on trade

    credit more common than default towards banks Can act as buffer for distress periods Network perspective Trade network = risk propagation Can trigger chain reactions to default events Can be used to improve credit risk models? [Figure: trade credit network — legend: customer firm, provider firm, money flow, default firm, adjacent firm, risk propagation]
  6. Goal Integrate network effects into prediction of default probability P(d)

    At time t, predict whether given firm will default within a short-term horizon (3 months) Online prediction task → Prequential setting (predict, reveal, advance) Find risky firms in advance to enact proactive measures to avoid the default Limited resources → Ranking task, act on top-k firms → Probabilistic classifier to model class of interest (default) Few examples, high imbalance of class labels Main metric Recall@K (K depends on bank resources, here 5%) Thus, the target variable Y_i^t is a logical 'or' of lagged versions of the default indicator: Y_i^t = D_i^{t+1} ∨ D_i^{t+2} ∨ D_i^{t+3} Data drawn from a proprietary dataset belonging to Intesa Sanpaolo, leading Italian commercial bank
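The ranking metric on the slide can be sketched as follows. This is an illustrative helper (function name and toy data are my assumptions, not from the talk): rank firms by predicted P(d) and measure the fraction of true defaults captured in the top K.

```python
import numpy as np

def recall_at_k(y_true, y_score, k_frac=0.05):
    """Recall@K: fraction of true defaults captured among the top-K
    firms ranked by predicted default probability P(d)."""
    n = len(y_true)
    k = max(1, int(n * k_frac))             # K as a fraction of firms (talk uses 5%)
    top_k = np.argsort(y_score)[::-1][:k]   # indices of the K riskiest firms
    y_true = np.asarray(y_true)
    return y_true[top_k].sum() / y_true.sum()

# Toy example: 10 firms, 2 defaults; both defaults ranked in the top 2
y_true  = [0, 1, 0, 0, 0, 0, 0, 1, 0, 0]
y_score = [.1, .9, .2, .1, .3, .2, .1, .8, .2, .1]
print(recall_at_k(y_true, y_score, k_frac=0.2))  # → 1.0
```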
  7. Longitudinal data (firms over time) Two perimeters (Target and Extended)

    Features from a plethora of different sources (financial statements, central bank registry, overdrafts, regulatory risk parameters, credit risk alerts, etc.) Challenge: Incomplete view (avg. 16% of transactions) Network enrichment via record linkage Data Model
  8. Network Enrichment Bank transfer involving external IBAN and firm name:

    does the firm have an ISP account? Match firm name linked to external IBAN with firm name in the ISP database Training data: variability of spellings of single firm inside ISP firm registry (pairs of names referring to same firm) For each pair, compute standard string distance metrics as features Application strategy for the model (multiple-bank phenomenon): if a client holds accounts with different banks, they are likely to transfer money between them Only test pairs of firms that are linked by a bank transfer Increase amount of traced transactions by 450%, coverage by 200%, and go from 281k links to 826k links per month
    Table 1. Performance of the model for record linkage on the test set:
    Precision   99.98%
    Recall      73.03%
    F1 measure  84.45%
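A minimal sketch of the per-pair string-distance features mentioned above, using only the standard library. The specific metrics here (sequence ratio, token Jaccard) are illustrative assumptions; the actual model uses several standard string distance metrics not listed in the talk.

```python
from difflib import SequenceMatcher

def name_features(a: str, b: str) -> dict:
    """Simple string-similarity features for a pair of firm names."""
    a, b = a.lower().strip(), b.lower().strip()
    ta, tb = set(a.split()), set(b.split())
    return {
        "seq_ratio": SequenceMatcher(None, a, b).ratio(),  # character-level similarity
        "jaccard_tokens": len(ta & tb) / len(ta | tb),     # token overlap
        "exact": float(a == b),                            # exact match after normalization
    }

# Two spellings of the same (hypothetical) firm
feats = name_features("ACME S.p.A.", "Acme SpA")
```

Such feature vectors, computed over labeled pairs of name variants from the ISP registry, feed a standard binary classifier for record linkage.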
  9. Direct Contagion Percentage of adjacent nodes of insolvent firms at

    time t that experienced a default δ months later, for δ ∈ {3, 6, 9, 12}
  10. Models Model fragility of firms to network spillovers Capture network

    spillover effects from supply chain on P(d) of each firm Sequential modeling approach: output of first single-firm model used in subsequent network model First model captures effect of single firm's features Predicts P(d) of each firm in isolation Second model captures network spillovers Leverages output of first model, together with network structure and position of firm in the supply chain Determines influence of neighborhood of each firm onto the P(d) of the firm [Diagram: Firm Features → Single Firm Model → Single Firm P(d); Single Firm P(d) + Network Features → Network Model → Network Firm P(d)]
  11. Feature Importance Local: P(d) predicted by a model based on

    financial features (amount borrowed by the firm) Rating: P(d) coming from the officially-regulated rating model of the specific firm, with a longer time horizon (one year), which uses features from the balance sheet of the firm Overdraft: the number of days of overdraft in the last three months Hist: boolean indicator equal to 1 if the firm has been in default at any point in its past
  12. Network Spillover Model Logistic Regression trained on Target Perimeter Features:

    Only network based (no single-firm features) Fragility (client and supplier) Normalized PPR (Effective Importance) Instance weighting by how much we know of their transaction network
  13. Personalized PageRank How close is the firm to other firms

    which have had a default? Assume risk spreads as a random walk Restart from nodes in Q, uniform over firms in default at time t′ < t, with α = 0.25 Temporal discounting (for Δt = t′ − t) does not work better Normalize for in-degree of node i The feature is the stationary distribution of the random walk with restarts, computed as the solution of the recurrence PPR_α = α · PPR_α · M + (1 − α) · Q where M is the row-stochastic adjacency matrix of the transaction network, Q is the restart vector (non-zero only for firms which have been in default), and α ∈ (0, 1) is a damping parameter: with probability α the walker follows the transaction network, with probability 1 − α it restarts from Q. The value is normalized by in-degree to reduce bias towards high-degree nodes: PPR(i) = PPR_α(i) / |N(i)|
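The recurrence above can be solved by simple power iteration. A minimal sketch (the toy network and function name are my assumptions; only the update rule comes from the slide):

```python
import numpy as np

def personalized_pagerank(M, Q, alpha=0.25, iters=100):
    """Iterate PPR_a = a * PPR_a @ M + (1 - a) * Q to its fixed point.
    M: row-stochastic adjacency matrix of the transaction network.
    Q: restart vector (uniform over firms in default)."""
    ppr = Q.copy()
    for _ in range(iters):
        ppr = alpha * ppr @ M + (1 - alpha) * Q
    return ppr

# Toy 3-node network; node 2 is in default, so all restart mass sits there
M = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
Q = np.array([0.0, 0.0, 1.0])
ppr = personalized_pagerank(M, Q)   # mass concentrates near the defaulted node
```

The in-degree normalization PPR(i) / |N(i)| would then be applied per node before using the value as a feature.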
  14. A similar reasoning can be applied in the opposite direction,

    that is, how default risk spreads from suppliers to customers. In this case, the economic interpretation is more oriented to the market power of the customer within the supply chain. Larger customers, in terms of purchases, have greater market power, reflected in the ability to obtain deferred payments and other support measures from suppliers in the event of a liquidity shortage. Moreover, the higher the trade credit of customer i owned by supplier j, the higher the implicit stake of the supplier in the customer's business. In other words, the higher the customer's trade debt to its supplier, the higher its sensitivity to the supplier's financial soundness. The FRG coefficients are expected to be positive. The fragility features are computed as: FRG_c(i) = (AR_i / S_i) · logit( Σ_{j ∈ N←(i)} w_ji · P(d)_j ) FRG_s(i) = (AP_i / P_i) · logit( Σ_{j ∈ N→(i)} w_ij · P(d)_j ) where AR and AP are account receivables and account payables, S and P are sales and purchases, N←(i) and N→(i) are the in-neighbors and out-neighbors of i in the transaction network, w_ij is the normalized weight of the edge between i and j, and P(d)_j is the probability of default of j as computed by the single-firm model. Fragility = exposure to risk from the network Account Receivables = amount of revenue in credit to customers Sales = revenue from trading Weight = normalized transaction weight of link from j to i P(d) = output of single-firm model
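The client-side fragility formula can be sketched directly from its definition. All numeric values below are made-up toy inputs, and the function name is an assumption:

```python
import math

def logit(p: float) -> float:
    """log-odds transform; input must lie in (0, 1)."""
    return math.log(p / (1 - p))

def fragility_client(ar: float, sales: float, nbr_pd_weights) -> float:
    """FRG_c(i) = (AR_i / S_i) * logit( sum_j w_ji * P(d)_j )
    ar: account receivables; sales: revenue from trading;
    nbr_pd_weights: (normalized edge weight w_ji, P(d)_j) per in-neighbor j."""
    exposure = sum(w * pd for w, pd in nbr_pd_weights)
    return (ar / sales) * logit(exposure)

# Toy firm: 25% of revenue is trade credit, two customers with modest P(d)
frg = fragility_client(ar=2.0e6, sales=8.0e6,
                       nbr_pd_weights=[(0.6, 0.10), (0.4, 0.02)])
```

The supplier-side FRG_s is symmetric, swapping account payables / purchases and out-neighbors.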
  15. Network Model Performance Instance weighting incoming amount over sales outgoing

    amount over purchases R@K ~ 50% of single-firm model without any local information about the firm itself Testimony of the power of the network The more complete our knowledge of the supply chain, the more reliable the estimate of the influence of the network on the risk of the company will be. This is definitely the case for the fragility features, which explicitly rely on how much of the firm's financial position the network captures, but is also true indirectly for the PPR feature, as the presence or absence of a link (and its weight) clearly affects the random walks which the feature is based on. For these reasons, we employ an instance weighting scheme, so that the model can focus on the data points which are more reliable. For each firm i, we define an instance weight for the machine learning model as: W(i) = 1/2 · ( Σ_{j ∈ N←(i)} w_ji / S_i + Σ_{j ∈ N→(i)} w_ij / P_i ) The weight is therefore the average of the in-coverage and the out-coverage of the firm with respect to the balance sheet data (sales S and purchases P). The first term is the ratio of the sum of in-weights of the network to the sales of the firm, while the second term is the ratio of the sum of out-weights of the network to the purchases of the firm. Therefore, for a well-mapped firm this weight will be close to 1, and it will be close to 0 for firms which the network has little information on. The overall combined network spillover model is: Y = f(PPR, FRG_c, FRG_s, FRG_c · FRG_s) where PPR is the personalized PageRank and FRG_c and FRG_s are the two fragility features.
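The coverage-based instance weight is a one-liner; a small sketch with invented toy numbers (the function name is an assumption):

```python
def instance_weight(in_weights, out_weights, sales, purchases):
    """W(i) = 1/2 * ( sum_in(w_ji)/S_i + sum_out(w_ij)/P_i ):
    average of in-coverage and out-coverage of the firm's observed
    transactions relative to balance-sheet sales S and purchases P."""
    return 0.5 * (sum(in_weights) / sales + sum(out_weights) / purchases)

# Well-mapped firm: most of its sales and purchases appear in the network
w = instance_weight(in_weights=[3.0, 1.0], out_weights=[2.0],
                    sales=5.0, purchases=4.0)   # 0.5 * (0.8 + 0.5) = 0.65
```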
  16. Hybrid Model XGBoost with single-firm and network features Feature engineering

    Systematic feature selection Deployment in pre-production environment Fig 11. Recall@K for the XGBoost model with mixed single-firm and network features, as a function of time in the prequential setting, compared to a Logistic Regression model on single-firm features (baseline). The average R@K is 68.1% and the AUC is 90.5%.
    Table 5. Performance of the XGBoost model with respect to the baseline on 3 out-of-time snapshots:
    Month    | AUC Baseline | AUC XGBoost | P@K Baseline | P@K XGBoost | R@K Baseline | R@K XGBoost
    2018-12  | 68.0         | 91.4        | 3.9          | 6.5         | 40.8         | 68.4
    2019-03  | 86.3         | 91.9        | 7.6          | 9.9         | 54.3         | 70.0
    2019-06  | 85.6         | 89.8        | 7.2          | 9.6         | 47.6         | 63.6
  17. Summary Network-based model for short-term default forecasting Incorporates trade credit

    information in credit risk model by looking at transaction network Network features based on data mining and domain expertise Network model alone achieves 50% of recall of single-firm model Hybrid model improves over baseline by almost 20 percentage points
  18. Problem Setting Insurance company Premia contribute common funds to an

    investment portfolio Clients acquire right to compensation in case of accident (e.g., death) Assets and Liabilities are inter-dependent More complex than traditional portfolio optimization Long time horizon (30y) and sporadic rebalancing
  19. Liabilities Derive from insurance contracts with the clients and portfolio

    performances Compensation in case of adverse events Annual returns of the common financial portfolio Withdrawals might increase whenever these returns are too low Annual minimum guaranteed requires the company to integrate the difference
  20. Goal Optimize risk-adjusted returns of the investment portfolio Ensure future

    liabilities are covered despite market fluctuations Liabilities are stochastic and correlated to assets Match investment portfolio with due dates of liabilities
  21. Baseline: Modern Portfolio Theory Markowitz's mean-variance optimization (1952) Efficient frontier:

    maximum expected return for a given variance level Problems: Does not consider liabilities and negative cash flows Single decision point (rebalancing), no path dependency
  22. Solution Model system as a Markov Decision Process (Markov Chain

    with decisions) MDP = <States, Actions, Transition Probabilities, Rewards> State = current portfolio, future liabilities (continuous) Action = portfolio allocation (point on a k-1 simplex, k available assets, continuous) Solve MDP = find optimal policy: (stochastic) mapping of states to actions that maximizes the expected reward Use Reinforcement Learning to solve MDP
  23. Contributions Realistic stochastic model of asset-liability management of insurance Definition

    of risk-adjusted optimization problem, over a pre-determined time horizon Implementation of custom solution based on Deep Deterministic Policy Gradient (DDPG) compatible with standard python libraries for RL
  24. Optimization Problem Given time horizon T and initial portfolio P(0)

    Find asset allocation for every t ∈ [1, T] that Maximizes the overall risk-adjusted returns of the portfolio Taking into account volatility (standard deviation of the annual returns) Respecting financial constraints μ = average return within the same realization, σ = risk measure, X̄_i = asset allocation at i-th time unit, λ = risk-aversion as weight of the volatility, ε = economic scenario μ = (1/T) Σ_{t=1}^{T} g_R(t) σ = √( (1/T) Σ_{t=1}^{T} (g_R(t) − μ)² ) argmax_{X̄_0, …, X̄_{T−1}} E_ε[ μ − λ·σ ]
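The per-realization objective above translates directly to code. A minimal sketch (function name is an assumption; the expectation over economic scenarios ε would be taken by averaging this quantity across simulated realizations):

```python
import numpy as np

def risk_adjusted_reward(returns, lam=4.0):
    """mu - lam * sigma for one realization: mu is the mean annual
    return g_R(t) over the horizon, sigma its (population) standard
    deviation, lam the risk-aversion weight on volatility."""
    r = np.asarray(returns, dtype=float)
    mu = r.mean()
    sigma = r.std()          # matches sqrt((1/T) * sum((g_R(t) - mu)^2))
    return mu - lam * sigma

# A steady return stream is not penalized; a volatile one is
steady   = risk_adjusted_reward([0.05, 0.05, 0.05])
volatile = risk_adjusted_reward([0.15, -0.05, 0.05])
```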
  25. RL Implementation Reinforcement Learning: learn how an agent should interact

    with an environment to maximize the expected (across stochastic realizations) cumulative reward Actor-Critic schema: agent composed of two modules Critic: learns to approximate the reward of an action on a given state (approximation of the environment) Actor: given a state, learns to produce actions that maximize the value estimated by the critic Deep Deterministic Policy Gradient algorithm to produce continuous actions Customized extension to ensure compliance with financial constraints [Figure 1: the agent-environment interaction loop in reinforcement learning: the agent performs actions that modify the environment and receives updated perceptions plus a scalar reward]
  26. Financial Constraints 1. Structural, keep problem mathematically sound, e.g., allocation

    should be positive and sum to one (implemented via softmax architecture) 2. Parametric, restrict the allocation exposition to desirable ranges, e.g., equity below 14%, and sum of all bonds between 20% and 80% (implemented via regularization) 3. State-dependent, depend on the current state of the simulation, e.g., portfolio turnover limited to 10% of the current portfolio value (implemented via optimization and projection) Additional regulatory constraints considered explicitly Keep the current discounted value of future liabilities and the market value of the assets close Capital injection/ejection to keep the constraint satisfied (injection equivalent to borrowing cash)
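The structural constraint (positive weights summing to one) is exactly what a softmax output layer guarantees. A minimal standalone sketch of that mapping (the function name and sample logits are my assumptions):

```python
import numpy as np

def softmax_allocation(logits):
    """Map unconstrained actor outputs to a point on the (k-1)-simplex:
    every asset weight is strictly positive and the weights sum to one."""
    z = np.asarray(logits, dtype=float)
    z -= z.max()                 # subtract max for numerical stability
    w = np.exp(z)
    return w / w.sum()

# Three assets; the actor can emit any real numbers, the allocation is valid
alloc = softmax_allocation([1.0, -0.5, 0.3])
```

Parametric and state-dependent constraints would then act on top of this, via regularization and projection respectively, as described on the slide.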
  27. Test Scenarios Simplifying assumptions Single-decision bandit Asset allocation chosen only

    at t=0 No rebalancing Assets are sold to replenish cash at t>0 whenever it becomes negative Two scenarios 3 assets: optimal solution known with 1% precision 6 assets: optimal solution unfeasible with exhaustive search Warm-up strategy (pre-training) for the critic network
  28. Scenario 1 Three assets: Cash, Equity, Bond Parametric constraint sets

    0.17 as upper bound for equity Ground truth via set of simulations with 0.01 grid step (5151 actions) Extract coarser results by increasing grid step size Use coarsest grid (step = 0.20) for warm-up phase of the Critic
  29. Scenario 1: Ground Truth Best action found with the given

    grid, with the corresponding reward estimated on the 500 fixed realizations of the economic scenario:
    Step  | # actions | Best action        | Best reward
    0.20  | 21        | [0.00, 0.20, 0.80] | 2.552
    0.10  | 66        | [0.00, 0.10, 0.90] | 2.707
    0.05  | 231       | [0.00, 0.15, 0.85] | 2.790
    0.02  | 1326      | [0.00, 0.16, 0.84] | 2.799
    0.01  | 5151      | [0.00, 0.17, 0.83] | 2.811
  30. Scenario 2 Assets = cash, and Italian Bonds with 3,

    5, 10, 20, 30 year tenors No parametric constraints: any part of the action space might contain the optimum λ set to a high value of 4 to avoid an optimal solution consisting only of the most profitable and most risky asset (30y bond) Negative cash flows concentrated at 5 and 10 years
  31. Summary RL for asset-liability management of insurance company Minimal assumptions:

    generally applicable to any asset, liability, and economic scenario Improves on mean-variance optimization via grid Monte-Carlo simulations Designed for complex multi-period optimization (testing w.i.p.) Risk-adjusted optimization problem could be integrated in MDP formulation
  32. It's a Relationship Communicate effectively Common language Show that you

    care Understand the problem Acquire domain knowledge
  33. It's a Relationship Communicate effectively Common language Show that you

    care Understand the problem Acquire domain knowledge Build trust Deliver on your promises
  34. It's Research "Works on Paper" 10x more effort to run

    on realistic scenarios Scope creep "appetite comes with eating"
  35. It's Research "Works on Paper" 10x more effort to run

    on realistic scenarios Scope creep "appetite comes with eating" No ground truths Hard to generate trust
  36. The Right Level of Abstraction Computer scientists like crisp, abstract

    problem definitions Not applicable nor useful Domain experts want a problem definition as realistic as possible Unfeasible or too complex Getting it right is 50% of the job © Christoph Niemann https://www.christophniemann.com
  37. Acknowledgements Claudia Berloco Daniele Frassineti Greta Greco Hashani Kumarasinghe Marco

    Lamieri Emanuele Massaro Arianna Miola Shuyi Yang Carlo Abrate Alessio Angius Alan Perotti Stefano Cozzini Francesca Iadanza Laura Li Puma Simone Pavanelli Stefano Pignataro Silvia Ronchiadin