Get Started
Category: Article

EDGE: GNMA Forbearance End Date Distribution

With 2021 underway and the first wave of pandemic-related FHA forbearances set to begin hitting their 12-month caps as early as March, now seems like a good time to summarize where things stand. Forbearance in mortgages backing GNMA securities continues to significantly outpace forbearance in GSE-backed loans, with 7.6% of GNMA loans in forbearance compared to 3.5% for Fannie and Freddie borrowers.[1] Both statistics have slowly declined over the past few months.

Notably, the share of forbearance varies greatly amongst GNMA cohorts, with some cohorts having more than 15% of their loans in forbearance. In the table below, we show the percentage of loans in forbearance for significant cohorts of GN2 30yr Multi-lender pools.

Percent of Loans in Forbearance for GNMA2 30yr Multi-lender Pools:

Cohorts larger than $25 billion. Forbearance as of December 2020 factor date.

Not surprisingly, newer production tends to experience much lower levels of forbearance. Those cohorts are dominated by newly refinanced loans and are composed mostly of borrowers who have not struggled to make mortgage payments. Conversely, 2017-2019 vintage 3s through 4.5s show much higher forbearance, most likely due to survivor bias: loans in forbearance tend not to refinance and are left behind in the pool. The survivor bias also becomes apparent as you move up the coupon stack within a vintage. Higher coupons tend to see more refinancing activity, and that activity leaves behind a higher proportion of borrowers who cannot refinance because of the very same economic hardships that put their loans into forbearance.

GNMA also reports the forbearance end date and length of the forbearance period for each loan. The table below summarizes the distribution of forbearance end dates across all GNMA production. This date is the last month of the currently requested forbearance period.[2]

For loans with forbearance ending in December 2020 (last month), half have taken a total of 9 months of forbearance, with most of the remaining loans taking either three or six months of forbearance.

For loans whose forbearance period rolls in January and February 2021, the total months of forbearance is evenly distributed between 3, 6, and 9 months. Among loans with a forbearance end date of March 2021, more than half will have taken their maximum twelve months of forbearance.[3]

In the chart below, we illustrate how things would look if every Ginnie Mae loan currently in forbearance extended to its full twelve-month maximum. As this analysis shows, a plurality of these mortgages (more than 25 percent) would have a forbearance end date of March 2021, with the remaining forbearance periods expiring later in 2021.
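The projection behind this analysis can be sketched in a few lines: assume every loan runs to its 12-month cap and bucket the resulting end dates. The handful of forbearance start dates below is made up purely for illustration.

```python
# Sketch: extend every loan in forbearance to the full 12-month cap,
# then tally the distribution of projected end dates.
from collections import Counter

# Hypothetical (year, month) forbearance start dates
starts = [(2020, 4), (2020, 4), (2020, 5), (2020, 6), (2020, 3)]

def add_months(ym, n):
    """Last month of an n-month forbearance beginning in month ym."""
    y, m = ym
    idx = y * 12 + (m - 1) + (n - 1)   # zero-based month index of the end
    return idx // 12, idx % 12 + 1

projected = Counter(add_months(s, 12) for s in starts)
for month, count in sorted(projected.items()):
    print(month, f"{count / len(starts):.0%}")
```

With these toy dates, the April 2020 starts roll up to a March 2021 plurality, mirroring the pattern the chart shows for the actual cohort.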

A successful vaccination program is expected to stabilize the economy and (hopefully) end the need for wide-scale forbearance programs. The timing of this economic normalization is unclear, however, and the distribution of current end dates, as illustrated above, suggests that the existing forbearance period may need to be extended for some borrowers in order to forestall a potentially catastrophic credit-driven prepayment spike in GNMA securities.

Contact us if you are interested in seeing variations on this theme. Using Edge, we can examine any loan characteristic and generate an S-curve, aging curve, or time series.

[1] As of the December 2020 factor date, using the data reported by the GSEs and GNMA. This data may differ marginally from the Mortgage Bankers Association survey, which is a weekly survey of mortgage servicers.

[2] Data as of the December 2020 factor date.

[3] Charts of January, February and March 2021 rolls are omitted for brevity. See RiskSpan for a copy of these charts.

EDGE: COVID Forbearance and Non-Bank Buyouts

November saw a significant jump in GNMA buyouts for loans serviced by Lakeview. Initially, we suspected that Lakeview was catching up from nearly zero buyout activity in the prior months, and that perhaps the servicer was doing this to stay in front of GNMA’s requirement to keep seriously delinquent loans below the 5%-of-UPB threshold.[1]

Buyout rates for some major non-bank servicers.

Using EDGE to dig further, we noticed that Lakeview’s buyouts affected both multi-lender and custom pools in similar proportions and were evenly split between loans with an active COVID forbearance and loans that were “naturally” delinquent.

The month-on-month jump in Lakeview buyouts on forborne loans is notable. The graph below plots Lakeview’s buyout rate (CBR) for loans that are 90-days+ delinquent.

Further, the buyouts were skewed towards premium coupons. Given this, it is plausible that the buyouts are economically driven [2] and that Lakeview is now starting to repurchase and warehouse delinquent loans, something that non-banks have struggled with due to balance sheet and funding constraints.

Where do the current exposures lie? The table below summarizes Lakeview’s 60-day+ delinquencies for loans in GN2 multi-lender pools, for coupons and vintages where Lakeview services a significant portion of the cohort. Not surprisingly, the greatest exposure lies in recent-vintage 4s through 5s.

To lend some perspective, in June 2020 Wells serviced around one-third of 2012-13 vintage 3.5s and approximately 8% of its loans were 60-days delinquent, all non-COVID related.

This analysis does not include other non-bank servicers. As a group, non-bank servicers now service more than 80% of recent-vintage GN2 loans in multi-lender pools. The Lakeview example reflects mounting evidence that COVID forbearance is not an impediment to repurchasing delinquent loans.

If you are interested in seeing variations on this theme, contact us. Using Edge, we can examine any loan characteristic and generate an S-curve, aging curve, or time series.

[1] Large servicers are required to keep 90-day+ delinquencies below 5% of their overall UPB. GNMA has exempted loans that are in COVID forbearance from this tally.

[2] Servicers can repurchase GNMA loans that have missed 3 or more payments at par. If these loans cure, either naturally or due to modification, the servicer can deliver them into a new security. Given that nearly all GNMA passthroughs trade at a significant premium to par, this redelivery can create a substantial arbitrage opportunity, even after accounting for the trial period for the modification.
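The arithmetic behind this arbitrage can be sketched as follows. The loan balance, premium price, and cost haircut below are hypothetical placeholders, not figures from the article.

```python
# Back-of-the-envelope sketch of the par-buyout / premium-redelivery
# arbitrage described in footnote [2]. All inputs are hypothetical.
loan_bal = 300_000
buyout_price = 100.0       # servicer repurchases the delinquent loan at par
redelivery_price = 104.5   # assumed price of the new premium passthrough
costs_pts = 0.75           # assumed funding carry + modification trial costs, in points

gross = (redelivery_price - buyout_price) / 100 * loan_bal
net = gross - costs_pts / 100 * loan_bal
print(f"gross pickup: ${gross:,.0f}, net of costs: ${net:,.0f}")
```

Even after the assumed cost haircut, the redelivery pickup on a premium passthrough remains meaningfully positive, which is the economic incentive the footnote describes.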

Chart of the Month: Fed Impact on Credit ETF Performance

On March 23rd, the Fed announced that its Secondary Market Corporate Credit Facility (SMCCF) would begin purchasing investment-grade corporate bonds in the secondary market, first through ETFs and, in a later phase, directly.

In June, we charted the impact of this announcement on the credit spreads of various corporate bonds. This month we are charting its impact on ETF performance.

This month’s chart plots the price of each ETF relative to its price as of March 23rd, 2020 (i.e., all ETF prices are set to 1.00 as of that date). Data runs from February 24th to November 16th, 2020.

EDGE: Unexplained Prepayments on HFAs — An Update

In early October, we highlighted a large buyout event for FNMA pools serviced by Idaho HFA, the largest servicer of HFA loans. On October 28, FNMA officially announced that there were 544 base-pools with erroneous prepayments due to servicer reporting error. The announcement doesn’t mention the servicer of the affected pools, but when we look at pools that are single-servicer, every one of those pools is serviced by Idaho HFA.

FNMA reports the “September 2020 Impacted Principal Paydown” at $133MM. The September reported prepayment for FNMA Idaho HFA pools was 43 CPR on a total of just over $6B UPB. If we add back the principal from the impacted paydown, the speed should have been 26 CPR, which is closer to the Freddie-reported 25 CPR.
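The add-back arithmetic can be sketched as follows. We assume a $6B beginning balance (the article says only “just over $6B”), so the result is ballpark rather than exact.

```python
# Sketch: strip the erroneous "Impacted Principal Paydown" out of the
# reported speed. Beginning balance is approximate ("just over $6B").
begin_bal = 6.0e9
reported_cpr = 0.43
impacted = 133e6                               # FNMA-reported impacted paydown

smm = 1 - (1 - reported_cpr) ** (1 / 12)       # single-month mortality
prepaid = smm * begin_bal                      # implied prepaid principal
adj_smm = (prepaid - impacted) / begin_bal     # remove the erroneous portion
adj_cpr = 1 - (1 - adj_smm) ** 12
print(f"adjusted speed ~ {adj_cpr:.0%}")       # mid-20s CPR, near Freddie's 25
```

Because the true balance is slightly above $6B, the exact adjusted figure moves around a bit, but it lands in the mid-20s CPR either way, consistent with the Freddie-reported 25 CPR.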

FNMA provides an announcement here and a list of pools here. According to the announcement, FNMA will not be reversing the buyout but instead recommends that affected investors start a claims process. We note that Idaho HFA prepayment speeds will continue to show these erroneous buyouts in the October factor date.

Contact us to try Edge for free.

RiskSpan VQI: Current Underwriting Standards Q3 2020

RiskSpan’s Vintage Quality Index, which had declined sharply in the first half of the year, leveled off somewhat in the third quarter, falling just 2.8 points between June and September, in contrast to its 12-point drop in Q2.

This change, which reflects a relative slowdown in the tightening of underwriting standards, signals something of a return to stability in the Agency origination market.

Driven by a drop in cash-out refinances (down 2.3% in the quarter), the VQI’s gradual decline left the standard credit-related risk attributes (FICO, LTV, and DTI) largely unchanged.

The share of high-LTV loans (loans with loan-to-value ratios over 80%), which fell 1.3% in Q3, has fallen dramatically over the last year (1.7% in total). More than half of this drop (6.1%) occurred before the start of the COVID-19 crisis. This suggests that, even though the Q3 VQI reflects tightening underwriting standards, the stability of the credit-related components, coupled with huge volumes from the GSEs, reflects a measure of stability in credit availability.

Risk Layers – September 20 – All Issued Loans By Count


Analytical And Data Assumptions

Population assumptions:

  • Monthly data for Fannie Mae and Freddie Mac.

  • Loans originated more than three months prior to issuance are excluded because the index is meant to reflect current market conditions.

  • Loans likely to have been originated through the HARP program, as identified by LTV, MI coverage percentage, and loan purpose are also excluded. These loans do not represent credit availability in the market as they likely would not have been originated today but for the existence of HARP.                                                                                                                          

Data assumptions:

  • Freddie Mac data goes back to 12/2005; Fannie Mae data goes back only to 12/2014.

  • Certain fields for Freddie Mac data were missing prior to 6/2008.   

GSE historical loan performance data release in support of GSE Risk Transfer activities was used to help back-fill data where it was missing.

An outline of our approach to data imputation can be found in our VQI Blog Post from October 28, 2015.                                                

LIBOR Transition: Winning the Fourth Quarter

In July 2017, the United Kingdom’s Financial Conduct Authority (FCA) announced that it would no longer compel panel banks to submit LIBOR quotes after December 2021, signaling the effective end of LIBOR. Given that the FCA provided a four-year window to identify and implement alternative reference rates, market participants are rapidly approaching the “fourth quarter” of the transition away from LIBOR. 

Winning in the fourth quarter is more difficult when you finish the third quarter down by 28 points. It is therefore critical that institutions assess their progress to date in preparing for the cessation of LIBOR and in planning the implementation of an alternative reference rate. At this stage of an institution’s transition plan, a number of milestones need to have been completed for an institution to reasonably consider itself “on track.”  

These include having the workstreams listed below and a detailed plan in place to complete the execution of these tasks over the next year: 

  • LIBOR Transition Project Team Established – Financial institutions should have established a dedicated project team responsible for managing the transition from LIBOR. For larger institutions with LIBOR exposure in multiple business units, business unit leaders should be identified and made responsible for LIBOR transition activities in their business unit. 
  • Identification of LIBOR Exposure – Legacy contracts should already have been evaluated and exposure to LIBOR products maturing beyond year-end 2021 should have been quantified. During the upcoming year, monthly and quarterly updates on LIBOR exposure should be communicated to management. 
  • Assessment of LIBOR Contracts – Contracts should be reviewed to determine whether clear fallback language has been incorporated. Contracts with a) clear fallback language, b) fallback language requiring legal interpretation, and c) no fallback language must be identified and inventoried. 
  • Remediate Contracts without Clear Fallback Language – For contracts without adequate fallback language, institutions need to identify and finalize options for alternative reference rates, remediation plans, and a communication strategy with stakeholders when LIBOR is terminated. 
  • Assess Financial Exposure to Alternative Reference Rates – Because institutions will likely be impacted by exposure to alternative reference rates beginning in January 2022, plans need to be in the works for performing analyses on how the new alternative reference rate is likely to impact income, funding, liquidity, and capital levels.  
  • Stop Use of LIBOR on New Products – It may not need to be said, but one of the most effective methods of mitigating LIBOR exposure is to stop creating new LIBOR products.  To the extent new LIBOR products need to be issued, institutions must ensure that clear, easy-to-follow fallback language has been incorporated. 
  • Update and Remediate Technology – LIBOR is likely embedded in many applications and systems that set pricing on products, determine contractual payments, and determine the fair value for instruments. Plans need to be developed and implemented to update and test technology applications with LIBOR exposure.  

Consider engaging with external data and technology vendors to ensure operational readiness to transition away from LIBOR. Each business line and core function such as Finance or Treasury needs to inventory technology, operations, and modeling tools to ensure every LIBOR touch point is properly accounted for. 

  • Validate Models With LIBOR Assumptions – As we discussed last month, many models rely on LIBOR as an assumption or as part of the cash flow discounting mechanism.  Validators of models transitioning from LIBOR to an alternative reference rate need to account for this. And unscheduled validations may become necessary for models that might not otherwise be up for review before the end of 2021. 
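As a minimal illustration of the exposure-identification and technology-inventory workstreams above, a script along these lines can flag LIBOR touch points in a repository of contract or configuration text. The root path and file pattern are placeholders, not a prescription for any particular system.

```python
# Illustrative LIBOR touch-point scan over a directory of text artifacts
# (contract extracts, config files, model documentation, etc.).
import re
from pathlib import Path

PATTERN = re.compile(r"\bLIBOR\b", re.IGNORECASE)

def scan_for_libor(root):
    """Return {file path: number of LIBOR mentions} for matching files."""
    hits = {}
    for path in Path(root).rglob("*.txt"):     # placeholder glob
        count = len(PATTERN.findall(path.read_text(errors="ignore")))
        if count:
            hits[str(path)] = count
    return hits
```

A real inventory would of course cover databases, model code, and vendor systems as well, but even a crude scan like this helps seed the exposure list that the monthly and quarterly management updates draw on.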

The cessation of LIBOR is a significant event impacting a broad set of financial products and market segments. Because it is intertwined in the products, technology, and models of a financial institution, LIBOR transition must be sufficiently planned, resources must be mobilized, and alternative reference rates must be implemented into every business and process.  

The “fourth quarter” of the LIBOR transition game is upon us and the stakes are too high to rely on the second string. Financial institutions cannot underestimate the operational, technical, legal, communication, and risk management work required to move existing transactions off LIBOR and prepare for alternative reference rates. Although these efforts to transition from LIBOR should already be in full swing, they will continue to require additional time and resources. Teams that seem to be in control of the game still need to finish strong.  

Financial institutions that have not begun a comprehensive LIBOR transition plan are running out of time and will need to mount a furious fourth-quarter comeback. It’s not too late, but with the last year of the LIBOR transition dawning, financial institutions that are behind in their planning need to hustle. No one can afford to lose this game. The costs of failing to prepare are simply too high. 

EDGE: Unexplained Behavior for Idaho HFA

People familiar with specified pool trading recognize pools serviced by the state housing finance authorities as an expanding sector with a rich set of behavior. The Idaho Housing Finance Authority leads all HFAs in servicing volume, with roughly $18B in Fannie, Freddie and Ginnie loans.[1]

In the October prepay report, an outsized acceleration in speeds on FNMA pools serviced by the Idaho HFA caught our attention because no similar acceleration was occurring in FHLMC or GNMA pools.

Speeds on Idaho HFA-serviced pools for GNMA (orange), FHLMC (blue), and FNMA (black)

Digging deeper, we analyzed a set of FNMA pools totaling around $3.5B current face that were serviced entirely by the Idaho HFA. These pools experienced a sharp dip in reported forbearance from factor dates August through October, dropping from nearly 6% in forbearance to zero before rebounding to 4.5% (black line). By comparison, FHLMC pools serviced by the Idaho HFA (blue line) show no such change.

Seeking to understand what was driving this mysterious dip/rebound, we noticed in the October report that 2.7% of the Fannie UPB serviced by the Idaho HFA was repurchased (involuntarily) on account of being 120 days delinquent, thus triggering a large involuntary prepayment which was borne by investors.

We suspect that in the September report, loans that were in COVID-forbearance were inadvertently reclassified as not in forbearance. In turn, this clerical error released these loans from the GSE’s moratorium on repurchasing forbearance-delinquent loans and triggered an automatic buyout of these 120+ day delinquent loans by FNMA.

We have asked FNMA for clarification on the matter and they have responded that they are looking into it. We will share information as soon as we are aware of it.

[1] Idaho HFA services other states’ housing finance authority loans, including Washington state and several others.


If you are interested in seeing variations on this theme, contact us. Using Edge, we can examine any loan characteristic and generate an S-curve, aging curve, or time series.


EDGE: An Update on Property Inspection Waivers

In June, we wrote about the significant prepay differences observed between loans with full inspection/appraisals and loans with property inspection waivers (PIW). In this short piece, we revisit these relationships to see if the speed differentials have persisted over the previous four months.

From an origination standpoint, PIWs continue to gain in popularity and are beginning to approach half of all new issuance (blue line). For refi loans this figure approaches 60% (green line).

Graph 1: Percent of loans with property inspection waivers, by balance. Source: RiskSpan Edge


Broadly speaking, PIW loans still pay significantly faster than loans with appraisals. In our June report, the differential was around 15 CPR for the wider cohort of borrowers. Since that time, the relationship has held steady. Loans with inspection waivers go up the S-curve faster than loans with appraisals, and top out around 13-18 CPR faster, depending on how deep in the money the borrower is.
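A stylized S-curve makes the shape of this comparison concrete. The logistic form and every parameter below are illustrative choices, not values fitted to Edge data; they are merely set so the PIW curve tops out roughly 15 CPR faster.

```python
# Stylized prepayment S-curve: CPR as a logistic function of refi
# incentive (bps). All parameters are illustrative, not fitted.
import math

def s_curve(incentive_bps, base=6.0, peak=40.0, mid=40.0, slope=0.04):
    """Prepayment speed (CPR) as a function of rate incentive in bps."""
    return base + (peak - base) / (1 + math.exp(-slope * (incentive_bps - mid)))

for inc in (0, 50, 100, 150):
    appraisal = s_curve(inc)
    piw = s_curve(inc, peak=55.0, slope=0.05)   # assumed faster, steeper curve
    print(f"{inc:>3} bps: appraisal {appraisal:4.1f}, PIW {piw:4.1f}, "
          f"diff {piw - appraisal:4.1f}")
```

The steeper slope on the PIW curve captures the observation that waiver loans "go up the S-curve faster," while the higher peak captures the 13-18 CPR gap for deep-in-the-money borrowers.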

Graph 2: S-curves for loans aged 6-48 months with balance >225k, waivers (black) vs inspection (blue). Source: RiskSpan Edge. 

The differential is smaller for purchase loans. The first chart, which reflects only purchase loans, shows PIW loans paying only 10-12 CPR faster than loans with full appraisals. In contrast, refi loans (second chart) continue to show a larger differential, ranging from 15 to 20 CPR, depending on how deep in the money the loan is.

Graph 3: Purchase loans with waivers (black) versus inspections (blue). Source: RiskSpan Edge.

Graph 4: Refi loans with waivers (black) versus inspections (blue). Source: RiskSpan Edge.

We also compared bank-serviced loans with non-bank serviced loans. The PIW speed difference was comparable between the two groups of servicers, although non-bank speeds were in general faster for both appraisal and PIW loans.

Inspection waivers have been around since 2017 but have only gained popularity in the last year. While investors disagree on what is driving the speed differential, it could be as simple as self-selection: a borrower who qualifies for an inspection waiver will also qualify upon refinancing, unless that borrower takes out a large cash-out refi that pushes the LTV above 70%.[1] In any event, the speed differential between loans with waivers and loans with full inspections has held over the last four months of factor updates. Given this, appraisal loans continue to offer significantly better prepay profiles than loans with inspection waivers at all refi incentives, along with a slightly flatter S-curve, implying lower option cost.

If you are interested in seeing variations on this theme, contact us. Using Edge, we can examine any loan characteristic and generate an S-curve, aging curve, or time series.

[1] No-cash-out refis qualify for waivers up to 90% LTV.

Why Model Validators Need to Care About the LIBOR Transition

The transition to the Secured Overnight Financing Rate (SOFR) as a LIBOR replacement after 2021 creates layers of risk for banks. Many of these risks are readily apparent, others less so. The factors banks must consider when choosing replacement rates and correctly implementing contractual fallback language make a seamless transition a daunting proposition. Though sometimes overlooked, model risk managers have an important role in ensuring this happens correctly and in a way that does not jeopardize the reliability of model outputs.   

LIBOR, SOFR and the need for transition

A quick refresher: The London Interbank Offered Rate (LIBOR) currently serves as the benchmark rate at which major global banks lend to one another on a short-term basis in the international interbank market. LIBOR is calculated by the Intercontinental Exchange (ICE) and published daily for five currencies and seven maturities. The most commonly referenced of these is the three-month U.S. dollar rate.

Accusations of manipulation by major banks going back as early as 2008, however, raised concerns about the sustainability of LIBOR. A committee convened by the Federal Reserve Board and the Federal Reserve Bank of New York in 2017—the Alternative Reference Rates Committee (ARRC)—identified a broad Treasury repurchase agreement (repo) financing rate as its preferred alternative reference rate to replace LIBOR after 2021. This repo rate (now known as SOFR) was chosen for its ability to provide liquidity to underlying markets and because the volumes underlying SOFR are far larger than any other U.S. money market. This combination of size and liquidity contributes to SOFR’s transparency and protects market participants from attempts at manipulation.

What Does This Mean for MRM?

Because the transition has potential bearing on so many layers of risk (market risk, operational risk, strategic risk, reputation risk, compliance risk, not to mention the myriad risks associated with mispricing assets), any model in a bank’s existing inventory that is tasked with gauging or remediating these risks is liable to be impacted. Understanding how, and the extent to which, models account for the LIBOR transition’s effect on pricing and other core processes is (or should be) of principal concern to model validators.

Ongoing Monitoring and Benchmarking

Regulatory guidance and model validation best practices require testing model inputs and benchmarking how the model performs with the selected inputs relative to alternatives. For this reason, the validation of any model whose outputs are sensitive to variable interest rates should include an assessment of how a replacement index (such as SOFR) and adjustment methodology were selected.

Model validators should be able to ascertain whether the model developer has documented enough evidence relating to:

  • Available reference rates and the appropriateness of each to the bank’s specific products
  • System capabilities for using these replacement rates with the bank’s products
  • Control risks associated with unavailable alternative rates

Fallback Language Considerations

Fallback language (contractual provisions that govern the process for selecting a replacement rate in the event of LIBOR termination) should also factor into a validator’s assessment of model inputs. While many existing fallback provisions can be frustratingly vague when it comes to dealing with a permanent cessation of LIBOR, validators of models that rely on reference rates as inputs have an obligation to determine compliance with fallback language containing clear and executable terms. These include:

  • Specific triggers to enact the replacement rate
  • Clarity regarding the replacement rate and spread adjustments
  • Permissible options under fallback language – and whether other options might be more appropriate than the one ultimately selected based on the potential for valuation changes, liquidity impact, hedging implications, system changes needed, and customer impact

In November 2019, the ARRC published finalized fallback language for residential adjustable-rate mortgages, bilateral business loans, floating rate notes, securitizations, and syndicated loans. It has also actively engaged with the International Swaps and Derivatives Association (ISDA) to finalize the fallback parameters for derivatives.

The ARRC also recommended benchmark replacement rates adjusted for spread that would replace the current benchmark due to circumstances that trigger the replacement. The recommendation included the following benchmark replacement waterfalls. Validators of models relying on these replacements may choose, as part of their best practices review, to determine the extent to which existing fallback provisions align with the recommendations.

Replacement | Description
Term SOFR + spread adjustment | Forward-looking term SOFR for the applicable corresponding tenor. Note: loan recommendations allow use of the next-longest tenor term SOFR rate if the corresponding tenor is unavailable.
Compounded SOFR + spread adjustment | Compounded average of daily SOFRs over the relevant period, depending on the tenor of USD LIBOR being replaced.
Relevant selected rate + spread adjustment | Rate selected by the Relevant Governmental Body, lender, or borrower and administrative agent.
Relevant ISDA replacement rate + spread adjustment | The applicable replacement rate (without spread adjustment) embedded in ISDA’s standard definitions.
Issuer, designated transaction representative, or noteholder replacement + spread adjustment | An identified party selects a replacement rate, in some cases considering any industry-accepted rate in the related market. Note: in certain circumstances this step may be omitted.
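As a sketch of the “Compounded SOFR + spread adjustment” row, daily compounding in arrears on an ACT/360 basis might look like the following. The rates and the spread adjustment are hypothetical, and real conventions (lookbacks, lockouts) are omitted for brevity.

```python
# Compounded SOFR in arrears, ACT/360. Friday's rate applies for 3 days
# over a weekend, which is why rates are given as (rate, days) pairs.
def compounded_sofr(daily_rates, spread=0.0):
    """daily_rates: (rate, days_applied) pairs; returns annualized rate."""
    growth, days = 1.0, 0
    for rate, nd in daily_rates:
        growth *= 1 + rate * nd / 360
        days += nd
    return (growth - 1) * 360 / days + spread

# One hypothetical week: four weekdays plus a Friday rate applied for 3 days
week = [(0.0005, 1)] * 4 + [(0.0005, 3)]
print(f"{compounded_sofr(week, spread=0.0026):.4%}")
```

The spread adjustment is what compensates for the historical basis between LIBOR (which embeds bank credit risk) and the nearly risk-free SOFR.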

Model risk managers can sometimes be lulled into believing that the validation of interest rate inputs consists solely of verifying their source and confirming that they have been faithfully brought into the model. Ultimately, however, model validators are responsible for verifying not only the provenance of model inputs but also their appropriateness. Consequently, ensuring a smooth transition to the most appropriate available reference rate replacement is of paramount importance to risk management efforts related to the models these rates feed.


Managing Machine Learning Model Risk

Though the terms are often used interchangeably in casual conversation, machine learning is a subset of artificial intelligence. Simply put, ML is the process of getting a computer to learn the properties of one dataset and generalize this “knowledge” to other datasets.

ML Financial Models

ML models have crept into virtually every corner of banking and finance — from fraud and money-laundering prevention to credit and prepayment forecasting, trading, servicing, and even marketing. These models take various forms (see Table 1, below). Modelers base their selection of a particular ML technique on a model’s objective and data availability.   

Table 1. ML Models and Application in Finance

Model | Application
Linear Regression | Credit Risk; Forecasting
Logistic Regression | Credit Risk
Monte Carlo Simulation | Capital Markets; ALM
Artificial Neural Networks | Scorecard and AML
Decision Tree Regression Models (Random Forest, Bagging) | Scorecard
Multinomial Logistic Regression | Prepayment Projection
Deep Learning | Prepayment Projection
Time Series Models | Capital Forecasting; Macroeconomic Forecasting
Linear Regression with ARIMA Errors | Capital Forecasting
Factor Models | Short-Rate Evolution
Fuzzy Matching | AML; OFAC
Linear Discriminant Analysis (LDA) | AML; OFAC
K-Means Clustering | AML; OFAC
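To make the “Fuzzy Matching” row concrete, a minimal AML/OFAC-style name screen can score a customer name against a watchlist entry with a normalized edit distance. The names below are made up, and production screening systems use far richer matching than this sketch.

```python
# Minimal fuzzy name match: Levenshtein distance normalized to [0, 1].
def levenshtein(a: str, b: str) -> int:
    """Number of single-character edits transforming a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def similarity(a: str, b: str) -> float:
    a, b = a.lower(), b.lower()
    return 1 - levenshtein(a, b) / max(len(a), len(b), 1)

print(similarity("Jon Smyth", "John Smith"))   # a high score flags for review
```

A score near 1.0 flags the pair for analyst review; the threshold itself is a tunable model parameter, which is one reason these screens fall under model risk management.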


ML models require large datasets relative to conventional models, as well as more sophisticated computer programming and econometric/statistical skills. ML model developers must have deep knowledge of the ML model they intend to use, its assumptions and limitations, and alternative approaches.


ML Model Risk

ML models present many of the same risks that accompany conventional models. As with any model, errors in design or application can lead to performance issues resulting in financial losses, poor decisions, and damage to reputation.

ML is all about algorithms. Failing to understand the mathematical aspects of these algorithms can lead to adopting inefficient optimization algorithms without knowing the nature or the interpretation of the optimization being solved. Making decisions under these circumstances increases model risk and can lead to unreliable outputs.

As sometimes befalls conventional regression models, ML models may perform well on the training data but not on the test data. Their complexity and high dimensionality make them especially susceptible to overfitting. The poor performance of some ML models when applied beyond the training dataset can translate into a huge source of risk.
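This train/test gap is easy to demonstrate on synthetic data: a degree-9 polynomial fit to ten noisy points drives training error toward zero while out-of-sample error balloons. The example below is a toy illustration, not a production diagnostic.

```python
# Toy overfitting demo: compare train vs. test MSE for a modest and a
# high-degree polynomial fit to noisy observations of sin(2*pi*x).
import numpy as np

rng = np.random.default_rng(0)
truth = lambda x: np.sin(2 * np.pi * x)
x_train = np.linspace(0, 1, 10)
x_test = np.linspace(0.02, 0.98, 200)
y_train = truth(x_train) + rng.normal(0, 0.2, x_train.size)

results = {}
for degree in (3, 9):
    coefs = np.polyfit(x_train, y_train, degree)
    results[degree] = {
        "train": float(np.mean((np.polyval(coefs, x_train) - y_train) ** 2)),
        "test": float(np.mean((np.polyval(coefs, x_test) - truth(x_test)) ** 2)),
    }
    print(degree, results[degree])
```

The degree-9 fit interpolates the noisy training points almost perfectly, so its training error is near zero while its test error is far larger, which is exactly the signature validators look for in outcomes analysis.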

Finally, ML models can give rise to unintended consequences when used inappropriately or incorrectly. Model risk is magnified when the goal of an ML model’s algorithm is not aligned with the business problem or fails to account for all relevant aspects of that problem. Model risk also arises when an ML model is used outside the environment for which it was designed. These risks include overstated or understated model outputs and lack of fairness. Table 2, below, presents a more comprehensive list of these risks.

Table 2. Potential risk from ML models

Bias toward protected groups
Use of poor-quality data
Job displacement
Models may produce socially unacceptable results
Automation may create model governance issues


Managing ML Model Risk


It may seem self-evident, but the first step in managing ML model risk consists of reliably identifying every model in the inventory that relies on machine learning. This exercise is not always as straightforward as it might seem. Successfully identifying all ML models requires MRM departments to incorporate the right information requests into their model determination or model assessment forms. These should include questions designed to identify specific considerations of ML model techniques, algorithms, platforms and capabilities. MRM departments need to adopt a consistent but flexible definition of what constitutes an ML model across the institution. Model developers, owners and users should be trained to identify ML models and the features that need to be reported in the model identification assessment form.

MRM’s next step involves risk assessing ML models in the inventory. As with traditional models, ML models should be risk assessed based on their complexity, materiality and frequency of use. Because of their complexity, however, ML models require an additional level of screening in order to account for data structure, level of algorithm sophistication, number of hyper-parameters, and how the models are calibrated. The questionnaire MRM uses to assess the risk of its conventional models often needs to be enhanced in order to adequately capture the additional risk dimensions introduced by ML models.

Managing ML model risk also involves ensuring not only that a clear model development and implementation process is in place but also that it is consistent with the business objective and the intended use of the models. Thorough documentation is important for any model, but the need to describe model theory, methodology, design and logic takes on added importance when it comes to ML models. This includes specifying the methodology (regression or classification), the type of model (linear regression, logistic regression, natural language processing, etc.), the resampling method (cross-validation, bootstrap) and the subset selection method, such as backward, forward or stepwise selection. Obviously, simply stating that the model “relies on a variety of machine learning techniques” is not going to pass muster.
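One way to enforce this level of specificity is to capture the documentation elements listed above as a structured record and check it for completeness. The sketch below assumes hypothetical field names and a hypothetical model ID; it is only meant to show how a documentation check can flag the vague "variety of machine learning techniques" description.

```python
# Hypothetical structured documentation record for an ML model.
# Field names and the model ID are illustrative assumptions.

model_doc = {
    "model_id": "ML-2021-007",
    "methodology": "classification",
    "model_type": "logistic regression",
    "resampling_method": "5-fold cross-validation",
    "subset_selection": "backward stepwise",
    "intended_use": "early-delinquency flagging",
}

# Completeness check: every documentation element the policy
# requires must be present, so a bare "relies on a variety of
# machine learning techniques" record would fail.
required = {"methodology", "model_type", "resampling_method",
            "subset_selection", "intended_use"}
missing = required - model_doc.keys()
print(sorted(missing))  # an empty list means the record is complete
```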

As with traditional models, developers must document the data source, quality and any transformations that are performed. This includes listing the data sources, normalization and sampling techniques, training and test data size, and the data dimension reduction technique (principal component, partial least squares, etc.), as well as the controls around them. The risk associated with using certain data should also be assessed.
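To make the normalization and dimension-reduction steps concrete, here is a minimal principal-component sketch on synthetic data. The data, dimensions and retained-component count are arbitrary assumptions; the point is that each step shown (centering, scaling, component extraction, variance retained) is something developers should record.

```python
import numpy as np

# Synthetic stand-in data: 200 observations, 10 raw features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))

# Normalization (center and scale) -- a transformation that
# should be documented along with its parameters.
Xn = (X - X.mean(axis=0)) / X.std(axis=0)

# Principal components via SVD; keep the top k components.
k = 3
U, S, Vt = np.linalg.svd(Xn, full_matrices=False)
X_reduced = Xn @ Vt[:k].T  # reduced design matrix, shape (200, 3)

# Share of total variance retained by the k components --
# a natural figure to report in the data documentation.
explained = (S[:k] ** 2).sum() / (S ** 2).sum()
print(X_reduced.shape, round(float(explained), 3))
```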

A model implementation plan and controls around the model should also be developed.

Finally, all model performance testing should be clearly stated, and the results documented. This helps assess whether the model is performing as intended and in line with its design and business objective. Limitations and calibrations around the models should also be documented.

Like traditional models, ML models require independent validation to ensure they are sound and performing as intended and to identify potential limitations. All components of ML models should be subject to validation, including conceptual soundness, outcomes analysis and ongoing monitoring.

Validators can assess the conceptual soundness of an ML model by evaluating its design and construction, focusing on the theory, methodology, assumptions and limitations, data quality and integrity, hyper-parameter calibration and overlays, bias and interpretability.

Validators can perform outcomes analysis by checking whether the model outputs are appropriate and in line with a priori expectations. Results of the performance metrics should also be assessed for accuracy and degree of precision. Performance metrics for ML models vary by model type. As with traditional predictive models, common performance metrics for ML models include the mean-squared error (MSE), Gini coefficient, entropy, the confusion matrix, and the receiver operating characteristic (ROC) curve.
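Two of the metrics named above are simple enough to compute directly. The sketch below builds a confusion matrix and an MSE of predicted probabilities from a toy set of binary predictions; the labels, probabilities and 0.5 cutoff are illustrative assumptions only.

```python
# Toy binary-classification outcomes (illustrative data only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.7, 0.8, 0.3]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # 0.5 cutoff assumed

# Confusion matrix cells: rows = actual class, columns = predicted.
tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))

# Mean-squared error of the predicted probabilities.
mse = sum((t - p) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

print([[tn, fp], [fn, tp]], round(mse, 3))
```

In practice a validator would compute these on a held-out sample and compare them against the developer's reported figures and a priori expectations.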

Outcomes analysis should also include out-of-sample testing, which can be conducted using cross-validation techniques. Finally, ongoing monitoring should be reviewed as a core element of the validation process. Validators should evaluate whether model use remains appropriate given changes in products, exposures and market conditions. Validators should also ensure performance metrics are being monitored regularly, commensurate with the inherent risk of the model and its frequency of use, and that a continuous performance monitoring plan exists and captures the most important metrics. A change control document and an access control document should also be available.
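The out-of-sample testing mentioned above rests on the k-fold split itself, which can be sketched in a few lines. The fold count and data size below are arbitrary assumptions, and the fit/score steps a validator would run on each fold are left as placeholders for the institution's own model.

```python
# Minimal k-fold split for out-of-sample testing. Fitting and
# scoring on each fold are placeholders for the actual model.

def kfold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    fold = n // k
    for i in range(k):
        stop = (i + 1) * fold if i < k - 1 else n  # last fold takes the remainder
        test = list(range(i * fold, stop))
        train = [j for j in range(n) if j not in set(test)]
        yield train, test

n, k = 10, 5
folds = list(kfold_indices(n, k))
# Sanity check: every observation appears in exactly one test fold,
# so each prediction is genuinely out-of-sample.
covered = sorted(i for _, test in folds for i in test)
print(len(folds), covered == list(range(n)))
```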

The principles outlined above will sound familiar to any experienced model validator—even one with no ML training or experience. ML models do not upend the framework of MRM best practices but rather add a layer of complexity to their implementation. This complexity requires MRM departments in many cases to adjust their existing procedures to properly identify ML models and suitably capture the risk emerging from them. As is almost always the case, aggressive staff training is needed to ensure that these well-considered process enhancements are faithfully executed and have their desired effect.
