Open Source Software for Mortgage Data Analysis

While open source has been around for decades, using open source software for mortgage data analysis is a recent trend. Financial institutions have traditionally been slow to adopt the latest data and technology innovations due to the strict regulatory and risk-averse nature of the industry, and open source has been no exception. As open source becomes more mainstream, however, many of our clients have come to us with questions regarding its viability within the mortgage industry.

The short answer is simple: open source has a lot of potential for the financial services and mortgage industries, particularly for data modeling and data analysis. Within our own organization, we frequently use open source data modeling tools for our proprietary models as well as models built for clients. While a degree of risk is inherent, prudent steps can be taken to mitigate it and to profit from the many worthwhile benefits of open source.

To address the common concerns that arise with open source, we’ll be publishing a series of blog posts aimed at alleviating these concerns and providing guidelines for utilizing open source software for data analysis within your organization. Some of the questions we’ll address include:

  • Can open source programming languages be applied to mortgage data modeling and data analysis?
  • What risks does open source expose me to and what can I do to mitigate them?
  • What are the pitfalls of open source and do the benefits outweigh them?
  • How does using open source software for mortgage data analysis affect the control and governance of my models?
  • What factors do I need to consider when deciding whether to use open source at my institution?

Throughout the series, we’ll also include examples of how RiskSpan has used open source software for mortgage data analysis, why we chose to use it, and what factors were considered. Before we dive in on the considerations for open source, we thought it would be helpful to offer an introduction to open source and provide some context around its birth and development within the financial industry.

What Is Open Source Software?

Software has conventionally been considered open source when the original code is made publicly available so that anyone can edit, enhance, or modify it freely. This original concept has recently been expanded to incorporate a larger movement built on values of collaboration, transparency, and community.

Open Source Software Vs Proprietary Software

Proprietary software refers to applications for which the source code is only accessible to those who created it. Thus, only the original author(s) has control over any updates or modifications. Outside players are barred from even viewing the code to protect the owners from copying and theft. To use proprietary software, users agree to a licensing agreement and typically pay a fee. The agreement legally binds the user to the owners’ terms and prevents the user from any actions the owners have not expressly permitted.

Open source software, on the other hand, gives any user free rein to view, copy, or modify it. The idea is to foster a community built on collaboration, allowing users to learn from each other and build on each other’s work. As with proprietary software, open source users must still agree to a licensing agreement, but the terms differ significantly from those of a proprietary license.1

History of Open Source Software

The idea of open source software first developed in the 1950s, when much of software development was done by computer scientists in higher education. In line with the value of sharing knowledge among academics, source code was openly accessible. By the 1960s, however, as the cost of software development increased, hardware companies were charging additional fees for software that used to be bundled with their products.

Change came again in the 1980s. At this point, it was clear that technology and software were important factors of the growing business economy. Technology leaders were frustrated with the increasing costs of software. In 1984, Richard Stallman launched the GNU Project with the purpose of creating a complete computer operating system with no limitations on the use of its source code. In 1991, the operating system now referred to as Linux was released.

The final tipping point came in 1997, when Eric Raymond published The Cathedral and the Bazaar, an essay later expanded into a book, in which he articulated the underlying principles behind open source software. His work was a driving factor in Netscape’s decision to release its source code to the public, inspired by the idea that allowing more people to find and fix bugs will improve the system for everyone. Following Netscape’s release, the term “open source software” was introduced in 1998.

In the data-driven economy of the past two decades, open source has played an ever-increasing role. The field of software development has evolved to embrace the values of open source. Open source has made it not only possible but easy for anyone to access and manipulate source code, improving our ability to create and share valuable software.2

Adoption of Open Source Software in Business

The growing relevance of open source software has also changed the way large organizations approach their software solutions. While open source software was at one point rare in an enterprise’s system, it’s now the norm. A survey conducted by Black Duck Software revealed that fewer than 3% of companies don’t rely on open source at all. Even the most conservative organizations are hopping on board the open source trend.3

In a blog post from June 2016, TechCrunch writes:

“Open software has already rooted itself deep within today’s Fortune 500, with many contributing back to the projects they adopt. We’re not just talking stalwarts like Google and Facebook; big companies like Walmart, GE, Merck, Goldman Sachs — even the federal government — are fleeing the safety of established tech vendors for the promises of greater control and capability with open software. These are real customers with real budgets demanding a new model of software.”4

The expected benefits of open source software are attracting all types of institutions, from small businesses to technology giants to governments. This shift away from proprietary software in favor of open source is streamlining business operations. As more companies make the switch, those that don’t will fall behind the times and likely be at a serious competitive disadvantage.

Open Source Software for Mortgage Data Analysis

Open source software is slowly finding its way into the financial services industry as well. We’ve observed that smaller entities that don’t have the budgets to buy expensive proprietary software have been turning to open source as a viable substitute. Smaller companies are either building software in house or turning to companies like RiskSpan to achieve a cost-effective solution. On the other hand, bigger companies with the resources to spare are also dabbling in open source. These companies have the technical expertise in house and give their skilled workers the freedom to experiment with open source software.

Within our own work, we see tremendous potential for open source software for mortgage data analysis. Open source data modeling tools like Python, R, and Julia are useful for analyzing mortgage loan and securitization data and identifying historical trends. We’ve used R to build models for our clients and we’re not the only ones: several of our clients are now building their DFAST challenger models using R.
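As a rough illustration of the kind of historical trend analysis described above, the following is a minimal Python sketch that computes a balance-weighted delinquency rate by origination year. The loan records, field layout, and function name are hypothetical and chosen only for illustration; this is not RiskSpan code, and a real analysis would read loan-level data from a file or database.

```python
from collections import defaultdict

# Hypothetical loan-level records: (origination_year, unpaid_balance, is_delinquent)
loans = [
    (2014, 200_000.0, False),
    (2014, 150_000.0, True),
    (2015, 300_000.0, False),
    (2015, 100_000.0, False),
    (2016, 250_000.0, True),
    (2016, 250_000.0, False),
]


def delinquency_rate_by_vintage(records):
    """Return {origination_year: delinquent balance / total balance}."""
    total = defaultdict(float)
    delinquent = defaultdict(float)
    for year, balance, is_delinquent in records:
        total[year] += balance
        if is_delinquent:
            delinquent[year] += balance
    # Sort by year so the trend reads chronologically.
    return {year: delinquent[year] / total[year] for year in sorted(total)}


rates = delinquency_rate_by_vintage(loans)
for year, rate in rates.items():
    print(f"{year}: {rate:.1%}")
```

The same grouping logic carries over directly to R or Julia; the point is only that a few lines of open source code can summarize loan-level data by vintage.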

Open source has grown enough in the past few years that more and more financial institutions will make the switch. While the risks associated with open source software will continue to give some organizations pause, the benefits of open source will soon outweigh those concerns. It seems open source is a trend that is here to stay, and luckily, it is a trend ripe with opportunity.





Balancing Internal and External Model Validation Resources

The question of “build versus buy” is every bit as applicable and challenging to model validation departments as it is to other areas of a financial institution. With no “one-size-fits-all” solution, banks are frequently faced with a balancing act between the use of internal and external model validation resources. This article is a guide for deciding between staffing a fully independent internal model validation department, outsourcing the entire operation, or a combination of the two.

Striking the appropriate balance is a function of at least five factors:

  1. control and independence
  2. hiring constraints
  3. cost
  4. financial risk
  5. external (regulatory, market, and other) considerations

Control and Independence

Internal validations bring a measure of control to the operation. Institutions understand the specific skill sets of their internal validation team beyond their resumes and can select the proper team for the needs of each model. Control also extends to the final report, its contents, and how findings are described and rated.

Further, the outcome and quality of internal validations may be more reliable. Because a bank must present and defend validation work to its regulators, low-quality work submitted by an external validator may need to be redone by yet another external validator, often on short notice, in order to bring the initial external model validation up to spec.

Elements of control, however, must sometimes be sacrificed in order to achieve independence. Institutions must be able to prove that the validator’s interests are independent from the model validation outcomes. While larger banks frequently have large, freestanding internal model validation departments whose organizational independence is clear and distinct, quantitative experts at smaller institutions must often wear multiple hats by necessity.

Ultimately the question of balancing control and independence can only be suitably addressed by determining whether internal personnel qualified to perform model validations are capable of operating without any stake in the outcome (and persuading examiners that this is, in fact, the case).

Hiring Constraints

Practically speaking, hiring constraints represent a major consideration. Hiring limitations may result from budgetary or other less obvious factors. Organizational limits aside, it is not always possible to hire employees with a needed skill set at a workable salary range at the time when they are needed. For smaller banks with limited bandwidth or larger banks that need to further expand, external model validation resources may be sought out of sheer necessity.


Cost

Cost is an important factor that can be tricky to quantify. Model validators tend to be highly specialized; many typically work on one type of model, for example, Basel models. If your bank is large enough and has enough Basel models to keep a Basel model validator busy with internal model validations all year, then it may be cost-effective to have a Basel model validator on staff. But if your Basel model validator is only busy for six months of the year, then a full-time Basel validator is only efficient if you have other projects that are suited to that validator’s experience and cost. To complicate things further, an employee’s cost is typically housed in one department, making it difficult from a budget perspective to balance an employee’s time and cost across departments.

If we were building a cost model to determine how many internal validators we should hire, the input variables would include:

  1. the number of models in our inventory
  2. the skills required to validate each model
  3. the risk classification of each model (i.e., how often validations must be completed)
  4. the average fully loaded salary expense for a model validator with those specific skills

Only by comparing the cost of external validations to the year-round costs associated with hiring personnel with the specialized knowledge required to validate a given type of model (e.g., credit models, market risk models, operational risk models, ALM models, Basel models, and BSA/AML models) can a bank arrive at a true apples-to-apples comparison.
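The comparison described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a complete cost model: the class and function names, the 46 working weeks per year, the effort per validation, the vendor fee, and the salary figure are all hypothetical inputs that an institution would replace with its own.

```python
import math
from dataclasses import dataclass

WORK_WEEKS_PER_YEAR = 46  # assumption: productive weeks per validator per year


@dataclass
class ModelType:
    name: str                    # skill set required, e.g. "Basel"
    model_count: int             # number of models of this type in the inventory
    validations_per_year: float  # per model, driven by its risk classification
    weeks_per_validation: float  # assumed internal effort per validation
    external_fee: float          # hypothetical vendor fee per validation
    loaded_salary: float         # fully loaded annual cost of one specialist


def annual_costs(m: ModelType):
    """Return (internal_cost, external_cost, internal_utilization)."""
    validations = m.model_count * m.validations_per_year
    weeks_needed = validations * m.weeks_per_validation
    # Specialists are hired in whole units, so round headcount up.
    headcount = math.ceil(weeks_needed / WORK_WEEKS_PER_YEAR)
    internal = headcount * m.loaded_salary
    external = validations * m.external_fee
    capacity = headcount * WORK_WEEKS_PER_YEAR
    utilization = weeks_needed / capacity if capacity else 0.0
    return internal, external, utilization


basel = ModelType("Basel", model_count=6, validations_per_year=0.5,
                  weeks_per_validation=8, external_fee=60_000,
                  loaded_salary=250_000)
internal, external, utilization = annual_costs(basel)
print(f"internal ${internal:,.0f} vs external ${external:,.0f}; "
      f"specialist utilization {utilization:.0%}")
```

With these illustrative numbers the specialist is busy only about half the year, which is the situation described above: hiring pays off only if other suitable work fills the remaining capacity.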

Financial Risk

While cost is the upfront expense of internal or external model validations, financial risk accounts for the probability of unforeseen circumstances. Assume that your bank is staffed with internal validators and your team of internal validators can handle the schedule of model validations (validation projects are equally spaced throughout the year). However, operations may need to deploy a new version of a model or a new model on a schedule that requires a validation at a previously unscheduled time with no flexibility. In this case, your bank may need to perform an external validation in addition to managing and paying a fully-staffed team of internal validators.

A cost model for determining whether to hire additional internal validators should include a factor for the probability that models will need to be validated off-schedule, resulting in unforeseen external validation costs. On the other hand, a cost model might also consider the probability that an external validator’s product will be inferior and incur costs associated with required remediation.
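One simple way to fold these probabilities into a cost comparison is to work with expected costs. The sketch below assumes a single possible off-schedule validation and a single possible remediation event per year; the function names, probabilities, and dollar amounts are hypothetical.

```python
def expected_internal_cost(staff_cost, p_off_schedule, external_fee):
    """Staff cost plus the expected cost of one unplanned external validation."""
    return staff_cost + p_off_schedule * external_fee


def expected_external_cost(planned_fees, p_rework, rework_cost):
    """Planned vendor fees plus the expected cost of remediating inferior work."""
    return planned_fees + p_rework * rework_cost


# Hypothetical inputs for illustration only.
internal_expected = expected_internal_cost(
    staff_cost=250_000, p_off_schedule=0.20, external_fee=60_000)
external_expected = expected_external_cost(
    planned_fees=180_000, p_rework=0.10, rework_cost=60_000)

print(f"expected internal cost: ${internal_expected:,.0f}")
print(f"expected external cost: ${external_expected:,.0f}")
```

A fuller model might allow several off-schedule events per year or let the probabilities vary by model type, but the structure of the comparison stays the same.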

External Risks

External risks are typically financial risks caused by regulatory, market, and other factors outside an institution’s direct control. The risk of a changing regulatory environment under a new presidential administration is always real and uncertainty clearly abounds as market participants (and others) attempt to predict President Trump’s priorities. Changes may include exemptions for regional banks from certain Dodd-Frank requirements; the administration has clearly signaled its intent to loosen regulations generally. Even though model validation will always be a best practice, these possibilities may influence a bank’s decision to staff an internal model validation team.

Recent regulatory trends can also influence validator hiring decisions. For example, our work with various banks over the past 12-18 months has revealed that regulators are trending toward requiring larger sample sizes for benchmarking and back-testing. Given the significant effort already associated with these activities, larger sample sizes could ultimately lower the number of model validations internal resources can complete per year. Funding external validations may become more expensive, as well.

Another industry trend is the growing acceptance of limited-scope validations. If only minimal model changes have occurred since a prior validation, the scope of a scheduled validation may be limited to the impact of these changes. If remediation activities were required by a prior validation, the scope may be limited to confirming that these changes were effectively implemented. This seemingly common-sense approach to model validations by regulators is a welcome trend and could reduce the number of internal and external validations required.

Joint Validations

In addition to reduced-scope validations, some of our clients have sought to reduce costs by combining internal and external resources. This enables institutions to limit hiring to validators without model-specific or highly quantitative skills. Such internal validators can typically validate a large number of lower-risk, less technical models independently.

For higher-risk, more technical models, such as ALM models, the internal validator may review the controls and documentation sufficiently, leaving the more technical portions of the validation—conceptual soundness, process verification, benchmarking, and back-testing, for example—to an external validator with specific expertise. In these cases, reports are produced jointly with internal and external validators each contributing the sections pertaining to procedures that they performed.

The resulting report often has the dual benefit of being more economical than a report generated externally and more defensible than one that relies solely on internal resources who may lack the specific domain knowledge necessary.


Model risk managers have limited time, resources, and budgets and face unending pressure from management and regulators. Striking an efficient resource-balancing strategy is critically important to consistently producing high-quality model validation reports on schedule and within budgets. The question of using internal vs. external model validation resources is not an either/or proposition. In considering it, we recommend that model risk management (MRM) professionals

  • consider the points above and initiate risk tolerance and budget conversations within the MRM framework.
  • reach out to vendors who have the skills to assist with your high-risk models, even if there is not an immediate need. Some institutions like to try out a model validation provider on a few low- or moderate-risk models to get a sense of their capabilities.
  • consider internal staffing to meet basic model validation needs and external vendors (either for full validations or outsourced staff) to fill gaps in expertise.

Credit Risk Transfer: Front End Execution – Why Does It Matter?

This article was originally published on the GoRion blog.

Last month I provided an overview of Credit Risk Transfer (CRT) activities as outlined in the Federal Housing Finance Agency (FHFA) guidance to Fannie Mae and Freddie Mac (the GSEs). This three-year-old program has shown great promise and success in creating a deeper residential credit investor segment and has enabled increments of risk to be shifted from the GSEs and taxpayers to the private sector.

The FHFA issued an RFI to solicit feedback from stakeholders on proposals from the GSEs to adopt additional front-end credit risk transfer structures and to consider additional credit risk transfer policy issues. There is firm interest in this new and growing execution for risk transfer by investors who have confidence in the underwriting and servicing of mortgage loans through new and improved GSE standards.

In addition to the industry’s appetite for back-end CRT, there is a growing focus on increasing risk sharing at the front-end of the origination transaction. In particular, the mortgage industry and mortgage insurers (MIs) are interested in exploring risk sharing more actively on the front-end of the mortgage process. The MIs’ desire to participate in this new and growing market opportunity would extend their traditional coverage to levels much deeper than the standard 30% coverage.


Front-End Credit Risk Transfer

In 2016, FHFA expanded the GSE scorecards to broaden the types of loans and risk transfer structures covered, including front-end CRT. In addition to many prescriptive outlines on CRT, the scorecards included wording such as “…Work with FHFA to conduct an analysis and assessment of front-end credit risk transfer transactions, including work to support a forthcoming FHFA Request for Input. Work with FHFA to engage key stakeholders and solicit their feedback. After conducting the necessary analysis and assessment, work with FHFA to take appropriate steps to continue front-end credit risk transfer transactions.”

Two additional ways to work with risk sharing on the front-end are using 1) recourse transactions and 2) deeper mortgage insurance.


Recourse Transaction

Recourse as a form of credit enhancement is not a new concept. In years past, some institutions would sell loans with recourse to the GSEs but it was usually determined to be capital intensive and not an efficient way of selling loans to the secondary markets. However, some of the non-depositories have found recourse to be an attractive way to sell loans to the GSEs.

From 2013 through December 2015, the GSEs executed 12 deals with recourse on $12.6 billion in UPB. The pricing and structures differ considerably, and the transactions are not transparent. While recourse can be attractive to both parties if structured adequately, the transactions are not as scalable, and each deal requires significant review and assessment. Arguments against recourse note that it diminishes opportunities for small to medium-sized players that would like to participate in this new form of reduced g-fee structure and front-end CRT transaction.

PennyMac shared its perspective on this activity at a recent CRT conference. It uses the recourse structure with Fannie Mae, which leverages its capital structure and allows flexibility. Importantly, PennyMac reminds us that both parties’ interests are aligned, as there is skin in the game for quality originations.


Deeper Mortgage Insurance

The GSE model has a significant amount of counter-party risk with MIs through their standard business offerings. Through their charters, the GSEs require credit enhancements on loans with a Loan to Value (LTV) ratio of 80% or higher. This traditionally plays out to be 30% first loss coverage of such loans. For example, a 95% LTV loan is insured down to 65%. The mortgage insurers are integral to most of the GSEs’ higher LTV books of business. Per the RFI, as of December 2015, the MI industry collectively has counter-party exposure of $185.5 billion, covering $724.5 billion of loans. So as a general course of business, this is already a risk they share on higher LTV lending without adding any additional exposure.
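As a quick arithmetic check on the RFI figures quoted above, dividing the MI industry’s exposure by the balance of loans it covers gives the implied aggregate coverage. The variable names below are ours, and the result is only a portfolio-level ratio, not a statement about coverage on any individual loan.

```python
# Figures as quoted from the RFI (December 2015), in billions of dollars.
mi_exposure_bn = 185.5
insured_loans_bn = 724.5

# Implied aggregate coverage: exposure as a share of insured balance.
implied_coverage = mi_exposure_bn / insured_loans_bn
print(f"implied aggregate MI coverage: {implied_coverage:.1%}")
```

The ratio comes out to roughly a quarter of the insured balance, somewhat below the 30% figure cited for standard coverage. That would be consistent with coverage levels varying across loans, though this is our interpretation rather than something the figures above state.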

During the crisis, the MIs were unable to pay initial claims dollar for dollar. This has caused hesitation about embracing a more robust model with more counter-party risk than the model of today. It is well documented that the MIs did pay a great deal of claims and buffered the GSEs by taking the first loss on billions of dollars before any losses were incurred by the GSEs. While much of that has been paid back, memories are long, and this has given pause as to how to value the insurance, which is different from the back-end transactions. Today the MI industry is in much better shape through capital raises and increased standards directed from the GSEs and state regulators. (Our recent blog post on mortgage insurance haircuts explores this phenomenon in greater detail.)

FHFA instituted the PMIERs, which required higher capital for the MI business transacted with the GSEs. The state regulators also increased the regulatory capital for the residential insurance sector, and today the industry has strengthened its hand as a partner to the GSEs. In fact, the industry has new entrants that do not have the legacy books of losses, which also adds new opportunities for the GSEs to expand the counter-party pools.

The MI companies can offer a front-end model and play a more significant role in the risk share business by providing deeper MI on the front-end (to 50% coverage) as a way of de-levering the GSEs and, ultimately, the taxpayers. And, like the GSEs, MIs may also participate in reinsurance markets to shed risk and balance out their own portfolios. Other market participants may also participate in this type of transaction, and we will observe what opportunities present themselves in the longer term. While nothing is ever black and white, there appear to be benefits to expanding the risk share efforts to the front-end of the business.



1) Strong execution: Pricing and executing on mortgage risk at the front of the origination will allow for options in a counter-cyclical, volatile market.

2) Transparency: Moving risk metrics and pricing to the front-end will drive more front-end price transparency for mortgage credit risk.

3) Inclusive institutional partnering: Smaller entities may participate in a front-end risk share effort thereby creating opportunities outside of the largest financial institutions.

4) Inclusive borrower process: Front-end CRT may reach more borrowers and provide options as more institutions can take part in this opportunity.

5) Expands options for CRT in pilot phase: By driving the risk share to the front-end, the GSEs reach their goals in de-risking their credit guarantee while providing a timely trade-off of G-fee and MI pricing on the front-end of the transaction.

As part of the RFI response, the trade association representing the MIs summarized the principal benefits of front-end CRT as follows:

  1. Increased CRT availability and market stability
  2. Reduced first-loss holding risk
  3. Beneficial stakeholder familiarity and equitable access
  4. Increased transparency.

The full letter may be found at

In summary, whether it is recourse to a lending institution or participation in the front-end MI cost structure, pricing this risk at origination will continue to bring forward price discovery and transparency. This means the consumer and lender will be closer to the true credit costs of origination. With experience pricing and executing on CRT, it may become clearer where the differential cost of credit lies. The additional impact of driving more front-end CRT will be scalability and less process on the back-end for the GSEs. By leveraging the front-end model, GSEs will reach more borrowers and utilize a wider array of lending partners through this process.

As of November 8, we experienced a historic election which may take us in new directions. However, credit risk transfer is an option that may be used in the future regardless of GSE status, even if they 1) revert back to the old model with recap and release; 2) re-emerge after housing reform post legislation; or 3) remain in conservatorship and continue to be led by FHFA down this path.

**Footnote: All data was retrieved from the Federal Housing Finance Agency, FHFA, Single Family Credit Risk Transfer request for input, June 2016. More information may be found at

This is the second installment in a monthly Credit Risk Transfer (CRT) series on the GoRion blog. CRT is a significant accomplishment in bringing back private capital to the housing sector. This young effort, three years strong, has already shown promising investor appetite while discussions are underway to expand offerings to front-end risk share executions. My goal in this series is to share insights around CRT as it evolves with the private sector.

Validating Model Inputs: How Much Is Enough?

In some respects, the OCC 2011-12/SR 11-7 mandate to verify model inputs could not be any more straightforward: “Process verification … includes verifying that internal and external data inputs continue to be accurate, complete, consistent with model purpose and design, and of the highest quality available.” From a logical perspective, this requirement is unambiguous and non-controversial. After all, the reliability of a model’s outputs cannot be any better than the quality of its inputs.

From a functional perspective, however, it raises practical questions around the amount of work that needs to be done in order to consider a particular input “verified.” Take the example of a Housing Price Index (HPI) input assumption. It could be that the modeler obtains the HPI assumption from the bank’s finance department, which purchases it from an analytics firm. What is the model validator’s responsibility? Is it sufficient to verify that the HPI input matches the data of the finance department that supplied it? If not, is it enough to verify that the finance department’s HPI data matches the data provided by its analytics vendor? If not, is it necessary to validate the analytics firm’s model for generating HPI assumptions?

It depends. Just as model risk increases with greater model complexity, higher uncertainty about inputs and assumptions, broader use, and larger potential impact, input risk increases with increases in input complexity and uncertainty. The risk of any specific input also rises as model outputs become increasingly sensitive to it.

Validating Model Inputs Best Practices

So how much validation of model inputs is enough? As with the management of other risks, the level of validation or control should be dictated by the magnitude or impact of the risk. Like so much else in model validation, no ‘one size fits all’ approach applies to determining the appropriate level of validation of model inputs and assumptions. In addition to cost/benefit considerations, model validators should consider at least four factors for mitigating the risk of input and assumption errors leading to inaccurate outputs.

  • Complexity of inputs
  • Manual manipulation of inputs from source system prior to input into model
  • Reliability of source system
  • Relative importance of the input to the model’s outputs (i.e., sensitivity)

Consideration 1: Complexity of Inputs

The greater the complexity of the model’s inputs and assumptions, the greater the risk of errors. For example, complex yield curves with multiple data points will be inherently subject to greater risk of inaccuracy than binary inputs such as “yes” and “no.” In general, the more complex an input is, the more scrutiny it requires and the “further back” a validator should look to verify its origin and reasonability.

Consideration 2: Manual Manipulation of Inputs from Source System Prior to Input into Model

Input data often requires modification from the source system to facilitate input into the model. More handling and manual modifications increase the likelihood of error. For example, if a position input is manually copied from Bloomberg and then subjected to a manual process of modification of format to enable uploading to the model, there is a greater likelihood of error than if the position input is extracted automatically via an API. The accuracy of the input should be verified in either case, but the more manual handling and manipulation of data that occurs, the more comprehensive the testing should be. In this example, more comprehensive testing would likely take the form of a larger sample size.

In addition, the controls over the processes to extract, transform, and load data from a source system into the model will impact the risk of error. More mature and effective controls, including automation and reconciliation, will decrease the likelihood of error and therefore likely require a lighter verification procedure.
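A reconciliation control of the kind mentioned above can be as simple as comparing what the source system holds with what was actually loaded into the model. The sketch below assumes both sides can be expressed as dictionaries keyed by a position identifier; the function name, identifiers, and values are hypothetical.

```python
def reconcile(source, model_input, tolerance=1e-9):
    """Compare keyed numeric values and return a list of (key, issue) exceptions."""
    exceptions = []
    for key, source_value in source.items():
        if key not in model_input:
            exceptions.append((key, "missing from model input"))
        elif abs(model_input[key] - source_value) > tolerance:
            exceptions.append(
                (key, f"value mismatch: source={source_value}, "
                      f"input={model_input[key]}"))
    # Records present in the model but absent from the source are also suspect.
    for key in model_input:
        if key not in source:
            exceptions.append((key, "not found in source"))
    return exceptions


# Hypothetical data: POS2 has transposed digits, POS3 was dropped, POS4 is unexplained.
source_positions = {"POS1": 1000.0, "POS2": 250.5, "POS3": 75.0}
model_positions = {"POS1": 1000.0, "POS2": 205.5, "POS4": 10.0}

issues = reconcile(source_positions, model_positions)
for key, issue in issues:
    print(key, "->", issue)
```

Where a check like this runs automatically on every load, a validator may reasonably test a smaller sample; where data is keyed or reformatted by hand, the same comparison applied to a larger sample is one way to carry out the more comprehensive testing described above.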

Consideration 3: Reliability of Source Systems

More mature and stable source systems generally produce more consistently reliable results. Conversely, newer systems and those that have produced erroneous results increase the risk of error. The results of previous validation of inputs, from prior model validations or from third parties, including internal audit and compliance, can be used as an indicator of the reliability of information from source systems and the magnitude of input risk. The greater the number of issues identified, the greater the risk, and the more likely it is that a validator should seek to drill deeper into the fundamental sources of source data.

Consideration 4: Output Sensitivity to Inputs

No matter how reliable an input’s source system is deemed to be, or how much manual manipulation the input is subjected to, perhaps the most important consideration is the individual input’s power to affect the model’s outputs. Returning to our original example, if a 50 percent change in the HPI assumption has only a negligible impact on the model’s outputs, then a quick verification against the report supplied by the finance department may be sufficient. If, however, the model’s outputs are extremely sensitive to even small shifts in the HPI assumption, then additional testing is likely warranted, perhaps even including a validation of the analytics vendor’s HPI model (along with all of its inputs).
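The sensitivity check in the HPI example can be sketched as a simple shock test: run the model at the base assumption, run it again with the assumption shifted by 50 percent, and compare. The toy loss function below is a hypothetical stand-in for the model under validation, and its coefficients are invented for illustration.

```python
def toy_loss_model(hpi_growth, balance=1_000_000.0):
    """Hypothetical stand-in model: expected loss falls as HPI growth rises."""
    loss_rate = max(0.0, 0.02 - 0.1 * hpi_growth)
    return balance * loss_rate


def relative_sensitivity(model, base_input, shock=0.5):
    """Relative change in model output when the input is scaled by (1 + shock)."""
    base_output = model(base_input)
    shocked_output = model(base_input * (1 + shock))
    return (shocked_output - base_output) / base_output


# A 50 percent increase in an assumed 3% annual HPI growth rate.
sensitivity = relative_sensitivity(toy_loss_model, base_input=0.03, shock=0.5)
print(f"output change for a 50% HPI shock: {sensitivity:.1%}")
```

Whether a given output change counts as negligible is a judgment call for the validator; the calculation only makes the size of the effect explicit so that the depth of input testing can be chosen accordingly.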

A Cost-Effective Model Input Validation Strategy

When it comes to verifying model inputs, there is no theoretical limitation to the lengths to which a model validator can go. Model risk managers, who do not have unlimited time or budgets, would benefit from applying practical limits to validation procedures using a risk-based approach to determine the most cost-effective strategies to ensure that models are sufficiently validated. Applying the considerations listed above on a case-by-case basis will help validators appropriately define and scope model input reviews in a manner commensurate with appropriate risk management principles.

Performance Testing: Benchmarking Vs. Back-Testing

When someone asks you what a model validation is, what is the first thing you think of? If you are like most, you immediately think of performance metrics: those quantitative indicators that tell you not only whether the model is working as intended, but also how it performs over time and compared to others. Performance testing is the core of any model validation and generally consists of the following components:

  • Benchmarking
  • Back-testing
  • Sensitivity Analysis
  • Stress Testing

Sensitivity analysis and stress testing, while critical to any model validation’s performance testing, will be covered by a future article. This post will focus on the relative virtues of benchmarking versus back-testing—seeking to define what each is, when and how each should be used, and how to make the best use of the results of each.


Benchmarking compares the model being validated to some other model or metric. The type of benchmark used will vary, as all performance testing does, with the nature, use, and type of model being validated. Because of the performance information it provides, benchmarking should be used in some form whenever a suitable benchmark can be found.

Choosing a Benchmark

Choosing what kind of benchmark to use within a model validation can sometimes be a very daunting task. Like all testing within a model validation, the kind of benchmark to use depends on the type of model being tested. Benchmarking takes many forms and may entail comparing the model’s outputs to:

  • The model’s previous version
  • An externally produced model
  • A model built by the validator
  • Other models and methodologies considered by the model developers, but not chosen
  • Industry best practice
  • Thresholds and expectations of the model’s performance

One of the most used benchmarking approaches is to compare a new model’s outputs to those of the version of the model it is replacing. It remains very common throughout the industry for models to be replaced due to a deterioration of performance, change in risk appetite, new regulatory guidance, need to capture new variables, or the availability of new sets of information. In these cases, it is important to not only document but also prove that the new model performs better and does not have the same issues that triggered the old model’s replacement.

Another common benchmarking approach compares the model’s outputs to those of an external “challenger” model (or one built by the validator) which functions with the same objective and data. This approach is likely to return more apt output comparisons than those generated by benchmarking against older versions that are likely to be out of date since the challenger model is developed and updated with the same data as the champion model.
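A champion/challenger comparison of this kind can be sketched as follows. The function name, the 10 percent relative tolerance, and the sample outputs are all assumptions made for illustration; an appropriate tolerance depends on the model and its use.

```python
def compare_to_benchmark(champion, challenger, tolerance=0.10):
    """Flag observations where the champion's output differs from the
    challenger's by more than `tolerance` in relative terms."""
    rows = []
    for i, (c, b) in enumerate(zip(champion, challenger)):
        if b == 0:
            # Relative difference is undefined against a zero benchmark.
            rel_diff = 0.0 if c == 0 else float("inf")
        else:
            rel_diff = (c - b) / b
        rows.append({"index": i, "rel_diff": rel_diff,
                     "flag": abs(rel_diff) > tolerance})
    return rows


# Hypothetical outputs (e.g., projected default rates) on the same four segments.
champion_outputs = [0.020, 0.035, 0.050, 0.080]
challenger_outputs = [0.021, 0.034, 0.065, 0.079]

comparison = compare_to_benchmark(champion_outputs, challenger_outputs)
flagged = [row["index"] for row in comparison if row["flag"]]
```

Flagged segments are starting points for investigation rather than findings in themselves, since either model may be the one that is off.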

Another set of benchmarks that can be used in model validation consists of the other models or methodologies that the model developers considered for the model being validated but ultimately did not use. As a best practice, model developers should always list any alternative methodologies, theories, or data that were omitted from the model’s final version. Additionally, model validators should always leverage their experience and understanding of current industry best practices, along with any analysis previously completed on similar models. The validation should then use these alternatives as benchmarks for the model being validated.

Model validators have multiple, distinct ways to incorporate benchmarking into their analysis. The use of the different types of benchmarking discussed here should be based on the type of model, its objective, and the validator’s best judgment. If a model cannot be reasonably benchmarked, then the validator should record why not and discuss the resulting limitations of the validation.


Back-testing is used to measure model outcomes. Here, instead of measuring performance through a comparison, the validator is specifically measuring whether the model is both working as intended and accurate. Back-testing can take many forms based on the model’s objective. As with benchmarking, back-testing should be a part of every full-scope model validation to the extent possible.

What Back-Tests to Perform

As a form of outcomes analysis, back-testing provides quantitative metrics which measure the performance of a model’s forecast, the accuracy of its estimates, or its ability to rank-order risk. For instance, if a model produces forecasts for a given variable, back-testing would involve comparing the model’s forecast values against actual outcomes, thus indicating its accuracy.
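For a forecasting model, the comparison of forecasts to actual outcomes might be summarized as in the minimal sketch below. The choice of metrics (bias, mean absolute error, root mean squared error) and the sample figures are assumptions for illustration, not a prescribed set.

```python
import math


def backtest_forecast(forecasts, actuals):
    """Summarize forecast error against realized outcomes."""
    if not forecasts or len(forecasts) != len(actuals):
        raise ValueError("forecasts and actuals must be non-empty and equal length")
    errors = [f - a for f, a in zip(forecasts, actuals)]
    n = len(errors)
    return {
        "bias": sum(errors) / n,                            # average over/under-forecast
        "mae": sum(abs(e) for e in errors) / n,             # mean absolute error
        "rmse": math.sqrt(sum(e * e for e in errors) / n),  # penalizes large misses
    }


# Hypothetical forecast vs. observed values for four periods.
forecast_values = [10.0, 12.0, 15.0, 11.0]
actual_values = [9.0, 13.0, 12.0, 11.0]

metrics = backtest_forecast(forecast_values, actual_values)
```

A positive bias here would indicate that the model tends to over-forecast, which is often as informative to a validator as the overall error magnitude.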

A related function of model back-testing evaluates the ability of a given model to adequately measure risk. This risk could take any of several forms, from the probability of a given borrower to default to the likelihood of a large loss during a given trading day. To back-test a model’s ability to capture risk exposure, it is important first to collect the right data. In order to back-test a probability of default model, for example, data would need to be collected containing cases where borrowers have actually defaulted in order to test the model’s predictions.

Back-testing models that assign borrowers to various risk levels necessitates some special considerations. Back-testing these and other models that seek to rank-order risk involves looking at the model’s performance history and examining its accuracy through its ability to rank and order the risk. This can involve analyzing both Type 1 (false positive) and Type 2 (false negative) statistical errors against the true positive and true negative rates for a given model. Common statistical measures used for this type of back-testing analysis include, but are not limited to, the Kolmogorov-Smirnov (KS) statistic, the Brier score, and the Receiver Operating Characteristic (ROC) curve.
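The three measures named above can be computed directly from predicted probabilities and observed outcomes. The following is a minimal, dependency-free sketch; the function names and the six-borrower sample are hypothetical, and in practice a statistics library would typically be used instead.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def roc_auc(probs, outcomes):
    """Area under the ROC curve: the chance that a randomly chosen defaulter
    scores higher than a randomly chosen non-defaulter (ties count half)."""
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def ks_statistic(probs, outcomes):
    """Largest gap between the cumulative score distributions of
    defaulters and non-defaulters."""
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    gaps = []
    for t in sorted(set(probs)):
        cdf_pos = sum(p <= t for p in pos) / len(pos)
        cdf_neg = sum(q <= t for q in neg) / len(neg)
        gaps.append(abs(cdf_pos - cdf_neg))
    return max(gaps)


# Hypothetical predicted default probabilities and observed outcomes (1 = default).
pd_estimates = [0.05, 0.10, 0.20, 0.30, 0.60, 0.80]
defaulted = [0, 0, 0, 1, 0, 1]

auc = roc_auc(pd_estimates, defaulted)
ks = ks_statistic(pd_estimates, defaulted)
brier = brier_score(pd_estimates, defaulted)
```

The AUC and KS statistic speak to rank-ordering ability, while the Brier score also reflects calibration, so the three are usually read together rather than in isolation.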

Benchmarking vs. Back-Testing

Back-testing measures a model’s outcome and accuracy against real-world observations, while benchmarking measures those outcomes against those of other models or metrics. Some overlap exists when the benchmarking includes comparing how well different models’ outputs back-test against real-world observations and the chosen benchmark. This overlap sometimes leads people to mistakenly conclude that model validations can rely on just one method. In reality, however, back-testing and benchmarking should ideally be performed together in order to bring their individual benefits to bear in evaluating the model’s overall performance. The decision, optimally, should not be whether to create a benchmark or to perform back-testing. Rather, the decision should be what form both benchmarking and back-testing should take.

While benchmarking and back-testing are complementary exercises that should not be viewed as mutually exclusive, they sometimes appear to produce conflicting results. What should a model validator do, for example, if the model appears to back-test well against real-world observations but does not benchmark particularly well against similar model outputs? What about a model that returns results similar to those of other benchmark models but does not back-test well? In the first scenario, the model owner can derive a measure of comfort from the knowledge that the model performs well in hindsight. But the owner also runs the very real risk of being “out on an island” if the model turns out to be wrong. The second scenario affords the comfort of company in the model’s projections. But what if the models are all wrong together?

Scenarios where benchmarking and back-testing do not produce complementary results are not common, but they do happen. In these situations, it becomes incumbent on model validators to determine whether back-testing results should trump benchmarking results (or vice-versa) or if they should simply temper one another. The course to take may be dictated by circumstances. For example, a model validator may conclude that macro-economic indicators are changing to the point that a model which back-tests favorably is not an advisable tool because it is not tuned to the expected forward-looking conditions. This could explain why a model that back-tests favorably remains a benchmarking outlier if the benchmark models are taking into account what the subject model is missing. On the other hand, there are scenarios where it is reasonable to conclude that back-testing results trump benchmarking results. After all, most firms would rather have an accurate model than one that lines up with all the others.

As seen in our discussion here, benchmarking and back-testing can sometimes produce distinct or similar metrics depending on the model being validated. While those differences or similarities can sometimes be significant, both benchmarking and back-testing provide critical complementary information about a model’s overall performance. So when approaching a model validation and determining its scope, your choice should be what form of benchmarking and back-testing needs to be done, rather than whether one needs to be performed versus the other.

The Real Reason Low Down Payment VA loans Don’t Default Like Comparable FHA Loans

On this Veterans Day, I was reminded of the Urban Institute’s 2014 article on VA loan performance and its explanation of why VA loans outperform FHA loans.[1] The article illustrated how VA loans outperformed comparable FHA loans even after controlling for key variables like FICO, income, and DTI. The article further explained the structural differences and similarities between the veterans program and FHA loans, with similarities that include owner occupancy, loan size, and low down payments.

The analysis was well thought out and clearly showed how VA loans outperformed FHA. The article took great care to understand how FICOs, DTI, and income levels could impact default performance. The article further demonstrated VA outperformance wasn’t a recent or short-lived trend.

Its concluding rationale for superior performance was based on:

  1. VA’s residual income test
  2. VA’s loss mitigation efforts
  3. Lender’s “skin in the game”
  4. Lender concentration
  5. Military culture

Two of these reasons assume VA’s internal policies made the difference. Two of the reasons assume lenders’ ability to self-regulate credit policy or capital had an influence. The final reason centered on veterans as a social group with differing values that contributed to the difference.

As someone who has spent his mortgage career modeling credit and counterparty risks and managing credit underwriters, I find it hard to see, given the lack of good analytical data or anecdotal evidence, how these reasons can account for VA’s strong default performance relative to FHA. While I understand their rationale, I don’t find it a compelling explanation.

VA Internal Policies

The residual income test is a tertiary measure used by lenders to qualify borrowers. It is used after applying the traditional MTI (mortgage payment to income) and DTI (total debt to income) ratios. The test mandates that the borrowing veteran have a minimum net income after paying all mortgage and debt payments. But as a third-level underwriting test, it is hard to see how it could be the source of so much of the improvement in default performance.

The same goes for VA’s loss mitigation outreach efforts. It sounds good in the press, but it is just a secondary loss mitigation effort used in conjunction with the servicer’s own loss mitigation efforts. Perhaps it is responsible for some of the incremental improvement, but it’s hard to believe it accounts for much more.

Lenders’ Ability to Self-Regulate

“Skin in the Game” attempts to explain how lenders manage their retained credit risk when the VA insurance payment is insufficient to cover all loan losses.[2] The “Skin in the Game” theory holds that when lenders have exposure to losses, they will adjust their lending policies to reduce their risk. This translates into tighter credit policies, like floors on credit scores or ceilings on DTIs. Having worked for companies with strong credit cultures that nearly failed in the recent financial crisis, I find it hard to believe privately held mortgage bankers can manage this risk. Quite the opposite, my experience tells me: 1) lenders always underestimate their residual credit risks, and 2) pressure for volume, market share, and profits quickly overwhelms any attempts to maintain credit discipline.

The notion that lender concentration in the VA originations market somehow means that those lenders have more capital or the ability to earn more money also makes little sense. Historically, the largest source of origination revenue is the capitalized value of the MSRs created when the loan is securitized. Over the past several years Ginnie Mae MSR prices have collapsed. This severely limits the profit margins from VA loans. Trust me, the top lenders are not making it up on volume.

Effect of Military Culture on VA Loans

So, what’s left? Military culture. This I believe. And not just because it is the only reason left. As the son of a retired Army colonel and the brother of both a retired Navy captain and a retired Marine Corps lieutenant colonel, I think I understand what the military culture is.

My view of military culture isn’t that veterans are more disciplined or responsible than the rest of the American public. My view of military culture is that institutional, structural, and societal differences make the military personnel workforce different than that of the general public. How?

Job Security

  • Active duty military personnel are not subject to mass layoffs or reductions in force typical in the business world.
  • Poorly performing military personnel are typically eased out of the military rather than fired. This process of “getting passed over” spans several years, not weeks or months.
  • Most active duty military personnel have the flexibility to determine their exit strategy/retirement date, so they can defer when economic times are bad.


  • Military personnel can retire with 50% of their base pay after 20 years of service.
  • This pension is received immediately upon retirement and is indexed to inflation.

Job Skills

  • A high percentage of retired military personnel re-enter the workforce and work for governmental agencies.
  • Military personnel typically leave active duty with more marketable skills than their similarly educated peers.
  • Active duty, retired, and former military personnel have jobs/careers/professions that are in higher demand than the rest of the American population.

Military Pay

  • Active and retired military pay is transparent, public, and socialized. The only variables are rank and years in service. The pay schedule provides for automatic pay increases based on the number of years in service and rank. Every two years you will get a pay increase. If you get promoted, you will get a pay raise. You know how much your boss makes. Female service-members make the same pay as their male counterparts.
  • Military pay is indexed to inflation.
  • In addition to their base pay, all military personnel are given a tax-free monthly housing allowance which is adjusted regionally.


  • All military families, active duty and retired, receive full medical care.
  • This health insurance has nearly no deductible or out-of-pocket expenses.
  • This means veterans don’t default due to catastrophic medical emergencies or have their credit capacity impacted by unpaid medical bills. 

It is for these reasons that I believe VA borrowers default less frequently than FHA borrowers. The U.S. military is not a conscripted force, but rather an all-volunteer force. The structural programs offered by the U.S. Government provide the incentives necessary for people to remain in the military.

The Federal Government has designed a military force with low turnover and backed by an institutionalized social safety net to help recruit, retain, and reward its personnel.  Lower default rates associated with the VA loan program are just the secondary benefits of a nation trying to keep its citizens safe.

So, on this Veterans Day, remember to thank a veteran for his service. But also, remember the efforts of the Federal Government to ease the difficulties of those protecting our nation.

[1] Housing Finance Policy Center Commentary, “VA Loans Outperform FHA Loans. Why. And What Can We Learn?”, Laura Goodman, Ellen Seidman, Jun Zhu.  Urban Institute – July 16, 201

[2] VA insurance is a first-loss guaranty, like mortgage insurance (MI). If the loss is greater than the insurance payment, the mortgage servicer is responsible for the additional loss.

4 Questions to Ask When Determining Model Validation Scope

Model risk management is a necessary undertaking for which model owners must prepare on a regular basis. Model risk managers frequently struggle to strike an appropriate cost-benefit balance in determining whether a model requires validation, how frequently a model needs to be validated, and how detailed subsequent and interim model validations need to be. The extent to which a model must be validated is a decision that affects many stakeholders in terms of both time and dollars. Everyone has an interest in knowing that models are reliable, but bringing the time and expense of a full model validation to bear on every model, every year is seldom warranted. What are the circumstances under which a limited-scope validation will do and what should that validation look like? We have identified four considerations that can inform your decision on whether a full-scope model validation is necessary:

  1. What about the model has changed since the last full-scope validation?
  2. How have market conditions changed since the last validation?
  3. How mission-critical is the model?
  4. How often have manual overrides of model output been necessary?

What Constitutes a Model Validation

Comprehensive model validations consist of three main components: conceptual soundness, ongoing monitoring and benchmarking, and outcomes analysis and back-testing.[1] A comprehensive validation encompassing all these areas is usually required when a model is first put into use. Any validation that does not fully address all three of these areas is by definition a limited-scope validation. Comprehensive validations on ‘black box’ models developed and maintained by third-party vendors are therefore problematic because the mathematical code and formulas are not typically available for review (in many cases a validator can only hypothesize the cause and effect relationships between the inputs and outputs based on a reading of the model’s documentation). Ideally, regular comprehensive validations are supplemented by limited-scope validations and outcomes analyses on an ongoing, interim basis to ensure that the model performs as expected.

Key Considerations for Model Validation

There is no ‘one size fits all’ question for determining how often a comprehensive validation is necessary, versus when a limited-scope review would be appropriate. Beyond the obvious time and cost considerations, model validation managers would benefit from asking themselves a minimum of four questions in making this determination:

Question 1: What about the model has changed since the last full-scope validation?

Many models layer economic assumptions on top of arithmetic equations. Most models consist of three principal components:

  1. inputs (assumptions and data)
  2. processing (underlying mathematics and code that transform inputs into estimates)
  3. output reporting (processes that translate estimates into useful information)

Changes to either of the first two components are more likely to require a comprehensive validation than changes to the third component. A change that materially impacts how the model output is computed, either by changing the inputs that drive the calculation or by changing the calculations themselves, is more likely to merit a more comprehensive review than a change that merely affects how the model’s outputs are interpreted.

For example, say I have a model that assigns a credit rating to a bank’s counterparties on a 100-point scale. The requirements the bank establishes for the counterparty are driven by how the model rates the counterparty. Say, for example, that the bank lends to counterparties that score between 90 and 100 with no restrictions, between 80 and 89 with pledged collateral, between 70 and 79 with delivered collateral, and does not lend to counterparties scoring below a 70. Consider two possible changes to the model:

  1. Changes in model calculations that result in what used to be a 65 now being a 79.
  2. Changes in grading scale that result in a counterparty that receives a rating of 65 now being deemed creditworthy.

While the second change impacts the interpretation of model output and may require only a limited-scope validation to determine whether the amended grading scale is defensible, the first change is almost certain to require that the validator go deeper ‘under the hood’ for verification that the model is working as intended. Assuming that the inputs did not change, the first type of change may be the result of changes to assumptions (e.g., weighting schemes) or simply a correction of a perceived calculation error. The second is a change to the reporting component, where a comparison of the model’s forecasts to those of challenger models and back-testing with historical data may be sufficient for validation. Not every change that affects model outputs necessarily requires a full-scope validation. The insertion of recently updated economic forecasts into a recently validated model may require only a limited set of tests to demonstrate that changes in the model estimates are consistent with the new economic forecast inputs. The magnitude of the impact on output also matters. Altering several input parameters in a way that materially changes model output is more likely to require a full validation.
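The distinction between the two changes can be made concrete with a small sketch of the grading scale from the example. The band floors of 90, 80, and 70 come from the example above; the function name and the amended floor of 65 are hypothetical, since the example says only that a 65 becomes creditworthy under the new scale.

```python
def lending_requirement(score, scale):
    """Return the requirement for the first band whose floor the score meets.
    `scale` is a list of (floor, requirement) pairs sorted from high to low."""
    for floor, requirement in scale:
        if score >= floor:
            return requirement
    return "do not lend"


ORIGINAL_SCALE = [
    (90, "no restrictions"),
    (80, "pledged collateral"),
    (70, "delivered collateral"),
]

# Change 1 (processing): the calculation itself now produces 79 instead of 65,
# while the grading scale is untouched.
change_1 = lending_requirement(79, ORIGINAL_SCALE)

# Change 2 (reporting): the score is still 65, but a hypothetical amended scale
# lowers the lowest lending floor so that a 65 is deemed creditworthy.
AMENDED_SCALE = [
    (90, "no restrictions"),
    (80, "pledged collateral"),
    (65, "delivered collateral"),
]
change_2 = lending_requirement(65, AMENDED_SCALE)
```

Both changes lead to the same lending decision for this counterparty, but only the first alters how the score is computed, which is why it calls for the deeper review.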

Question 2: How have market conditions changed since the last validation?

Even models that do not change at all require periodic, full-scope validations because macroeconomic conditions or other external factors call one or more of the model’s underlying assumptions into question. The 2008 global financial crisis is a perfect example. Mortgage credit and prepayment models prior to 2008 were built on assumptions that appeared reasonable and plausible based on market observations prior to 2008. Statistical models based solely on historical data before, during, or after the crisis are likely to require full-scope validations as their underlying datasets are expanded to capture a more comprehensive array of observed economic scenarios. It doesn’t always have to be bad news in the economy to instigate model changes that require full-scope validations. The federal funds rate has been hovering near zero since the end of 2008. With a period of gradual and sustained recovery potentially on the horizon, many models are beginning to incorporate rising interest rates into their current forecasts. These foreseeable model adjustments will likely require more comprehensive validations geared toward verifying that model outputs are appropriately sensitive to the revised interest rate assumptions.

Question 3: How mission-critical is the model?

The more vital the model’s outputs are to financial statements or mission-critical business decisions, the greater the need for frequent and detailed third-party validations. Model risk is amplified when the model outputs inform reports that are provided to investors, regulators, or compliance authorities. Particular care should be given when deciding whether to partially validate models with such high-stake outputs. Models whose outputs are used for internal strategic planning are also important. That being said, some models are more critical to a bank’s long-term success than others. Ensuring the accuracy of the risk algorithms used for DFAST stress testing is more imperative than the accuracy of a model that predicts wait times in a customer service queue. Consequently, DFAST models, regardless of their complexity, are likely to require more frequent full-scope validations than models whose results likely undergo less scrutiny.

Question 4: How often have manual overrides of model output been necessary?

Another issue to consider revolves around the use of manual overrides to the model’s output. In cases where expert opinion is permitted to supersede the model outputs on a regular basis, more frequent full-scope validations may be necessary in order to determine whether the model is performing as intended. Counterparty credit scoring models, cited in our earlier example, are frequently subjected to manual overrides by human underwriters to account for new or other qualitative information that cannot be processed by the model. The decision of whether it is necessary to revise or re-estimate a model is frequently a function of how often such overrides are required and what the magnitude of these overrides tends to be. Models that frequently have their outputs overridden should be subjected to more frequent full-scope validations. And models that are revised as a result of numerous overrides should also likely be fully validated, particularly when the revision includes significant changes to input variables and their respective weightings.
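Tracking how often overrides occur and how large they are can be as simple as the sketch below. The function name and sample scores are hypothetical, and any threshold for triggering a full-scope validation would need to be set by policy.

```python
def override_summary(model_scores, final_scores):
    """Share of decisions that were overridden and the average absolute size
    of those overrides."""
    diffs = [f - m for m, f in zip(model_scores, final_scores) if f != m]
    rate = len(diffs) / len(model_scores)
    mean_abs = sum(abs(d) for d in diffs) / len(diffs) if diffs else 0.0
    return {"override_rate": rate, "mean_abs_override": mean_abs}


# Hypothetical counterparty scores: model output vs. score after underwriter review.
model_scores = [72, 85, 91, 68, 77]
final_scores = [72, 80, 91, 75, 77]

summary = override_summary(model_scores, final_scores)
```

A rising override rate or growing average override size over successive periods is one signal that the model may no longer be performing as intended.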

Full or Partial Model Validation?

Model risk managers need to perform a delicate balancing act in order to ensure that an enterprise’s models are sufficiently validated while keeping to a budget and not overly burdening model owners. In many cases, limited-scope validations are the most efficient means to this end. Such validations allow for the continuous monitoring of model performance without bringing in a Ph.D. with a full team of experts to opine on a model whose conceptual approach, inputs, assumptions, and controls have not changed since its last full-scope validation. While gray areas abound and the question of full versus partial validation needs to be addressed on a case-by-case basis, the four basic considerations outlined above can inform and facilitate the decision. Incorporating these considerations into your model risk management policy will greatly simplify the decision of how detailed your next model validation needs to be. An informed decision to perform a partial model validation can ultimately save your business the time and expense required to execute a full model validation.

[1] In the United States, most model validations are governed by the following sets of guidelines: 1) OCC 2011-12 (institutions regulated by the OCC), and 2) FRB SR 11-7 (institutions regulated by the Federal Reserve). These guidelines are effectively identical to one another. Model validations at Government-sponsored enterprises, including Fannie Mae, Freddie Mac, and the Federal Home Loan Banks, are governed by FHFA Advisory Bulletin 2013-07, which, while different from the OCC and Fed guidance, shares many of the same underlying principles.

Single Family Rental Securitization Market

The Single Family Rental Market

The single family rental market has existed for decades as a thriving part of the U.S. housing market.  Investment in single family homes for rental purposes has provided many opportunities for the American “mom and pop” investors to build and maintain wealth, prepare for retirement, and hold residual cash flow producing assets.   According to the National Rental Home Council (NRHC) (“Single-Family Rental Primer”; Green Street Advisors, June 6, 2016) as of year-end 2015, the single-family rental market comprised approximately 13% (16 million detached single-family rentals) of all occupied housing and roughly 37% of the entire United States rental market.

Single-Family Rental Securitization Structure

Enter the credit crisis of 2008. Limited credit for non-prime borrowers, in combination with record-setting delinquency and foreclosure rates, prompted a significant reduction in housing prices. According to the S&P CoreLogic Case-Shiller U.S. National Home Price NSA Index, national house prices had dropped 25% (index value = 138.5) by April 2012 from their level at the index’s launch on May 18, 2006 (initial index value = 184.38).

The combination of low prices and post-crisis rental demand, along with highly restrictive mortgage credit qualifications, alerted certain investors to an opportunity. Private institutional investors, mostly private equity firms, began acquiring large quantities of distressed single family homes. According to the working paper entitled “The Emerging Economic Geography of Single-Family Rental Securitization” by the Federal Reserve Bank of San Francisco (Fields, Kohli, Schafran; January 2016) the entrance of these “large institutional investors into their new role as ‘corporate landlords’ [represented] a paradigm shift for the single-family rental market.”

Not only did they rehabilitate the homes and rent them out to non-prime borrowers, they in turn introduced these assets into the capital markets by pledging the collateral and rental receipts into publicly issued REITs as well as by issuing single-family rental (SFR) securitizations. The issuance of single family rental securitizations was a new concept utilizing an old vehicle: the creation of a bankruptcy-remote special purpose vehicle for the purpose of issuing debt backed by pledged collateral assets.

In this case, the collateral is generally a loan secured by a first priority mortgage (that was placed in an LP or LLC) backed by the pledging or sale of the underlying single family homes operated as rental properties (also normally placed in a previous LP or LLC).  Not only did this provide a strong exit strategy for investors because it allowed them to obtain immediate capital, but they were also able to increase their leveraged return on equity.

When Did Single-Family Rental Securitization Begin?

The first securitization transaction was issued in November 2013 by Invitation Homes (IH, 2013-1), a subsidiary of the Blackstone Group (BX). As of July 2016, 32 SFR transactions have been issued: 26 single-borrower and six multi-borrower. The table below provides a list of all single- and multi-borrower SFR securitization transactions rated as of July 2016.[1][2][3]

Table: SFR Securitization Transactions Rated as of July 2016

Interestingly, the inventory currently owned and securitized is only approximately 1 to 2% of the overall market. Also of particular interest is the recent consolidation of institutions active in this market and the introduction of new participants. American Homes 4 Rent (AM4R) acquired Beazer Rental Homes in July 2014 and Colony American Homes (Colony) merged with Starwood Waypoint Residential Trust (SWAY) in January 2016. Subsequent to the Colony and SWAY merger, the newly formed company issued its own SFR securitization in June 2016 of approximately 3,600 properties with a loan balance of $536 million (CSH, 2016-1). Another new entrant to the SFR securitization market was Home Partners of America (formerly Hyperion Homes, Inc.), which issued its first single-family rental securitization earlier this year (approximately $654mm, property count of 2,232).

Single-Family Rental Securitization Market Outlook

The question remains: is the SFR securitization market here to stay? On the one hand, issuance still appears to be strong; on the other, SFRs could be an efficient market's response to the market dislocation of 2008, the effects of which now appear to be fading. At a minimum, this type of securitization demonstrates the effectiveness of the capital markets in moving quickly to fill the gaps left by the bursting of the housing bubble.

[1] Source: Kroll Bond Rating Agency, Inc. (KBRA)

[2] Source:

[3] Source: Yahoo Finance

Sample Size Requirements for CECL Modeling

Part One of a Two-Part Series on CECL Data Requirements

With CECL implementation looming, many bankers are questioning whether they have enough internal loan data for CECL modeling. Ensuring your data is sufficient is a critical first step in meeting the CECL requirements, as you will need to find and obtain relevant third-party data if it isn’t. This article explains in plain English how to calculate statistically sufficient sample sizes to determine whether third-party data is required. More importantly, it shows modeling techniques that reduce the required sample size. Investing in the right modeling approach could ultimately save you the time and expense of obtaining third-party data.

CECL Data Requirements: Sample Size for a Single Homogenous Pool

Exhibit 1: Required Sample Size

Let’s first consider the sample required for a single pool of nearly identical loans. In the case of a uniform pool of loans, all with the same FICO, loan-to-value (LTV) ratio, loan age, and so on, there is a straightforward formula to calculate the sample size we need to estimate the pool’s default rate, shown in Exhibit 1. As the formula shows, the sample size depends on several variables, some of which must be estimated:

  • Materiality Threshold and Confidence Level: Suppose you have a $1 billion loan portfolio and you determine that, from a financial statement materiality standpoint, your ALLL estimate needs to be reliable to within +/- $2.5 million. Statistically, we would say that we need to be 95% confident that our loss reserve estimate is within an error margin of +/- $2.5 million of the true figure. The wider our materiality thresholds and lower our required confidence levels, the smaller the sample size we need.
  • Loss Severity: As your average loss severity increases, you need a greater sample size to achieve the same error margin and confidence level. For example, if your average loss severity is 0%, you will estimate zero losses regardless of your default rates. Theoretically, you don’t even need to perform the exercise of estimating default rates, and your required sample size is zero. On the opposite end, if your average loss severity is 100%, every dollar of defaulted balance translates into a dollar of loss, so you can least afford to misestimate default rates. Your required sample size will therefore be great.
  • Default Rates: Your preliminary estimate of default rate, based on your available sample, also affects the sample size you will require. (Of course, if you lack any internal sample, you already know you need to obtain third-party data for CECL modeling.) Holding dollar error margin constant, you need fewer loans for low default-rate populations.

Example: Suppose we have originated a pool of low-risk commercial real estate loans. We have historical observations for 500 such loans, of which 495 paid off and five defaulted, so our preliminary default rate estimate is 1%. Of the five defaults, loss severity averaged 25% of original principal balance. We deem ALLL estimate errors within 0.25% of the relevant principal balance to be immaterial. Is our internal sample of 500 loans enough for CECL modeling purposes, or do we need to obtain proxy data? Simply apply the formula from Exhibit 1 to these inputs. In this case, our internal sample of 500 loans is more than enough to give us a statistical confidence interval that is narrower than our materiality thresholds. We do not need proxy data to inform our CECL model in this case.
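The Exhibit 1 formula itself is not reproduced in the text, so the sketch below assumes it is the standard normal-approximation sample size for a binomial proportion, with the tolerable default-rate error set to the materiality threshold divided by loss severity. The function name and parameter names are ours, chosen for illustration only.

```python
import math

def required_sample_size(default_rate, loss_severity, materiality, z=1.96):
    """Loans needed so the loss estimate's confidence interval fits within materiality.

    Assumed formula (normal approximation for a binomial proportion):
        n = z^2 * p * (1 - p) / e^2
    where e, the tolerable error in the default rate, is materiality / loss severity.
    z = 1.96 corresponds to a 95% confidence level.
    """
    if loss_severity == 0:
        return 0  # zero severity means zero loss regardless of the default rate
    default_rate_margin = materiality / loss_severity
    n = z ** 2 * default_rate * (1 - default_rate) / default_rate_margin ** 2
    return math.ceil(n)

# Inputs from the commercial real estate example: 1% default rate,
# 25% loss severity, materiality of 0.25% of principal balance.
n_needed = required_sample_size(default_rate=0.01, loss_severity=0.25, materiality=0.0025)
print(n_needed, n_needed <= 500)
```

Under these assumptions the requirement comes out below the 500 loans on hand, consistent with the conclusion of the example. The sketch also behaves as the bullets above describe: the requirement rises with loss severity and falls as the materiality threshold widens.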

CECL Data Requirements: Sample Size Across an Asset Class

If we have an asset class with loans of varying credit risk characteristics, one way to determine the needed sample is just to carve up the portfolio into many buckets of loans with like-risk characteristics, determine the number of loans needed for each bucket on a standalone basis per the formula above, and then sum these amounts. The problem with this approach (assuming our concern is to avoid material ALLL errors at the asset class level) is that it will dramatically overstate the aggregate number of loans required.

A better approach, which still involves segregating the portfolio into risk buckets, is to assign varying margins of error across the buckets in a way that minimizes the aggregate sample required while maintaining a proportional portfolio mix and keeping the aggregate margin of error within the aggregate materiality threshold. A tool like Solver within Microsoft Excel can perform this optimization task with precision. The resulting error margins (as a percentage of each bucket’s default rate estimate) are much wider than they would be on a standalone basis for buckets with low default frequencies and slightly narrower for buckets with high default frequencies.

Even at its most optimized, though, the total number of loans needed to estimate the default rates of multiple like-risk buckets will skyrocket as the number of key credit risk variables increases. A superior approach to bucketing is loan-level modeling, which treats the entire asset class as one sample but estimates loan-specific default rates according to the individual risk characteristics of each loan.

Loan-Level Modeling


Suppose within a particular asset class, FICO is the only factor that affects default rates, and we segregate loans into four FICO buckets based on credit performance. (Assume for simplicity that each bucket holds an equal number of loans.) The buckets’ default rates range from 1% to 7%. As before, average loss severity is 25% and our materiality threshold is 0.25% of principal balance. With either a bucketing approach or loan-level modeling, we need a sample of about 5,000 loans total across the asset class. (We calculate the sample required for bucketing with Solver as described above and calculate the sample required for loan-level modeling with an iterative approach described below.)

Now suppose we discover that loan age is another key performance driver. We want to incorporate this into our model because an accurate ALLL minimizes earnings volatility and thereby minimizes excessive capital buffers. We create four loan age buckets, leaving us now with 4 × 4 = 16 buckets (again, assume the buckets hold equal loan counts). With four categories each of two variables, we would need around 9,000 loans for loan-level modeling but 20,000 loans for a bucketing approach, with around 1,300 in each bucket. (These are ballpark estimates that assume that your loan-level model has been properly constructed and fits the data reasonably well. Your estimates will vary somewhat with the default rates and loss severities of your available sample. Also, while this article deals with loan count sufficiency, we have noted previously that the same dataset must also cover a sufficient timespan, whether you are using loan-level modeling or bucketing.)

Finally, suppose we include a third variable, perhaps stage in the economic cycle, LTV, Debt Service Coverage Ratio, or something else.

Exhibit 2: Loan-Level Modeling Yields Greater Insight from Smaller Samples

Again assume we segregate loans into four categories based on this third variable. Now we have 4^3 = 64 equal-sized buckets. With loan-level modeling we need around 12,000 loans. With bucketing we need around 100,000 loans, an average of around 1,600 per bucket. As the graph in Exhibit 2 shows, a bucketing approach forces us to choose between less insight and an astronomical sample size requirement. As we increase the number of variables used to forecast credit losses, the sample needed for loan-level modeling increases slightly, but the sample needed for bucketing explodes. This points to loan-level modeling as the best solution because well-performing CECL models incorporate many variables. (Another benefit of loan-level credit models, one that is of particular interest to investors, is that the granular intelligence they provide can facilitate better loan screening and pricing decisions.)
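To see why the bucketing requirement grows roughly in proportion to the number of buckets, consider the following sketch. It is not the Solver setup used for the figures above. It assumes that each bucket receives loans in proportion to its portfolio weight, that each bucket's error margin follows the normal approximation for a proportion, and that the aggregate margin is the weighted sum of bucket margins multiplied by loss severity. The per-bucket default rates are hypothetical values spanning the 1% to 7% range.

```python
import math

def bucketing_sample_size(default_rates, weights, loss_severity, materiality, z=1.96):
    """Total loans needed when each bucket's default rate is estimated separately.

    Assumptions (ours, for illustration):
      - bucket i holds weights[i] * N of the N sampled loans (proportional mix),
      - bucket error margin = z * sqrt(p * (1 - p) / n_i),
      - aggregate margin = loss_severity * sum(weights[i] * bucket margin i).
    Setting the aggregate margin equal to materiality and solving for N gives
    the closed form returned below.
    """
    root_sum = sum(math.sqrt(w * p * (1 - p)) for p, w in zip(default_rates, weights))
    return math.ceil((z * loss_severity * root_sum / materiality) ** 2)

# Hypothetical default rates for four equal-sized FICO buckets.
four_buckets = [0.01, 0.03, 0.05, 0.07]
total_4 = bucketing_sample_size(four_buckets, [1 / 4] * 4, 0.25, 0.0025)

# Crossing with a second four-way variable: 16 equal buckets,
# reusing the same hypothetical rates for simplicity.
sixteen_buckets = four_buckets * 4
total_16 = bucketing_sample_size(sixteen_buckets, [1 / 16] * 16, 0.25, 0.0025)

print(total_4, total_16, total_16 / total_4)
```

With these hypothetical inputs, the four-bucket total lands in the neighborhood of the 5,000 loans cited above, and quadrupling the number of buckets quadruples the requirement, because each bucket still needs roughly the same number of loans to pin down its own default rate.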

CECL Data Requirements: Sample Size for Loan-Level Modeling

Determining the sample size needed for loan-level modeling is an iterative process based on the standard errors reported in the model output of a statistical software package. After estimating and running a model on your existing sample, convert the error margin of each default rate (1.96 × the standard error of the default rate estimate, to generate a 95% confidence interval) into an error margin of dollars lost by multiplying the default rate error margin by loss severity and the relevant principal balance. Next, sum the dollar error margins to determine whether the aggregate dollar error margin is within the materiality threshold, and adjust the sample size up or down as necessary.

The second part in our series on CECL data requirements will lay out the data fields that should be collected and preserved to support CECL modeling.
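As a rough illustration of the iterative check described in this section, the sketch below converts default-rate standard errors into dollar error margins and compares their sum with a dollar materiality threshold. The standard errors, balances, and current sample size are made-up inputs, and the rescaling step assumes standard errors shrink in proportion to one over the square root of the sample size, which is a common rule of thumb rather than something stated in the article.

```python
import math

def aggregate_dollar_error_margin(std_errors, balances, loss_severity, z=1.96):
    """Sum of z * standard error * loss severity * balance over loans or segments."""
    return sum(z * se * loss_severity * bal for se, bal in zip(std_errors, balances))

def rescaled_sample_size(current_n, aggregate_margin, materiality_dollars):
    """Next guess for the sample size, assuming standard errors scale as 1/sqrt(n)."""
    return math.ceil(current_n * (aggregate_margin / materiality_dollars) ** 2)

# Hypothetical model output: standard errors of predicted default rates for
# three segments of a $1 billion portfolio, fitted on 3,000 loans.
std_errors = [0.004, 0.008, 0.012]
balances = [400e6, 400e6, 200e6]
margin = aggregate_dollar_error_margin(std_errors, balances, loss_severity=0.25)

materiality_dollars = 2.5e6
within_threshold = margin <= materiality_dollars
next_n = rescaled_sample_size(3000, margin, materiality_dollars)
print(round(margin), within_threshold, next_n)
```

If the aggregate margin exceeds the threshold, the rescaled figure is only a starting point: refit the model on the larger sample and repeat the check, since the standard errors will not follow the rule of thumb exactly.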


Mortgage Insurance and Loss Severity: Causes and Effects of Mortgage Insurance Shortfalls

Mortgage Insurance and Loss Severity

This blog post is the first in a two-part series about Mortgage Insurance and Loss Severity. During the implementation of RiskSpan’s Credit Model, which enables users to estimate loan-level default, prepayment, and loss severity based on loan-level credit characteristics and macroeconomic forecasts, our team explored the many variables that affect loss severity. This series will highlight what our team discovered about Mortgage Insurance and loss severity, enabling banks to use this GSE data to benchmark their own MI recovery rates and help estimate their credit risk from MI shortfalls.

RiskSpan reviewed the historical performance of Mortgage Insurers providing loan loss benefits between 1999 and 2015. Our analysis centered on Borrower and Lender-Paid Mortgage Insurance (referred to collectively as MI in this post) in Freddie Mac’s Single Family Loan-Level Dataset. Similar data is available from Fannie Mae; however, we’ve limited our initial analysis to Freddie Mac because its data more clearly reports the recovery amounts coming from Mortgage Insurers.

Mortgage Insurance Benefit Options

Exhibit 1: Mortgage Insurance Percentage Option Benefit Calculation

Mortgage Insurance Benefit = Calculated Losses x MI Percent Coverage

Calculated Losses include:

  • UPB at time of default
  • Unpaid Interest
  • Other costs, such as attorney and statutory fees, taxes, insurance, and property maintenance.

Mortgage insurance protects investors against losses in the event a borrower defaults. Mortgage Insurers have many options in resolving MI claims and determining the expected benefit, the amount the insurer pays in the event of a defaulted loan. The primary claim option is the Percentage Option, where the loan loss is multiplied by the MI percentage, as shown in Exhibit 1. Freddie Mac’s dataset includes the MI percentage and several loss fields, as well as other loan characteristics necessary to calculate the loss amount for each loan.

The Mortgage Insurer will elect to use other claim options if they result in a lower claim than the Percentage Option. For example, if the Calculated Losses less the Net Proceeds from the liquidation of the property (i.e., net losses) are less than the Mortgage Insurance Benefit via the Percentage Option, the Mortgage Insurer can elect to reimburse the net losses. Mortgage Insurers can also choose to acquire the property, known as the Acquisition Option, in which the Mortgage Insurer acquires the property after paying the full amount of the Calculated Losses on the loan to the investor. There were no instances in the data of Mortgage Insurers exercising the Acquisition Option after 2006.
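The choice between the Percentage Option and reimbursing net losses can be sketched as follows. This is a simplified illustration with hypothetical field names, not the layout of Freddie Mac’s dataset, and it leaves out the Acquisition Option.

```python
def expected_mi_benefit(upb, unpaid_interest, other_costs, mi_coverage_pct, net_proceeds):
    """Expected MI benefit as the lesser of the Percentage Option and net losses.

    Calculated Losses = UPB at default + unpaid interest + other costs.
    Percentage Option = Calculated Losses * MI percent coverage.
    Net losses        = Calculated Losses - net liquidation proceeds (floored at zero).
    """
    calculated_losses = upb + unpaid_interest + other_costs
    percentage_option = calculated_losses * mi_coverage_pct
    net_losses = max(calculated_losses - net_proceeds, 0.0)
    return min(percentage_option, net_losses)

# Hypothetical loan: $225,000 of Calculated Losses with 25% MI coverage.
weak_sale = expected_mi_benefit(200_000, 10_000, 15_000, 0.25, net_proceeds=150_000)
strong_sale = expected_mi_benefit(200_000, 10_000, 15_000, 0.25, net_proceeds=190_000)
print(weak_sale, strong_sale)
```

In the first case the Percentage Option is the smaller amount and sets the benefit. In the second, the property sells for enough that net losses fall below the Percentage Option amount, so the insurer reimburses net losses instead.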

Causes of Mortgage Insurance Shortfalls

Freddie Mac’s loan-level dataset allows us to examine loans with MI that experienced default and sustained losses. We find that there are cases in which mortgages with MI coverage do not receive their expected MI benefits after liquidation. These occurrences can be explained by servicing and business factors not provided in the data, for example:

Cancellation: Mortgage Insurance may be cancelled either because of non-payment of the MI premium to the insurer or because the loan reaches a certain CLTV threshold. Per the Homeowners Protection Act of 1998, servicers automatically terminate private mortgage insurance (PMI) once the principal balance of the mortgage reaches 78% of the original value, or cancel it at the borrower’s request once the mark-to-market loan-to-value is below 80%.

Denial: Mortgage Insurers may deny a claim for multiple factors, such as

  • not filing a Notice of Default with the Mortgage Insurer within the time frame set by MI policy guidelines,
  • not submitting the claim within a timely period after the liquidation event,
  • inability to transfer title, or
  • not providing Mortgage Insurers with the necessary claim documentation, usually underwriting documents from loan origination, at the time of claim.

Rescission: Mortgage Insurers will rescind an MI claim but will refund the MI premiums to the servicer. Rescissions of claims are usually linked to the original underwriting of the loan and might be caused by multiple factors, such as

  • underwriting negligence by the lender,
  • third-party fraud, or
  • misrepresentation of the Borrower.

Curtailment: Mortgage Insurers will partially reimburse the filed claim if the expenses are outside of their MI policy scope. Examples of curtailment to MI claims include

  • excess interest, taxes, and insurance expenses beyond the coverage provisions of the Master Policy (most current MI policies do not have these restrictions),
  • non-covered expenses such as costs associated with physical damage to the property, tax penalties, etc., and
  • delays in reaching foreclosure.

Receivership: During the mortgage crisis, several of the Mortgage Insurers (for instance Triad, PMI, and RMIC) became insolvent and the state insurance regulators placed them into receivership. For the loans that were insured by Mortgage Insurers in receivership, claims are currently being partially paid (at around 50% of the expected benefit) with the unpaid benefit being deferred. This unpaid benefit runs the risk of not being paid.

These factors are evident in the data and our analysis as follows:

  • Cancellations: The Freddie Mac dataset does not provide the MI in force at the time of default, so we cannot identify cases of cancellation. These cases would show up as an instance of no MI payment.
  • Denials & Rescissions: Our analysis excludes any loans that were repurchased by the lender, which would likely exclude most instances of MI rescission and denial. In instances where the Mortgage Insurer found sufficient cause to rescind or deny, Freddie Mac would most likely find sufficient evidence for a lender repurchase as well.
  • Curtailments: The analysis includes the impact of MI curtailment.
  • Receivership: The analysis includes the impact of Mortgage Insurers going into receivership.

Shortfalls of Expected Mortgage Insurance Recoveries

In the exhibits below, we provide the calculated MI Haircut Rate by Vintage Year and by Disposition Year for the loans in our analysis. We define the MI Haircut Rate as the shortfall between our calculated expected MI proceeds and the actual MI proceeds reported in the dataset. The shortfall in MI recoveries is separated into two categories: MI Gap and No MI Payment. 

  • MI Gap represents instances where some actual MI proceeds exist, but they are less than our calculated expected amount. The shortfall in actual MI benefit could be due to either Curtailment or partial payment due to Receivership. 
  • No MI Payment represents instances where there was no MI recovery associated with loans that experienced losses and had MI at origination. No payment could be due to Rescission, Cancellation, Denial, or Receivership.
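A minimal sketch of this classification follows. The article does not state the denominator of the MI Haircut Rate, so the sketch assumes it is total expected MI proceeds. The input pairs are invented for illustration and do not come from the Freddie Mac dataset.

```python
def mi_shortfall(expected_proceeds, actual_proceeds):
    """Return (category, shortfall) for one liquidated loan with MI at origination."""
    shortfall = max(expected_proceeds - actual_proceeds, 0.0)
    if expected_proceeds <= 0 or shortfall == 0:
        category = "No Shortfall"
    elif actual_proceeds <= 0:
        category = "No MI Payment"  # no MI recovery at all
    else:
        category = "MI Gap"  # some recovery, but less than expected
    return category, shortfall

def mi_haircut_rate(loans):
    """Total shortfall divided by total expected MI proceeds (assumed denominator)."""
    expected = sum(e for e, _ in loans)
    shortfall = sum(mi_shortfall(e, a)[1] for e, a in loans)
    return shortfall / expected if expected else 0.0

# Hypothetical (expected, actual) MI proceeds for three loans.
loans = [(50_000, 50_000), (40_000, 30_000), (30_000, 0)]
categories = [mi_shortfall(e, a)[0] for e, a in loans]
rate = mi_haircut_rate(loans)
print(categories, round(rate, 4))
```

Summing the shortfall separately for loans tagged MI Gap and No MI Payment gives the two components of the haircut.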

For purposes of this analysis, the Severity Rate represented below does not include the portion of the loss outside of the MI scope. For example, in 2001, the average severity rate was 30%, but only 19% was eligible to be offset by MI. This was done in order to give a better understanding of the MI haircut’s effect on the Severity Rate.

Exhibit 1: Mortgage Insurance Haircut Rate by Vintage Years

We can observe an MI Haircut Rate averaging 19.50% for vintages 1999 to 2011, with higher haircuts for the distressed vintages 2003 to 2008 at 23.50%.

Exhibit 2: Mortgage Insurance Haircut Rate by Disposition Year

Our analysis shows that the MI Haircut Rate averaged 6.5% prior to 2008 and steadily increased to an average of 25% from 2009 through 2014. We explain this increase below.

Exhibit 3: Mortgage Insurance Haircut Rate and Expense to Delinquent UPB Percentage by Months Non-Performing

In this analysis, we observe that the MI Haircut Rate steadily increased with the number of months between when a loan was first classified as non-performing and when it liquidated. This increase can be explained by increased curtailments tied to expenses that increase over time, such as expenses associated with physical damage to the property, tax penalties, delinquent interest, insurance and taxes outside the coverage period, and excessive maintenance or attorney fees. Interest, taxes, and insurance typically constitute 85% of all loss expenses.

This analysis of mortgage insurance is an exploratory post into what causes the shortfall in MI claims and how those shortfalls can affect loss severity. RiskSpan will be addressing a series of topics related to Mortgage Insurance and loss severity. In our next post we will address how banks can use this GSE data to benchmark their own MI recovery rates and help estimate their credit risk from MI shortfalls.
