Articles Tagged with: Model Validation

Performance Testing: Benchmarking Vs. Back-Testing

When someone asks you what a model validation is, what is the first thing you think of? If you are like most, you immediately think of performance metrics: those quantitative indicators that tell you not only whether the model is working as intended, but also how it performs over time and compared to other models. Performance testing is the core of any model validation and generally consists of the following components:

  • Benchmarking
  • Back-testing
  • Sensitivity Analysis
  • Stress Testing

Sensitivity analysis and stress testing, while critical to any model validation’s performance testing, will be covered by a future article. This post will focus on the relative virtues of benchmarking versus back-testing—seeking to define what each is, when and how each should be used, and how to make the best use of the results of each.

Benchmarking

In benchmarking, the validator compares the model being validated to some other model or metric. The type of benchmark used will vary, as all model validation performance testing does, with the nature, use, and type of model being validated. Because of the performance information it provides, benchmarking should always be used in some form when a suitable benchmark can be found.

Choosing a Benchmark

Choosing what kind of benchmark to use within a model validation can sometimes be a very daunting task. Like all testing within a model validation, the kind of benchmark to use depends on the type of model being tested. Benchmarking takes many forms and may entail comparing the model’s outputs to:

  • The model’s previous version
  • An externally produced model
  • A model built by the validator
  • Other models and methodologies considered by the model developers, but not chosen
  • Industry best practice
  • Thresholds and expectations of the model’s performance

One of the most used benchmarking approaches is to compare a new model’s outputs to those of the version of the model it is replacing. It remains very common throughout the industry for models to be replaced due to a deterioration of performance, change in risk appetite, new regulatory guidance, need to capture new variables, or the availability of new sets of information. In these cases, it is important to not only document but also prove that the new model performs better and does not have the same issues that triggered the old model’s replacement.

Another common benchmarking approach compares the model’s outputs to those of an external “challenger” model (or one built by the validator) that has the same objective and uses the same data. This approach is likely to return more apt comparisons than benchmarking against older versions, which are likely to be out of date, because the challenger model is developed and updated with the same data as the champion model.
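A champion/challenger comparison of this kind can be sketched in a few lines. The example below is only illustrative: the function name `benchmark_gap`, the choice of summary statistics, and the sample probability-of-default figures are all assumptions rather than anything prescribed here.

```python
from statistics import mean


def benchmark_gap(champion, challenger):
    """Summarize how far champion outputs sit from challenger outputs.

    Both lists are assumed to cover the same observations in the same order.
    """
    if len(champion) != len(challenger):
        raise ValueError("outputs must cover the same observations")
    diffs = [c - b for c, b in zip(champion, challenger)]
    return {
        "mean_diff": mean(diffs),                       # signed, shows direction
        "mean_abs_diff": mean(abs(d) for d in diffs),   # typical size of the gap
        "max_abs_diff": max(abs(d) for d in diffs),     # worst single disagreement
    }


# Hypothetical probability-of-default outputs for four borrowers
champion_pd = [0.02, 0.05, 0.10, 0.20]
challenger_pd = [0.03, 0.05, 0.08, 0.25]
gap = benchmark_gap(champion_pd, challenger_pd)
```

What counts as an acceptable gap is a judgment for the validator and depends on the model's use.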

Another set of benchmarks comprises the other models or methodologies that the model developers reviewed as possibilities but ultimately did not use. As a best practice, model developers should always list any alternative methodologies, theories, or data that were omitted from the model’s final version. Model validators should take these alternatives and use them as benchmarks for the model being validated. Additionally, validators should always leverage their experience and understanding of current industry best practices, along with any analysis previously completed on similar models.

Model validators have multiple, distinct ways to incorporate benchmarking into their analysis. The use of the different types of benchmarking discussed here should be based on the type of model, its objective, and the validator’s best judgment. If a model cannot be reasonably benchmarked, then the validator should record why not and discuss the resulting limitations of the validation.

Back-Testing

Back-testing is used to measure model outcomes. Here, instead of measuring performance through a comparison, the validator is specifically measuring whether the model is both working as intended and accurate. Back-testing can take many forms based on the model’s objective. As with benchmarking, back-testing should be a part of every full-scope model validation to the extent possible.

What Back-Tests to Perform

As a form of outcomes analysis, back-testing provides quantitative metrics which measure the performance of a model’s forecast, the accuracy of its estimates, or its ability to rank-order risk. For instance, if a model produces forecasts for a given variable, back-testing would involve comparing the model’s forecast values against actual outcomes, thus indicating its accuracy.
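As a minimal sketch of the forecast-versus-actual comparison described above, the snippet below computes three common error measures. The function name `backtest_forecast` and the sample figures are hypothetical, and the measures shown (bias, mean absolute error, root mean squared error) are only one reasonable selection.

```python
from math import sqrt


def backtest_forecast(forecast, actual):
    """Compare forecast values against realized outcomes."""
    if len(forecast) != len(actual) or not forecast:
        raise ValueError("need matching, non-empty series")
    errors = [f - a for f, a in zip(forecast, actual)]
    n = len(errors)
    return {
        "bias": sum(errors) / n,                      # average over/under-forecast
        "mae": sum(abs(e) for e in errors) / n,       # mean absolute error
        "rmse": sqrt(sum(e * e for e in errors) / n), # penalizes large misses
    }


# Hypothetical forecasts and the outcomes later observed
forecast = [100.0, 110.0, 120.0, 130.0]
actual = [102.0, 108.0, 125.0, 129.0]
result = backtest_forecast(forecast, actual)
```

In this made-up sample the bias is negative, meaning the forecasts ran below the actual outcomes on average.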

A related function of model back-testing evaluates the ability of a given model to adequately measure risk. This risk could take any of several forms, from the probability of a given borrower to default to the likelihood of a large loss during a given trading day. To back-test a model’s ability to capture risk exposure, it is important first to collect the right data. In order to back-test a probability of default model, for example, data would need to be collected containing cases where borrowers have actually defaulted in order to test the model’s predictions.

Back-testing models that assign borrowers to various risk levels necessitates some special considerations. Back-testing these and other models that seek to rank-order risk involves looking at the model’s performance history and examining how accurately it ranks and orders risk. This can involve analyzing both Type 1 (false positive) and Type 2 (false negative) statistical errors against the true positive and true negative rates for a given model. Common statistical measures used for this type of back-testing analysis include, but are not limited to, the Kolmogorov-Smirnov (KS) statistic, the Brier score, and the Receiver Operating Characteristic (ROC) curve.
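The three measures named above can be computed directly from predicted default probabilities and observed outcomes. The following is a plain-Python sketch under stated assumptions: outcomes are coded 1 for default and 0 otherwise, the function names are illustrative, the sample data are invented, and the ROC curve is summarized by the area under it (AUC). A production validation would more likely rely on a tested statistics library.

```python
def _split(probs, outcomes):
    """Separate scores of defaulters (1) from non-defaulters (0)."""
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one default and one non-default")
    return pos, neg


def brier_score(probs, outcomes):
    """Mean squared gap between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def roc_auc(probs, outcomes):
    """Chance that a random defaulter is scored above a random non-defaulter."""
    pos, neg = _split(probs, outcomes)
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ks_statistic(probs, outcomes):
    """Largest gap between the two groups' cumulative score distributions."""
    pos, neg = _split(probs, outcomes)
    best = 0.0
    for t in sorted(set(probs)):
        cdf_pos = sum(p <= t for p in pos) / len(pos)
        cdf_neg = sum(n <= t for n in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


# Hypothetical predicted PDs and observed defaults (1 = defaulted)
probs = [0.05, 0.10, 0.20, 0.30, 0.60, 0.80]
outcomes = [0, 0, 0, 1, 0, 1]
auc = roc_auc(probs, outcomes)
ks = ks_statistic(probs, outcomes)
brier = brier_score(probs, outcomes)
```

AUC and KS speak to rank-ordering, while the Brier score also reflects how well calibrated the probabilities are, so the three are best read together.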

Benchmarking vs. Back-Testing

Back-testing measures a model’s outcome and accuracy against real-world observations, while benchmarking measures those outcomes against those of other models or metrics. Some overlap exists when the benchmarking includes comparing how well different models’ outputs back-test against real-world observations and the chosen benchmark. This overlap sometimes leads people to mistakenly conclude that model validations can rely on just one method. In reality, however, back-testing and benchmarking should ideally be performed together in order to bring their individual benefits to bear in evaluating the model’s overall performance. The decision, optimally, should not be whether to create a benchmark or to perform back-testing. Rather, the decision should be what form both benchmarking and back-testing should take.

While benchmarking and back-testing are complementary exercises that should not be viewed as mutually exclusive, they sometimes appear to produce conflicting results. What should a model validator do, for example, if the model appears to back-test well against real-world observations but does not benchmark particularly well against similar model outputs? What about a model that returns results similar to those of other benchmark models but does not back-test well? In the first scenario, the model owner can derive a measure of comfort from the knowledge that the model performs well in hindsight. But the owner also runs the very real risk of being “out on an island” if the model turns out to be wrong. The second scenario affords the comfort of company in the model’s projections. But what if the models are all wrong together?

Scenarios where benchmarking and back-testing do not produce complementary results are not common, but they do happen. In these situations, it becomes incumbent on model validators to determine whether back-testing results should trump benchmarking results (or vice-versa) or if they should simply temper one another. The course to take may be dictated by circumstances. For example, a model validator may conclude that macro-economic indicators are changing to the point that a model which back-tests favorably is not an advisable tool because it is not tuned to the expected forward-looking conditions. This could explain why a model that back-tests favorably remains a benchmarking outlier if the benchmark models are taking into account what the subject model is missing. On the other hand, there are scenarios where it is reasonable to conclude that back-testing results trump benchmarking results. After all, most firms would rather have an accurate model than one that lines up with all the others.

As seen in our discussion here, benchmarking and back-testing can sometimes produce distinct or similar metrics depending on the model being validated. While those differences or similarities can sometimes be significant, both benchmarking and back-testing provide critical complementary information about a model’s overall performance. So when approaching a model validation and determining its scope, your choice should be what form of benchmarking and back-testing needs to be done, rather than whether one needs to be performed versus the other.


4 Questions to Ask When Determining Model Validation Scope

Model risk management is a necessary undertaking for which model owners must prepare on a regular basis. Model risk managers frequently struggle to strike an appropriate cost-benefit balance in determining whether a model requires validation, how frequently a model needs to be validated, and how detailed subsequent and interim model validations need to be. The extent to which a model must be validated is a decision that affects many stakeholders in terms of both time and dollars. Everyone has an interest in knowing that models are reliable, but bringing the time and expense of a full model validation to bear on every model, every year is seldom warranted. What are the circumstances under which a limited-scope validation will do and what should that validation look like? We have identified four considerations that can inform your decision on whether a full-scope model validation is necessary:

  1. What about the model has changed since the last full-scope validation?
  2. How have market conditions changed since the last validation?
  3. How mission-critical is the model?
  4. How often have manual overrides of model output been necessary?

What Constitutes a Model Validation

Comprehensive model validations consist of three main components: conceptual soundness, ongoing monitoring and benchmarking, and outcomes analysis and back-testing.[1] A comprehensive validation encompassing all these areas is usually required when a model is first put into use. Any validation that does not fully address all three of these areas is by definition a limited-scope validation. Comprehensive validations of ‘black box’ models developed and maintained by third-party vendors are therefore problematic because the mathematical code and formulas are not typically available for review (in many cases, a validator can only hypothesize the cause-and-effect relationships between the inputs and outputs based on a reading of the model’s documentation). Ideally, regular comprehensive validations are supplemented by limited-scope validations and outcomes analyses on an ongoing, interim basis to ensure that the model performs as expected.

Key Considerations for Model Validation

There is no ‘one size fits all’ question for determining how often a comprehensive validation is necessary, versus when a limited-scope review would be appropriate. Beyond the obvious time and cost considerations, model validation managers would benefit from asking themselves a minimum of four questions in making this determination:

Question 1: What about the model has changed since the last full-scope validation?

Many models layer economic assumptions on top of arithmetic equations. Most models consist of three principal components:

  1. inputs (assumptions and data)
  2. processing (underlying mathematics and code that transform inputs into estimates)
  3. output reporting (processes that translate estimates into useful information)

Changes to either of the first two components are more likely to require a comprehensive validation than changes to the third component. A change that materially impacts how the model output is computed, either by changing the inputs that drive the calculation or by changing the calculations themselves, is more likely to merit a more comprehensive review than a change that merely affects how the model’s outputs are interpreted.

For example, say I have a model that assigns a credit rating to a bank’s counterparties on a 100-point scale. The requirements the bank establishes for the counterparty are driven by how the model rates the counterparty. Say, for example, that the bank lends to counterparties that score between 90 and 100 with no restrictions, between 80 and 89 with pledged collateral, between 70 and 79 with delivered collateral, and does not lend to counterparties scoring below a 70. Consider two possible changes to the model:

  1. Changes in model calculations that result in what used to be a 65 now being a 79.
  2. Changes in grading scale that result in a counterparty that receives a rating of 65 now being deemed creditworthy.

While the second change impacts the interpretation of model output and may require only a limited-scope validation to determine whether the amended grading scale is defensible, the first change is almost certain to require that the validator go deeper ‘under the hood’ to verify that the model is working as intended. Assuming that the inputs did not change, the first type of change may be the result of changes to assumptions (e.g., weighting schemes) or simply the correction of a perceived calculation error. The second is a change to the reporting component, where a comparison of the model’s forecasts to those of challenger models and back-testing with historical data may be sufficient for validation.

Not every change that affects model outputs necessarily requires a full-scope validation. The insertion of recently updated economic forecasts into a recently validated model may require only a limited set of tests to demonstrate that changes in the model estimates are consistent with the new economic forecast inputs. The magnitude of the impact on output also matters. Altering several input parameters in a way that results in a material change to model output is more likely to require a full validation.
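The counterparty rating example can be made concrete with a short sketch. The score bands come from the example above; the function name `lending_terms`, the `min_score` parameter, and the revised cutoff of 65 are assumptions made for illustration, since the example does not say what the amended grading scale looks like.

```python
def lending_terms(score, min_score=70):
    """Map a 0-100 counterparty score to the bank's lending requirement.

    min_score is the lowest score at which the bank will lend. The default
    of 70 follows the example in the text; any other value is hypothetical.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return "lend, no restrictions"
    if score >= 80:
        return "lend with pledged collateral"
    if score >= min_score:
        return "lend with delivered collateral"
    return "do not lend"


# Change 1: the calculation changes, so the same counterparty moves 65 -> 79
before_calc_change = lending_terms(65)
after_calc_change = lending_terms(79)

# Change 2: the score stays at 65, but the grading scale is relaxed
# (the cutoff of 65 is an assumed value for illustration)
after_scale_change = lending_terms(65, min_score=65)
```

Both changes move the same counterparty from "do not lend" to a lending band, but the first does so by altering what the model computes, while the second alters only how an unchanged score is read.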

Question 2: How have market conditions changed since the last validation?

Even models that do not change at all require periodic, full-scope validations because macroeconomic conditions or other external factors call one or more of the model’s underlying assumptions into question. The 2008 global financial crisis is a perfect example. Mortgage credit and prepayment models prior to 2008 were built on assumptions that appeared reasonable and plausible based on market observations prior to 2008. Statistical models based solely on historical data before, during, or after the crisis are likely to require full-scope validations as their underlying datasets are expanded to capture a more comprehensive array of observed economic scenarios. It doesn’t always have to be bad news in the economy to instigate model changes that require full-scope validations. The federal funds rate has been hovering near zero since the end of 2008. With a period of gradual and sustained recovery potentially on the horizon, many models are beginning to incorporate rising interest rates into their current forecasts. These foreseeable model adjustments will likely require more comprehensive validations geared toward verifying that model outputs are appropriately sensitive to the revised interest rate assumptions.

Question 3: How mission-critical is the model?

The more vital the model’s outputs are to financial statements or mission-critical business decisions, the greater the need for frequent and detailed third-party validations. Model risk is amplified when the model outputs inform reports that are provided to investors, regulators, or compliance authorities. Particular care should be given when deciding whether to partially validate models with such high-stake outputs. Models whose outputs are used for internal strategic planning are also important. That being said, some models are more critical to a bank’s long-term success than others. Ensuring the accuracy of the risk algorithms used for DFAST stress testing is more imperative than the accuracy of a model that predicts wait times in a customer service queue. Consequently, DFAST models, regardless of their complexity, are likely to require more frequent full-scope validations than models whose results likely undergo less scrutiny.

Question 4: How often have manual overrides of model output been necessary?

Another issue to consider revolves around the use of manual overrides to the model’s output. In cases where expert opinion is permitted to supersede the model outputs on a regular basis, more frequent full-scope validations may be necessary in order to determine whether the model is performing as intended. Counterparty credit scoring models, cited in our earlier example, are frequently subjected to manual overrides by human underwriters to account for new or other qualitative information that cannot be processed by the model. The decision of whether it is necessary to revise or re-estimate a model is frequently a function of how often such overrides are required and what the magnitude of these overrides tends to be. Models that frequently have their outputs overridden should be subjected to more frequent full-scope validations. And models that are revised as a result of numerous overrides should also likely be fully validated, particularly when the revision includes significant changes to input variables and their respective weightings.
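Override frequency and magnitude are straightforward to track once model outputs and final, post-override values are stored side by side. The sketch below assumes exactly that layout; the function name `override_summary` and the sample scores are hypothetical, and no threshold for "too many" overrides is implied.

```python
def override_summary(model_scores, final_scores):
    """Report how often, and by how much, model outputs were overridden."""
    if len(model_scores) != len(final_scores) or not model_scores:
        raise ValueError("need matching, non-empty score lists")
    pairs = list(zip(model_scores, final_scores))
    overridden = [(m, f) for m, f in pairs if m != f]
    mean_abs = (
        sum(abs(f - m) for m, f in overridden) / len(overridden)
        if overridden else 0.0
    )
    return {
        "override_rate": len(overridden) / len(pairs),  # share of outputs changed
        "mean_abs_override": mean_abs,                  # average size of a change
    }


# Hypothetical counterparty scores: model output vs. underwriter's final score
model_scores = [72, 85, 91, 65, 78]
final_scores = [72, 80, 91, 70, 78]
summary = override_summary(model_scores, final_scores)
```

Tracking both numbers over time shows whether overrides are becoming more common or larger, which is the pattern the paragraph above treats as a reason to revisit validation scope.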

Full or Partial Model Validation?

Model risk managers need to perform a delicate balancing act in order to ensure that an enterprise’s models are sufficiently validated while keeping to a budget and not overly burdening model owners. In many cases, limited-scope validations are the most efficient means to this end. Such validations allow for the continuous monitoring of model performance without bringing in a Ph.D. with a full team of experts to opine on a model whose conceptual approach, inputs, assumptions, and controls have not changed since its last full-scope validation. While gray areas abound and the question of full versus partial validation needs to be addressed on a case-by-case basis, the four basic considerations outlined above can inform and facilitate the decision. Incorporating these considerations into your model risk management policy will greatly simplify the decision of how detailed your next model validation needs to be. An informed decision to perform a partial model validation can ultimately save your business the time and expense required to execute a full model validation.
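One way to fold the four considerations into a policy is to record the answers in a consistent structure for each model. The sketch below is purely illustrative: the field names, the `starting_point` rule (any "yes" prompts a closer look at full scope), and the sample answers are assumptions, and the result is a prompt for discussion rather than a substitute for case-by-case judgment.

```python
from dataclasses import dataclass, fields


@dataclass
class ScopeAnswers:
    """Yes/no answers to the four scoping questions for one model."""
    material_model_change: bool      # Question 1
    market_conditions_changed: bool  # Question 2
    mission_critical: bool           # Question 3
    frequent_overrides: bool         # Question 4


def triggered_considerations(answers):
    """Return the names of the considerations answered 'yes'."""
    return [f.name for f in fields(answers) if getattr(answers, f.name)]


def starting_point(answers):
    """Assumed rule of thumb: any 'yes' warrants considering full scope."""
    if triggered_considerations(answers):
        return "consider full-scope"
    return "limited-scope may suffice"


# Hypothetical answers for two models
quiet_year = ScopeAnswers(False, False, False, False)
busy_year = ScopeAnswers(True, False, True, False)
```

Keeping the triggered considerations alongside the decision also documents why a given scope was chosen.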


[1] In the United States, most model validations are governed by the following sets of guidelines: 1) OCC 2011-12 (institutions regulated by the OCC), and 2) FRB SR 11-7 (institutions regulated by the Federal Reserve). These guidelines are effectively identical to one another. Model validations at government-sponsored enterprises, including Fannie Mae, Freddie Mac, and the Federal Home Loan Banks, are governed by Advisory Bulletin 2013-07, which, while different from the OCC and Fed guidance, shares many of the same underlying principles.


Managing Model Risk and Model Validation

Over the course of several hundred model validations we have observed a number of recurring themes and challenges that appear to be common to almost every model risk management department. At one time or another, every model risk manager will puzzle over questions around whether an application is a model, whether a full-scope validation is necessary, how to deal with challenges surrounding “black box” third-party vendor models, and how to elicit assistance from model owners. This series of blog posts aims to address these and other related questions with what we’ve learned while helping our clients think through these issues.

As model validators, we frequently find ourselves in the middle of debates between spreadsheet owners and enterprise risk managers over the question of whether a particular computing tool rises to the level of a “model.” To the uninitiated, the semantic question, “Is this spreadsheet a model?” may appear to be largely academic and inconsequential. But its ramifications are significant, and getting the answer right is of critical importance to model owners, to enterprise risk managers, and to regulators.

Part 2: Validating Vendor Models: Special Considerations

Many of the models we validate on behalf of our clients are developed and maintained by third-party vendors. These validations present a number of complexities that are less commonly encountered when validating “home-grown” models.

Notwithstanding these challenges, the OCC’s Supervisory Guidance on Model Risk Management (OCC 2011-12) specifies that “Vendor products should nevertheless be incorporated into a bank’s broader model risk management framework following the same principles as applied to in-house models, although the process may be somewhat modified.”

Part 3: Preparing for Model Validation: Ideas for Model Owners

Though not its intent, model validation can be disruptive to model owners and others seeking to carry out their day-to-day work. We have performed enough model validations over the past decade to have learned how cumbersome the process can be to business unit model owners and others we inconvenience with what at times must feel like an endless barrage of touch-point meetings, documentation requests and other questions relating to modeling inputs, outputs, and procedures.

Part 4: 4 Questions to Ask When Determining Model Validation Scope

Model risk management is a necessary undertaking for which model owners must prepare on a regular basis. Model risk managers frequently struggle to strike an appropriate cost-benefit balance in determining whether a model requires validation, how frequently a model needs to be validated, and how detailed subsequent and interim model validations need to be. The extent to which a model must be validated is a decision that affects many stakeholders in terms of both time and dollars. Everyone has an interest in knowing that models are reliable, but bringing the time and expense of a full model validation to bear on every model, every year is seldom warranted. What are the circumstances under which a limited-scope validation will do and what should that validation look like?

We have identified four considerations that can inform your decision on whether a full-scope model validation is necessary…

Part 5: Performance Testing: Benchmarking vs. Back-Testing

When someone asks you what a model validation is, what is the first thing you think of? If you are like most, you immediately think of performance metrics: those quantitative indicators that tell you not only whether the model is working as intended, but also how it performs over time and compared to other models. Performance testing is the core of any model validation and generally consists of the following components:

  • Benchmarking
  • Back-testing
  • Sensitivity Analysis
  • Stress Testing

Sensitivity analysis and stress testing, while critical to any model validation’s performance testing, will be covered by a future article. This post will focus on the relative virtues of benchmarking versus back-testing—seeking to define what each is, when and how each should be used, and how to make the best use of the results of each.

Part 6: Model Input Data Validation – How Much Is Enough?

In some respects, the OCC 2011-12/SR 11-7 mandate to verify model inputs could not be any more straightforward: “Process verification … includes verifying that internal and external data inputs continue to be accurate, complete, consistent with model purpose and design, and of the highest quality available.” From a logical perspective, this requirement is unambiguous and non-controversial. After all, the reliability of a model’s outputs cannot be any better than the quality of its inputs.


Preparing for Model Validation: Ideas for Model Owners

Though not its intent, model validation can be disruptive to model owners and others seeking to carry out their day-to-day work. We have performed enough model validations over the past decade to have learned how cumbersome the process can be to business unit model owners and others we inconvenience with what at times must feel like an endless barrage of touch-point meetings, documentation requests and other questions relating to modeling inputs, outputs, and procedures.

We recognize that the only thing these business units did to deserve this inconvenience was to devise or procure a methodology for systematically improving how something gets estimated. In some cases, the business owner of an application tagged for validation may view it simply as a calculator or other tool, and not as a “model.” And in some cases we agree with the business owner. But in every case, the system under review has been designated as a model requiring validation either by an independent risk management department within the institution or (worse) by a regulator, and so, the validation project must be completed.

As with so many things in life, when it comes to model validation preparation, an ounce of prevention goes a long way. Here are some ideas model owners might consider for making their next model validation a little less stressful.

Overall Model Documentation

Among the first questions we ask at the beginning of a model validation is whether the model has been validated before. In reality, however, we can make a fairly reliable guess about the model’s validation history simply by reading the model owner’s documentation. A comprehensive set of documentation that clearly articulates the model’s purpose, its inputs’ sources, how it works, what happens to the outputs and how the outputs are monitored is an almost sure sign that the model in question has been validated multiple times.

In contrast, it’s generally apparent that the model is being validated for the first time when our initial request for documentation yields one or more of the following:

  • An 800-page user guide from the model’s vendor, but no internally developed documentation or procedures
  • Incomplete (or absent) lists of model inputs with little or no discussion of how inputs and assumptions are obtained, verified, or used in the model
  • No discussion of the model’s limitations
  • Perfunctory monitoring procedures, such as, “The outputs are reviewed by an analyst for reasonableness”
  • Vague (or absent) descriptions of the model’s outputs and how they are used
  • Change logs with just one or two entries

No one likes to write model documentation. There never seems to be enough time to write model documentation. Compounding this challenge is the fact that model validations frequently seem to occur at the most inopportune moments for model owners. A bank’s DFAST models, for example, often undergo validation while the business owners who use them are busy preparing the bank’s DFAST submission. This is not the best time to be tweaking documentation and assembling data for validators.

Documentation would ideally be prepared during periods of lower operational stress. Model owners can accomplish this by predicting and staying in front of requests from model risk management by independently generating documentation for all their models that satisfies the following basic criteria:

  • Identifies the model’s purpose, including its business and functional requirements, and who is responsible for using and maintaining the model
  • Comprehensively lists and justifies the model’s inputs and assumptions
  • Describes the model’s overall theory and approach, i.e., how the model goes about transforming the inputs and assumptions into reliable outputs (including VBA or other computer code if the model was developed in house)
  • Lays out the developmental evidence supporting the model
  • Identifies the limitations of the model
  • Explains how the model is controlled—who can access it, who can change it, what sorts of approvals are required for different types of changes
  • Comprehensively identifies and describes the model’s outputs, how they are used, and how they are tested

Any investment of time beforehand to incorporate the items above into the model’s documentation will pay dividends when the model validation begins. Being able to simply hand this information over to the validators will likely save model owners hours of attending follow-up meetings and fielding requests. Additional suggestions for getting the model’s inputs and outputs in order follow below.

Model Inputs

All of the model’s inputs and assumptions need to be explicitly spelled out, as well as their relevance to the model, their source(s), and any processes used to determine their reliability. Simply emailing an Excel file containing the model and referring the validator to the ‘Inputs’ tab is probably going to result in more meetings, more questions, and more time siphoned out of the model owner’s workday by the validation team.

A useful approach for consolidating inputs and assumptions that might be scattered around different areas of the model involves the creation of a simple table that captures everything a validator is likely to ask about each of the model’s inputs and assumptions.

Systematically capturing all of the model’s inputs and assumptions in this way enables the validators to quickly take inventory of what needs to be tested without having to subject the model owner to a time-consuming battery of questions designed to make sure they haven’t missed anything.
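An input inventory of this kind can also be kept in a structured form that is easy to check for gaps before the validators arrive. The sketch below is one possible layout, not a required one: the column names follow the items mentioned above (relevance, source, and reliability checks), while the class name `ModelInput`, the helper `missing_fields`, and the sample entries are all hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class ModelInput:
    """One row of the input and assumption inventory."""
    name: str
    relevance: str     # why the model needs this input
    source: str        # where the value comes from
    verification: str  # how its reliability is checked


def missing_fields(entry):
    """List the columns left blank for one inventory row."""
    return [f.name for f in fields(entry) if not getattr(entry, f.name).strip()]


# Hypothetical inventory with one complete row and one incomplete row
inventory = [
    ModelInput(
        name="discount_rate",
        relevance="Discounts projected cash flows",
        source="Treasury desk rate feed",
        verification="Reconciled monthly to published rates",
    ),
    ModelInput(
        name="prepayment_speed",
        relevance="Drives cash flow timing",
        source="",
        verification="",
    ),
]

# Rows a validator would be likely to ask follow-up questions about
gaps = {e.name: missing_fields(e) for e in inventory if missing_fields(e)}
```

Running a check like this before handing over documentation surfaces the blank cells that would otherwise turn into follow-up meetings.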

Model Outputs

Being prepared to explain to the validator all the model’s outputs individually and how each is used in reporting and downstream applications greatly facilitates the validation process. Accounting for all the uses of every output becomes more complicated when they are used outside the business unit, including as inputs to another model. At the discretion of the institution’s model risk management group, it may be sufficient to limit this exercise only to uses within the model owner’s purview and to reports provided to management. As with inputs, this can be facilitated by a table.

Outputs that impact directly on financial statements are especially important. Model validators are likely to give these outputs particular scrutiny and model owners would do well to be prepared to explain not only how such outputs are computed and verified, but how the audit trails surrounding them are maintained, as well.

To the extent that outputs are subjected to regular benchmarking, back-testing, or sensitivity analyses, these should be gathered as well.

A Series of Small Investments

A model owner might look at these suggestions and conclude that they seem like a lot of work just to get ready for a model validation. We agree. Bear in mind, however, that the model validator is almost certain to ask for these things at some point during the validation, when, chances are, the model owner will wish she had the flexibility to do her real job. Making a series of small investments of time to assemble these items well in advance of the validator’s arrival will not only make the validation more tolerable for model owners but will likely improve the overall modeling process as well.


Vendor Model Validation

Many of the models we validate on behalf of our clients are developed and maintained by third-party vendors. These validations present a number of complexities that are less commonly encountered when validating “home-grown” models. These often include:

  1. Inability to interview the model developer
  2. Inability to review the model code
  3. Inadequate documentation
  4. Lack of developmental evidence and data sets
  5. Lack of transparency into the impact of custom settings

Notwithstanding these challenges, the OCC’s Supervisory Guidance on Model Risk Management (OCC 2011-12)[1] specifies that “Vendor products should nevertheless be incorporated into a bank’s broader model risk management framework following the same principles as applied to in-house models, although the process may be somewhat modified.”

The extent of these modifications depends on the complexity of the model and the cooperation afforded by the model’s vendor. We have found the following general principles and practices to be useful.

Model Validation for Vendor Models

Vendor Documentation is Not a Substitute for Model Documentation

Documentation provided by model vendors typically includes user guides and other materials designed to help users navigate applications and make sense of outputs. These documents are written for a diverse group of model users and are not designed to identify and address particular model capabilities specific to the purpose and portfolio of an individual bank. A bank’s model documentation package should delve into its specific implementation of the model, as well as the following:

  • Discussion of the model’s purpose and specific application, including business and functional requirements achieved by the model
  • Discussion of model theory and approach, including algorithms, calculations, formulas, functions and programming
  • Description of the model’s structure
  • Identification of model limitations and weaknesses
  • Comprehensive list of model inputs and assumptions, including their sources
  • Comprehensive list of outputs and reports and how they are used, including downstream systems that rely on them
  • Description of testing (benchmarking and back-testing)

Because documentation provided by the vendor is likely to include very few if any of these items, it falls to the model owner (at the bank) to generate this documentation. While some of these items (specific algorithms, calculations, formulas, and programming, for example) are likely to be deemed proprietary and will not be disclosed by the vendor, most of these components are obtainable and should be requested and documented.

Model documentation should also clearly lay out all model settings (e.g., knobs) and justification for the use of (or departure from) vendor default settings.
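One way to keep this part of the documentation honest is to hold the settings in a structured list and check it for gaps. The following is a minimal sketch only. The `Setting` fields, the setting names, and the values are hypothetical and do not come from any vendor's product.

```python
from dataclasses import dataclass


@dataclass
class Setting:
    name: str
    vendor_default: float
    bank_value: float
    justification: str = ""  # free-text rationale recorded by the model owner


def missing_justifications(settings):
    """List settings with no recorded rationale, noting whether each one
    keeps the vendor default or overrides it."""
    gaps = []
    for s in settings:
        if not s.justification.strip():
            kind = "default" if s.bank_value == s.vendor_default else "override"
            gaps.append((s.name, kind))
    return gaps


# Hypothetical settings, for illustration only
settings = [
    Setting("prepay_multiplier", 1.0, 1.2, "Calibrated to observed portfolio speeds"),
    Setting("default_multiplier", 1.0, 0.9),
    Setting("rate_shock_bps", 100.0, 100.0),
]

for name, kind in missing_justifications(settings):
    print(f"{name}: {kind} has no documented justification")
```

A check like this flags both undocumented overrides and undocumented reliance on defaults, since the documentation is expected to justify either choice.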

Model Validation Testing Results Should Be Requested of the Vendor

OCC 2011-12 states that “Banks should expect vendors to conduct ongoing performance monitoring and outcomes analysis, with disclosure to their clients, and to make appropriate modifications and updates over time.” Many vendors publish the results of their own internal testing of the model. For example, a prepayment model vendor is likely to include back-testing results of the model’s forecasts for certain loan cohorts against actual, observed prepayments. An automated valuation model (AVM) vendor might publish the results of testing comparing the property values it generates against sales data. If a model’s vendor does not publish this information, model validators should request it and document the response in the model validation report. Where available, this information should be obtained and incorporated into the model validation process, along with a discussion of its applicability to data the bank is modeling. Model validators should attempt to replicate the results of these studies, where feasible, and use them to enhance their own independent benchmarking and back-testing activities.
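When replicating a vendor's back-test, the core calculation is usually a comparison of forecasts with observed outcomes for the same periods. The sketch below assumes two simple summary measures, mean error (bias) and mean absolute error. A given vendor may report different metrics, and the prepayment speeds shown are invented for illustration.

```python
def backtest(forecast, actual):
    """Return (bias, mean absolute error) for paired forecast/actual values."""
    if len(forecast) != len(actual) or not forecast:
        raise ValueError("forecast and actual must be non-empty and of equal length")
    errors = [f - a for f, a in zip(forecast, actual)]
    bias = sum(errors) / len(errors)                  # systematic over- or under-forecasting
    mae = sum(abs(e) for e in errors) / len(errors)   # typical size of a miss
    return bias, mae


# Hypothetical monthly CPR (%) for a single loan cohort
forecast_cpr = [10.0, 12.0, 11.0, 9.0]
actual_cpr = [9.0, 13.0, 12.0, 8.0]

bias, mae = backtest(forecast_cpr, actual_cpr)
print(f"bias={bias:.2f}, mae={mae:.2f}")
```

Reporting both measures matters because offsetting misses can produce a bias near zero even when individual forecasts are consistently off. Running the same calculation on the bank's own cohorts helps address whether the vendor's published results apply to the data the bank is modeling.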

Developmental Evidence Should Be Requested of the Vendor

OCC 2011-12 directs banks to “require the vendor to provide developmental evidence explaining the product components, design, and intended use.” This should be incorporated into the bank’s model documentation. Where feasible, model validators should also ask model vendors to provide information about data sets that were used to develop and test the model.

Contingency Plans Should Be Maintained

OCC 2011-12 cites the importance of a bank’s having “as much knowledge in-house as possible, in case the vendor or the bank terminates the contract for any reason, or if the vendor is no longer in business. Banks should have contingency plans for instances when the vendor model is no longer available or cannot be supported by the vendor.” For simple applications whose inner workings are well understood and replicable, a contingency plan may be as simple as Microsoft Excel. This requirement can pose a significant challenge, however, for banks that purchase off-the-shelf asset-liability and market risk models and do not have the in-house expertise to quickly and adequately replicate these models’ complex computations. Situations such as this argue for the implementation of reliable challenger models, which not only assist in meeting benchmarking requirements but can also function as a contingency plan backup.

Consult the Model Risk Management Group During the Process of Procuring Any Application That Might Possibly Be Classified as a “Model”

In a perfect world, model validation considerations would be contemplated as part of the procurement process. An agreement to provide developmental evidence, testing results, and cooperation with future model validation efforts would ideally figure into the negotiations before the purchase of any application is finalized. Unfortunately, our experience has shown that banks often acquire what they think of as a simple third-party application, only to be informed after the fact, by either a regulator or the model risk management group, that they have in fact purchased a model requiring validation. A model vendor, particularly one not inclined to think of its product as a “model,” may not always be as responsive to requests for development and testing data after sale if those items have not been requested as a condition for the sale. It is, therefore, a prudent practice for procurement departments to have open lines of communication with model risk management groups so that the right questions can be asked and requirements established prior to application acquisition.


[1] See also: Federal Reserve Board of Governors Guidance on Model Risk Management (SR 11-7)


Model Validation: Is This Spreadsheet a Model?

As model validators, we frequently find ourselves in the middle of debates between spreadsheet owners and enterprise risk managers over the question of whether a particular computing tool rises to the level of a “model.” To the uninitiated, the semantic question, “Is this spreadsheet a model?” may appear to be largely academic and inconsequential. But its ramifications are significant, and getting the answer right is of critical importance to model owners, to enterprise risk managers, and to regulators.

Stakeholders of Model Validation

In the most important respects, the incentives of these stakeholder groups are aligned. Everybody has an interest in knowing that the spreadsheet in question is functioning as it should and producing accurate and meaningful outputs. Appropriate steps should be taken to ensure that every computing tool does this, regardless of whether it is ultimately deemed a model. But classifying something as a model carries with it important consequences related to cost and productivity, as well as overall model risk management.

It is here that incentives begin to diverge. Owners and users of spreadsheets, in particular, are generally inclined to classify them as simple applications or end-user computing (EUC) tools whose reliability can (and ought to) be ascertained using testing measures that do not rise to the level of formal model validation procedures required by regulators.[1] These formal procedures can be both expensive for the institution and onerous for the model owner. Models require meticulous documentation of their approach, economic and financial theory, and code. Painstaking statistical analysis is frequently required to generate the necessary developmental evidence, and further cost is then incurred to validate all of it.

Enterprise risk managers and regulators, who do not necessarily feel these added costs and burdens, may be inclined to err on the side of classifying spreadsheets as models “just to be on the safe side.” But incurring unnecessary costs is not a prudent course of action for a financial institution (or any institution). And producing more model validation reports than is needful can have other unintended, negative consequences. Model validations pull model owners away from their everyday work, adversely affecting productivity and, sometimes, quality of work. Virtually every model validation report identifies issues that must be reviewed and addressed by management. Too many unnecessary reports containing findings that are comparatively unimportant can bury enterprise risk managers and distract them from the most urgent findings.

Definition of a Model

So what, then, are the most important considerations in determining which spreadsheets are in fact models that should be subject to formal validation procedures? OCC and FRB guidance on model risk management defines a model as follows:[2]

A quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.

The same guidance refers to models as having three components:

  1. An information input component, which delivers assumptions and data to the model
  2. A processing component, which transforms inputs into estimates
  3. A reporting component, which translates the estimates into useful business information

This definition and guidance leave managers with some latitude. Financial institutions employ many applications that apply mathematical concepts to defined inputs in order to generate outputs. But the existence of inputs, outputs, and mathematical concepts alone does not necessarily justify classifying a spreadsheet as a model.

Note that the regulatory definition of a model includes the concept of quantitative estimates. The term quantitative estimate implies a level of uncertainty about the outputs. If an application is generating outputs about which there is little or no uncertainty, then one can argue the output is not a quantitative estimate but, rather, simply a defined arithmetic result. While quantitative estimates typically result from arithmetic processes, not every defined arithmetic result is a quantitative estimate.

For example, a spreadsheet that sums the known balances of ten bank accounts as of a given date likely would not rise to the level of a model requiring validation, even if it is supplied by automated feeds and performs the summation in a completely lights-out process. It is performing a simple arithmetic function, not generating a quantitative estimate.[3]

In contrast, a spreadsheet that projects what the sum of the same ten bank balances will be as of a given future date (based on assumptions about interest rates, expected deposits, and decay rates, for example) generates quantitative estimates and would, therefore, qualify as a model requiring validation. Management and regulators would want to have comfort that the assumptions used by this spreadsheet model are reasonable and that they are being applied and computed appropriately.
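The contrast between the two spreadsheets can be sketched in a few lines. The function names, the monthly compounding convention, and every assumption value below are hypothetical and chosen only to illustrate the distinction.

```python
def total_balance(balances):
    """Defined arithmetic result: there is only one right answer."""
    return sum(balances)


def projected_balance(balances, annual_rate, monthly_deposits, monthly_decay, months):
    """Quantitative estimate: the answer depends on the assumptions supplied."""
    total = sum(balances)
    for _ in range(months):
        # Apply assumed runoff, then assumed interest, then assumed new deposits
        total = total * (1 - monthly_decay) * (1 + annual_rate / 12) + monthly_deposits
    return total


balances = [100.0] * 10  # ten hypothetical account balances

today = total_balance(balances)

# Two equally defensible sets of assumptions give two different projections
cautious = projected_balance(balances, annual_rate=0.01, monthly_deposits=5.0,
                             monthly_decay=0.02, months=12)
optimistic = projected_balance(balances, annual_rate=0.03, monthly_deposits=20.0,
                               monthly_decay=0.005, months=12)

print(today, round(cautious, 2), round(optimistic, 2))
```

The first function needs only a check that the arithmetic is right. The second needs someone to challenge the rate, deposit, and decay assumptions, which is what a validation is for.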

Is this Spreadsheet a Model?

We have found the following questions to be particularly enlightening in helping our clients determine whether a spreadsheet should be classified as 1) a model that transforms inputs into quantitative estimates or 2) a non-model spreadsheet that generates defined arithmetic results.

Question 1: Does the Spreadsheet Produce a Demonstrably “Right” Answer?

A related question is whether benchmarking yields results that are comparable, as opposed to exactly the same. If spreadsheets designed by ten different people can reasonably be expected to produce precisely the same result (because there is only one generally accepted way of calculating it), then the result probably does not qualify as a quantitative estimate and the spreadsheet probably should not be classified as a model.

Example 1 (Non-Model): Mortgage Amortization Calculator: Ten different applications would be expected to transform the same loan amount, interest rate, and term information into precisely the same amortization table. A spreadsheet that differed from this expectation would be considered “wrong.” We would not consider this output to be a quantitative estimate and would be inclined to classify such a spreadsheet as something other than a model.

Example 2 (Model): Spreadsheet projecting the expected unpaid principal balance (UPB) of a mortgage portfolio in 12 months: Such a spreadsheet would likely need to apply and incorporate prepayment and default assumptions. Different spreadsheets could compute and apply these assumptions differently, without any one of them necessarily being recognized as “wrong.” We would consider the resulting UPB projections to be quantitative estimates and would be likely to classify such a spreadsheet as a model.
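The two examples above can be put side by side in code. This is a simplified sketch, not a production prepayment or default model. It assumes a level-payment, fixed-rate loan, constant annual prepayment (CPR) and default (CDR) rates, and treats both as independent monthly decrements applied to the scheduled balance. All loan terms and rates are hypothetical.

```python
def scheduled_balance(principal, annual_rate, term_months, months_elapsed):
    """Remaining balance of a level-payment loan after a number of payments.
    Any correct calculator should return the same figure."""
    r = annual_rate / 12
    if r == 0:
        return principal * (1 - months_elapsed / term_months)
    growth = (1 + r) ** term_months
    return principal * (growth - (1 + r) ** months_elapsed) / (growth - 1)


def projected_upb(principal, annual_rate, term_months, months, cpr, cdr):
    """Scheduled balance scaled by the assumed share of balance that survives
    prepayment and default. The result moves with the assumptions."""
    smm = 1 - (1 - cpr) ** (1 / 12)  # monthly prepayment rate implied by annual CPR
    mdr = 1 - (1 - cdr) ** (1 / 12)  # monthly default rate implied by annual CDR
    survival = ((1 - smm) * (1 - mdr)) ** months
    return scheduled_balance(principal, annual_rate, term_months, months) * survival


# Hypothetical 30-year loan at 6%, viewed 12 months out
sched = scheduled_balance(200_000, 0.06, 360, 12)
est_a = projected_upb(200_000, 0.06, 360, 12, cpr=0.08, cdr=0.01)
est_b = projected_upb(200_000, 0.06, 360, 12, cpr=0.15, cdr=0.02)

print(round(sched, 2), round(est_a, 2), round(est_b, 2))
```

`scheduled_balance` has one correct output for a given loan. `projected_upb` returns a different figure for each choice of CPR and CDR, and none of those figures can be called right or wrong until the 12 months have passed.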

Note that the spreadsheets in both examples tell their users what a loan balance will be in the future. But only the second example layers economic assumptions on top of its basic arithmetic calculations. Economic assumptions can be subjected to verification after the fact, which relates to our second question:

Question 2: Can the Spreadsheet’s Output Be Back-Tested?

Another way of stating this question would be, “Is back-testing required to gauge the accuracy of the spreadsheet’s outputs?” This is a fairly unmistakable indicator of a forward-looking quantitative estimate. A spreadsheet that generates forward-looking estimates is almost certainly a model and should be subjected to formal model validation.

Back-testing would not be of any particular value in our first (non-model) example, above, as the spreadsheet is simply calculating a schedule. In our second (model) example, however, back-testing would be an invaluable tool for judging the reliability of the prepayment and default assumptions driving the balance projection.

Question 3: Is the Spreadsheet Simply Applying a Defined Set of Business Rules?

Spreadsheets are sometimes used to automate the application of defined business rules in order to arrive at a prescribed course of action. This question is a corollary to the first question about whether the spreadsheet produces output that is, by definition, “correct.”

Examples of business-rule calculators are spreadsheets that determine a borrower’s eligibility for a particular loan product or loss mitigation program. Such spreadsheets are also used to determine how much of a haircut to apply to various collateral types based on defined rules.

These spreadsheets do not generate quantitative estimates and we would not consider them models subject to formal regulatory validation.
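A business-rule calculator of this kind reduces to a lookup against a defined table. In the sketch below, the collateral types and haircut percentages are hypothetical placeholders rather than any institution's actual policy.

```python
# Hypothetical rule table: collateral type -> prescribed haircut
HAIRCUTS = {
    "treasury": 0.02,
    "agency_mbs": 0.05,
    "corporate_bond": 0.10,
}


def collateral_value(collateral_type, market_value):
    """Apply the prescribed haircut. Types with no rule are rejected, not guessed."""
    if collateral_type not in HAIRCUTS:
        raise KeyError(f"no haircut rule defined for {collateral_type!r}")
    return market_value * (1 - HAIRCUTS[collateral_type])


print(collateral_value("corporate_bond", 1_000.0))
```

Given the rule table, the output can be checked before the fact against the policy document. Nothing about it needs to be observed later to know whether it was correct.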

Should I Validate This Spreadsheet?

All spreadsheets that perform calculations should be subject to review. Any spreadsheet that produces incorrect or otherwise unreliable outputs should not be used until its errors are corrected. Formal model validation procedures, however, should be reserved for spreadsheets that meet certain criteria. Subjecting non-model spreadsheets to model validation unnecessarily drives up costs and dilutes the findings of bona fide model validations by cluttering enterprise risk management’s radar with an unwieldy number of formal issues requiring tracking and resolution.

Spreadsheets should be classified as models (and validated as such) when they produce forward-looking estimates that can be back-tested. This excludes simple calculators that do not rely on economic assumptions, as well as spreadsheets that apply business rules to produce outputs that can be definitively identified before the fact as “right” or “wrong.”
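The three questions can be recorded as a simple screening function for a spreadsheet inventory. This is an illustrative sketch only. The function name, the labels, and the "needs judgment" fallback for conflicting answers are our assumptions and are not part of any regulatory guidance.

```python
def classify_spreadsheet(has_single_right_answer, output_can_be_backtested,
                         only_applies_business_rules):
    """Screen a spreadsheet using the answers to Questions 1 through 3."""
    deterministic = has_single_right_answer or only_applies_business_rules
    if output_can_be_backtested and not deterministic:
        return "model"
    if deterministic and not output_can_be_backtested:
        return "non-model"
    return "needs judgment"  # answers conflict; escalate to model risk management


# The examples discussed above
print(classify_spreadsheet(True, False, False))   # amortization calculator
print(classify_spreadsheet(False, True, False))   # 12-month UPB projection
print(classify_spreadsheet(True, False, True))    # loan eligibility rules
```

A screen like this does not replace the judgment of the model risk management group. It only makes the basis for each classification explicit and repeatable.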

We believe that the systematic application of these principles will alleviate much of the tension between spreadsheet owners, enterprise risk managers, and regulators as they work together to identify those spreadsheets that should be subject to formal model validation.


[1] In the United States, most model validations are governed by one of the following sets of guidelines: 1) OCC 2011-12 (institutions regulated by the OCC), 2) FRB SR 11-7 (institutions regulated by the Federal Reserve) and 3) FHFA 2013-07 (Fannie Mae, Freddie Mac, and the Federal Home Loan Banks). These documents have much in common and the OCC and FRB guidelines are identical to one another.

[2] See footnote 1.

[3] Management would nevertheless want to obtain assurances that such an application was functioning correctly. This, however, can be achieved via less intrusive means than a formal model validation process, such as conventional auditing, SOX reviews, or EUC quality gates.


Reducing the Cost of Model Validation Programs

Across the financial services industry, increased oversight has led to significant increases in expenses related to assessing and monitoring risk. We see over and over that institutions are burdened with significantly higher regulatory standards but are not given commensurate financial resources. Model validation is an area where banks are incurring significant expenses to meet regulatory and internal requirements. In response to client demand, RiskSpan makes the following recommendations to institutions that are looking to maintain the quality of the model validation process while reducing the associated costs.

Model Governance Policy

The first step is devising a model governance policy that is aligned with regulations and the institution’s approach to risk management. Fundamental to the policy is the identification of the models themselves. Once the models are identified, model owners must be notified that their respective models (or tools or applications) are defined as models and, as such, are expected to adhere to the institution’s model governance policy. Model owners will need to fully understand the expectations of the model validation regulatory guidance in order to prepare their business units for successful model validation reviews.

Model Documentation

Once the policy is created and expectations are communicated to model owners, a risk ranking will need to be performed (for example, High, Medium and Low), which will shape the scope and prioritization of model validation activities. Risk managers within the organization may want to consider different standards for model documentation and the detail of model validation reports based on the risk ranking of the model. Model documentation is one area where there are significant cost-saving opportunities. Model validation is less costly when business owners have been given easy-to-follow documentation templates based on model governance policies. Template-building is an up-front activity that guides business units to produce quality documentation. Conversely, a lack of proper model documentation exposes the business to risk and drains financial resources. When there is limited communication of the model’s capabilities, purpose and limitations, the workload of a model owner ends up being transferred to a model validation team as the validators attempt to gain a basic understanding of the model. This can be costly and consume valuable resources.

Validation Scheduling

Scheduling of the actual validation itself should take place after model documentation is complete. In fact, the price of admission into a model validation program should be a robust set of model documents. Risk managers who coordinate the validation must be sensitive to the business cycles of model owners, the availability of validators, and the expectations of regulators. At the same time, validation activities should not be pushed to the very end of the year.

Validation Test Plans

An additional element that can be developed early on is a validation test plan, which increases transparency (particularly when third-party validators are used) and allows for testing to be run on a periodic basis more efficiently. Test plans can be used for years after they are first developed, and may be modified to account for market changes that could impact model performance.

Buy versus Build?

Bank executives are faced with a choice: outsource model validation or maintain an internal staff to perform model validation activities. Depending on the complexity of a model and the technical expertise required to understand it, a bank may not have the necessary expertise within its internal model validation department, in which case outsourcing all or part of the validation may be necessary. Alternatively, banks that prefer to keep validation resources “in-house” may consider periodic staffing support with subject matter expertise to get through periods of high validation volume (for many banks, validations end up occurring in the second half of the year, creating a crunch at year-end).

About the Author: Pat Greene currently supports strategic and tactical initiatives by RiskSpan to enhance a suite of valuation tools that provide pricing, analytics, and risk reporting for multiple asset classes, including mortgages and structured securities. He has delivered technology solutions and provided financial model validation support to multiple RiskSpan clients whose business practices rely on credit models, interest-rate models, prepayment models, income simulation models, counter-party risk models, whole loan valuation models, and bond redemption forecasting models. Pat is an experienced executive who has been responsible for the management of a multi-billion dollar asset securitization program for a national financial institution. He has experience in the development and implementation of business unit objectives, management of a $4 million operating budget, and the oversight and monitoring of service levels with legal resources, accountants, and other financial institutions that supported an industry-leading asset sales program. He is a skilled manager experienced in the development of business strategy that leads to business process change and technology implementation. Pat is a graduate of the United States Naval Academy and received an M.B.A. from Loyola College in Baltimore, Maryland.

Download RiskSpan Insight-September 2014

