Validating Model Inputs: How Much Is Enough?
In some respects, the OCC 2011-12/SR 11-7 mandate to verify model inputs could not be more straightforward: “Process verification … includes verifying that internal and external data inputs continue to be accurate, complete, consistent with model purpose and design, and of the highest quality available.” Logically, this requirement is unambiguous and uncontroversial. After all, a model’s outputs cannot be more reliable than its inputs. In practice, however, it raises the question of how much work is needed before a particular input can be considered “verified.”
Take the example of a Housing Price Index (HPI) input assumption. Suppose the modeler obtains the HPI assumption from the bank’s finance department, which purchases it from an analytics firm. What is the model validator’s responsibility? Is it sufficient to verify that the HPI input matches the data of the finance department that supplied it? If not, is it enough to verify that the finance department’s HPI data matches the data provided by its analytics vendor? If not, is it necessary to validate the analytics firm’s model for generating HPI assumptions?
It depends. Just as model risk increases with greater model complexity, higher uncertainty about inputs and assumptions, broader use, and larger potential impact, input risk increases as inputs become more complex and more uncertain. The risk of any specific input also rises as model outputs become increasingly sensitive to it.
Validating Model Inputs Best Practices
So how much validation of model inputs is enough? As with the management of other risks, the level of validation or control should be dictated by the magnitude or impact of the risk. Like so much else in model validation, no ‘one size fits all’ approach applies to determining the appropriate level of validation of model inputs and assumptions. In addition to cost/benefit considerations, model validators should consider at least four factors for mitigating the risk of input and assumption errors leading to inaccurate outputs.
- Complexity of inputs
- Manual manipulation of inputs from source system prior to input into model
- Reliability of source system
- Relative importance of the input to the model’s outputs (i.e., sensitivity)
Consideration 1: Complexity of Inputs
The greater the complexity of the model’s inputs and assumptions, the greater the risk of errors. For example, complex yield curves with multiple data points will be inherently subject to greater risk of inaccuracy than binary inputs such as “yes” and “no.” In general, the more complex an input is, the more scrutiny it requires and the “further back” a validator should look to verify its origin and reasonability.
Consideration 2: Manual Manipulation of Inputs from Source System Prior to Input into Model
Input data often requires modification after it leaves the source system so that it can be loaded into the model. More handling and manual modifications increase the likelihood of error. For example, if a position input is manually copied from Bloomberg and then manually reformatted for upload to the model, there is a greater likelihood of error than if the position input is extracted automatically via an API. The accuracy of the input should be verified in either case, but the more manual handling and manipulation of data that occurs, the more comprehensive the testing should be. In this example, more comprehensive testing would likely take the form of a larger sample size.
In addition, the controls over the processes to extract, transform, and load data from a source system into the model will impact the risk of error. More mature and effective controls, including automation and reconciliation, will decrease the likelihood of error and therefore likely require a lighter verification procedure.
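As an illustration of the kind of reconciliation control described above, the following is a minimal Python sketch that compares values extracted from a source system against the values actually loaded into a model. The function name `reconcile`, the tolerance, and the HPI figures are all hypothetical and chosen only for illustration; a real reconciliation would depend on the data formats and systems involved.

```python
from math import isclose


def reconcile(source, model_input, rel_tol=1e-9):
    """Compare two {key: value} mappings and list every discrepancy found.

    `source` and `model_input` are hypothetical stand-ins for a source-system
    extract and the data loaded into the model.
    """
    issues = []
    for key in sorted(source.keys() | model_input.keys()):
        if key not in model_input:
            issues.append((key, "missing from model input"))
        elif key not in source:
            issues.append((key, "not present in source"))
        elif not isclose(source[key], model_input[key], rel_tol=rel_tol):
            issues.append(
                (key, f"value mismatch: {source[key]} vs {model_input[key]}")
            )
    return issues


# Made-up HPI values by quarter, for illustration only.
vendor_hpi = {"2023Q1": 301.2, "2023Q2": 305.8, "2023Q3": 309.1}
model_hpi = {"2023Q1": 301.2, "2023Q2": 350.8, "2023Q4": 311.0}

for key, problem in reconcile(vendor_hpi, model_hpi):
    print(key, "-", problem)
```

In this made-up data the check flags three items: a transposed value, a quarter dropped during loading, and a quarter that has no counterpart in the source. An automated check of this kind covers every record, whereas a manual tie-out typically covers only a sample.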
Consideration 3: Reliability of Source Systems
More mature and stable source systems generally produce more consistently reliable results. Conversely, newer systems and those that have produced erroneous results increase the risk of error. Results of previous input validations, whether from prior model validations or from third parties such as internal audit and compliance, can be used as an indicator of the reliability of information from source systems and of the magnitude of input risk. The greater the number of issues identified, the greater the risk, and the more likely it is that a validator should trace the data further back toward its original sources.
Consideration 4: Output Sensitivity to Inputs
No matter how reliable an input’s source system is deemed to be, or how much manual manipulation the input undergoes, perhaps the most important consideration is the individual input’s power to affect the model’s outputs. Returning to our original example, if a 50 percent change in the HPI assumption has only a negligible impact on the model’s outputs, then a quick verification against the report supplied by the finance department may be sufficient. If, however, the model’s outputs are extremely sensitive to even small shifts in the HPI assumption, then additional testing is likely warranted, perhaps even including a validation of the analytics vendor’s HPI model (along with all of its inputs).
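One simple way to gauge this kind of sensitivity is to bump a single input by a fixed percentage and measure the relative change in the output. The sketch below assumes the model can be called as a function of a dictionary of inputs; `toy_loss_model`, its formula, and the input values are hypothetical stand-ins, not a real credit model.

```python
def sensitivity(model, inputs, name, bump=0.01):
    """Relative change in output per unit of relative change in one input.

    Bumps `inputs[name]` by the fraction `bump` and holds all else constant.
    """
    base = model(inputs)
    bumped = dict(inputs)  # copy so the caller's inputs are not modified
    bumped[name] = inputs[name] * (1 + bump)
    return (model(bumped) - base) / base / bump


def toy_loss_model(x):
    # Hypothetical stand-in: losses scale with balance and fall as HPI rises.
    return x["balance"] * x["default_rate"] * (100.0 / x["hpi"])


base_inputs = {"balance": 1_000_000, "default_rate": 0.02, "hpi": 100.0}

for name in base_inputs:
    print(name, round(sensitivity(toy_loss_model, base_inputs, name), 4))
```

In this toy model a 1 percent increase in HPI moves the output by roughly -0.99 percent, so HPI would rank as a highly sensitive input. A value near zero would instead support a lighter verification procedure. The `bump` parameter can be set to 0.5 to reproduce the 50 percent shift mentioned above.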
A Cost-Effective Model Input Validation Strategy
When it comes to verifying model inputs, there is no theoretical limit to how far a model validator can go. Model risk managers, who do not have unlimited time or budgets, benefit from setting practical limits on validation procedures, using a risk-based approach to determine the most cost-effective way to ensure that models are sufficiently validated. Applying the considerations listed above on a case-by-case basis will help validators define and scope model input reviews in a manner commensurate with the risk each input poses.
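To make case-by-case scoping more concrete, the four considerations can be combined into a rough triage rule. The Python sketch below is purely illustrative: the 1-to-3 rating scale, the scoring formula, the thresholds, and the three review tiers are all assumptions made for this example, not a prescribed or regulatory methodology.

```python
from dataclasses import dataclass


@dataclass
class InputRisk:
    # Each factor is rated 1 (low) to 3 (high); the scale is hypothetical.
    complexity: int       # 1 = simple flag, 3 = e.g. a full yield curve
    manual_handling: int  # 1 = automated feed, 3 = manual copy and reformat
    source_issues: int    # 1 = mature system, 3 = new or error-prone system
    sensitivity: int      # 1 = negligible output impact, 3 = highly sensitive


def verification_depth(risk: InputRisk) -> str:
    # Assumed rule: output sensitivity multiplies the other three factors,
    # reflecting its role as perhaps the most important consideration.
    score = risk.sensitivity * (
        risk.complexity + risk.manual_handling + risk.source_issues
    )
    if score >= 18:  # illustrative threshold
        return "validate upstream source or vendor model"
    if score >= 8:   # illustrative threshold
        return "reconcile to original source with expanded sample"
    return "tie out to the supplied report"


print(verification_depth(InputRisk(2, 2, 1, 1)))
print(verification_depth(InputRisk(2, 2, 1, 3)))
print(verification_depth(InputRisk(3, 2, 2, 3)))
```

Any such scoring is only a starting point for judgment. Its main value is that it forces the validator to document why a given input received a lighter or heavier review.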