Validating Vendor Models

RiskSpan validates a diverse range of models, including many that have been developed by third-party vendors. Vendor models present unique challenges when it comes to model risk management (MRM). In this article, we describe how we align our approach to validating these models with existing regulatory guidance and explain what financial institutions should expect when it comes time to validate their vendor models.

Our clients use third-party vendor models that touch on virtually every risk function. The most common ones include: 

  • Anti-money laundering (AML) solutions for Suspicious Activity Monitoring (SAM) and Customer Due Diligence (CDD). 
  • Asset-Liability Management models that simulate the whole balance sheet under different interest rate scenarios to provide analytics for interest rate risk monitoring. 
  • Structured assets and mortgage loan analytics platforms (similar to RiskSpan’s Edge Platform). 
  • Mortgage pipeline management platforms, including loan pricing, best execution determination, analytics, and trade tracking. 
  • Climate risk models that quantify the risk associated with the future effects of climate change on assets at different locations. 
  • Artificial intelligence (AI) platforms that help model developers automatically optimize machine learning (ML) algorithm selection, feature selection, and hyperparameter tuning against a target performance metric. 

Vendor Models and MRM Considerations

Regardless of whether a model is fully homegrown or a “black box” purchased from a third-party vendor, the same basic MRM principles apply. Banks are expected to validate their own use of vendor products [OCC 2011-12, p. 15], and institutions should therefore understand the specific characteristics of vendor models that pose model risk and require validation considerations. The following outlines specific risks that vendor models pose, along with mitigating considerations and strategies model risk managers should consider. 

  • Complexity
    • Description: Some vendor models offer many functionalities and sub-models dedicated to different tasks. These models are often highly integrated into the client’s internal systems and databases.
    • MRM and Validation Implications: Well-crafted model documentation is important to make the validation efficient. Validation requires more time since all model functionalities and components must be mapped.
  • Specialized Expertise
    • Description: Vendor models are often developed based on accumulated know-how in a specific field of study.
    • MRM and Validation Implications: Validation requires professionals with experience in that field who understand the model in relation to industry standards.
  • Regulatory Requirements and Compliance
    • Description: Many models need to comply with existing regulations (e.g., fair lending in credit scoring) or are implemented to ensure compliance (BSA/AML and the PATRIOT Act).
    • MRM and Validation Implications: Validation requires expertise in the relevant regulatory compliance.
  • Opaque Design, Assumptions, and Limitations
    • Description: Vendors usually do not provide code for review, and some aspects of the model may be based on proprietary research or data.
    • MRM and Validation Implications: Banks should require the vendor to provide developmental evidence explaining the product components, design, and intended use, to determine whether the model is appropriate for the bank’s products, exposures, and risks. Vendors should also clearly indicate the model’s limitations and assumptions and where the product’s use may be problematic [OCC 2011-12, pp. 15-16].
  • Vague or Incomplete Documentation from the Vendor
    • Description: Often in the name of protecting IP, model documentation provided by the vendor may be vague or incomplete.
    • MRM and Validation Implications: Banks should ensure that appropriate documentation of the third-party approach is available so that the model can be appropriately validated [OCC 2011-12, p. 21]. Institutions must also develop their own internal documentation that describes the intended use of the model, lists all inputs and outputs, lists model assumptions and limitations, and summarizes all relevant information about the model provided by the vendor, such as model design and methodology.
  • Limited Model Testing
    • Description: Model testing is critical in assessing whether a model is performing as intended. However, vendors may not provide detailed results of their testing of model performance, outcomes, sensitivity, or appropriateness of assumptions, or the results of ongoing monitoring. Moreover, opportunities for testing by the client or the validator are usually limited since many parts of the model are proprietary.
    • MRM and Validation Implications: Vendors should provide appropriate testing results demonstrating that the model works as expected. Banks should expect vendors to conduct ongoing performance monitoring and outcomes analysis, and a bank should also conduct ongoing monitoring and outcomes analysis of vendor model performance using the bank’s own outcomes [OCC 2011-12, pp. 15-16]. Validation should consist of a review of the testing results provided by the vendor and of any additional testing that is feasible and practical. This usually includes outcomes analysis and benchmarking, and sometimes manual replication, sensitivity analysis, or stress testing. Benchmarking may, however, be limited due to the uniqueness or complexity of the model, or because proprietary data were used for development.
  • Customization
    • Description: Out-of-the-box solutions often need to be customized to fit the internal systems, policies, and specific intended use of a particular institution.
    • MRM and Validation Implications: A bank’s customization choices should be documented and justified as part of the validation [OCC 2011-12, p. 15].
  • External Data
    • Description: Vendor models often rely on external input data or on external data used for their development.
    • MRM and Validation Implications: An important part of any validation is to determine all input data sources and assess the quality, completeness, and appropriateness of the data. OCC 2011-12, p. 16, states that banks should obtain information regarding the data used to develop the model and assess the extent to which that data is representative of the bank’s situation. OCC 2011-12, p. 6, stresses that a rigorous review is particularly important for external data and information (from a vendor or outside party), especially as they relate to new products, instruments, or activities. Moreover, AB 2022-03, p. 3, states that regulated entities should map their external dependencies to significant internal systems and processes to determine their systemic dependencies and interconnections. In particular, the regulated entities should have an inventory of key dependencies on externally sourced models, data, software, and cloud providers. This inventory should be regularly updated, reviewed by senior management, and presented to the board of directors, as deemed appropriate.
  • Reliance on the Vendor’s Support
    • Description: Since access to the code and implementation details is limited for vendor models, ongoing servicing and support from the vendor is necessary.
    • MRM and Validation Implications: Roles and responsibilities around the model should be defined, and the bank’s contact with the vendor should not rely solely on one person. It is also critical that the bank retain as much knowledge in-house as possible, in case the vendor or the bank terminates the contract for any reason, or the vendor goes out of business or otherwise ceases to support the model [OCC 2011-12, p. 16].


Validation Approach

Validation of vendor models follows the same general principles as validation of any other model. These principles are laid out in regulatory guidance, which, in addition to general MRM principles, specifically addresses model risk management for vendor and other third-party products. Based on these guidelines and our experience validating numerous vendor models, RiskSpan’s approach includes the following:

  • Request documents and access to:
    • internal model documentation,
    • vendor documentation and user manual,
    • implementation documentation with a description of any customizations to the model (see Customization point in the section above), 
    • performance testing conducted by the model owner or vendor,
    • vendor certifications,
    • the model interface, if applicable, to conduct independent testing. 
  • Documentation review: We review both the internal documentation and the vendor documentation and assess their thoroughness and completeness. According to OCC 2011-12, p. 21, documentation should be sufficiently detailed so that parties unfamiliar with a model can understand how the model operates, its limitations, and its key assumptions. For internal documentation, we focus on the statement of model purpose, the list of inputs and their sources, documentation of assumptions and limitations, the description of outputs and their use, controls and governance, and any testing conducted internally. We also review the documentation of the customizations made to the vendor model. 
  • Conceptual soundness review: Combining information from the internal and vendor documentation, information from the model owner, and the industry expertise of our subject matter experts (SMEs), we assess whether the model meets its stated purpose and whether the design, underlying theory, and logic are justifiable and supported by existing research and industry standards. We also critically assess all known model assumptions and limitations and work to identify additional assumptions that may be hidden or limitations that were not documented.  
  • Data review: We aim to identify all data inputs, their sources, and controls related to gathering, loading, and quality of data. We also assess the quality of data by performing exploratory data analysis. Assessing development data is often not possible as the data are proprietary to the vendor. 
  • Independent testing: To supplement, update, or verify the testing performed by the vendor, we perform internal testing where applicable. Different models allow different testing methods, and validators often need access to the model interface. This is acknowledged in OCC 2011-12, p. 15: “External models may not allow full access to computer coding and implementation details, so the bank may have to rely more on sensitivity analysis and benchmarking.” The following are the testing methods we often use to devise effective challenges for specific models in our practice (a minimal sensitivity-analysis sketch follows this list):
    • AML systems for transaction monitoring and customer due diligence: manual replication for a sample of customers/alerts, exploratory data analysis, outcomes analysis  
    • Asset-Liability Management models: outcomes analysis and review of reporting, sensitivity analysis and stress testing 
    • Loan pricing models: manual replication, outcomes analysis, sensitivity analysis, stress testing, benchmarking to RS Edge 
    • Climate risk models: outcomes analysis, benchmarking to open-access online services such as the National Risk Index, ClimateCheck, and Risk Factor. 
    • AI/ML platforms: outcomes analysis based on performance metrics, manual replication of the final model in Python, benchmarking against an alternative algorithm. 
  • Ongoing monitoring review: As explained in the previous section, vendors are expected to conduct ongoing monitoring of their models, but banks should monitor their own outcomes as well. Our review thus consists of an assessment of the client’s ongoing monitoring plan as well as the results of both the client’s and the vendor’s monitoring. When the validated model does not produce predictions or estimates (as is the case with many AML models), ongoing monitoring typically consists of periodic revalidations and data quality monitoring. 
  • Governance review: We review the client’s policies, roles, and responsibilities defined for the model. We also investigate whether a contingency plan is in place for instances when the vendor is no longer supporting the model. We also typically investigate and assess controls around the model’s access and use. 
  • Compliance review: If a model is implemented to make the institution compliant with certain regulations (BSA/AML, PATRIOT Act), or the model itself must comply with regulations, we conduct a compliance review with the assistance of SMEs who possess industry experience. This review verifies that the model and its implementation align with the regulatory requirements and standards set forth by the relevant authorities. The expertise of the SMEs helps ensure that the model effectively addresses compliance concerns and operates within the legal and ethical boundaries of the industry. 
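
Where the vendor exposes a scoring or scenario interface, some of this independent testing can be scripted. The sketch below is a minimal illustration of single-factor sensitivity testing; `run_vendor_model` is a hypothetical wrapper around whatever interface the vendor actually provides (API call, batch file, or UI export), and the toy model and shock size are placeholders, not a prescription.

```python
# Minimal sketch of single-factor sensitivity testing against a vendor model.
# `run_vendor_model` is a hypothetical wrapper around the vendor's interface
# (API call, batch file, or UI export) -- replace it with the real integration.
from typing import Callable, Dict

def sensitivity_analysis(run_vendor_model: Callable[[Dict[str, float]], float],
                         base_inputs: Dict[str, float],
                         shock: float = 0.10) -> Dict[str, float]:
    """Shock each input up and down by `shock` and record the output change."""
    base_output = run_vendor_model(base_inputs)
    deltas = {}
    for name, value in base_inputs.items():
        for sign in (+1, -1):
            shocked = dict(base_inputs)
            shocked[name] = value * (1 + sign * shock)
            label = f"{name} {'+' if sign > 0 else '-'}{shock:.0%}"
            deltas[label] = run_vendor_model(shocked) - base_output
    return deltas

if __name__ == "__main__":
    # Stand-in model used only to make the sketch runnable.
    toy_model = lambda x: 0.8 * x["rate"] + 0.2 * x["spread"]
    print(sensitivity_analysis(toy_model, {"rate": 0.05, "spread": 0.02}))
```

Benchmarking can follow the same pattern: run identical inputs through a challenger model and compare the outputs.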

Project Management Considerations

For a validation project to be successful, strong project management discipline must be followed to ensure the project is completed on schedule, within budget, and in a way that meets all key stakeholder objectives. In addition to adapting our validation approach, we therefore also adapt our project management approach. For vendor model validation projects, we follow these principles: 

  • Schedule periodic status meetings: We typically hold weekly meetings with the client’s MRM team to communicate the status of the validation, align the client’s expectations, discuss observations, and address any concerns. Since vendor models are often complex, these meetings also serve as a place to discuss any roadblocks, such as access to the model’s UI, shared folders, databases, etc. 
  • Schedule a model walkthrough session with the model owner: Vendor models are often complex, and the client may use only specific components or functionalities. A live (typically remote) session with the model owner has proved to be the most efficient way to understand the big picture and the particular way the model is used. Asking targeted questions at the beginning of the engagement helps us quickly grasp the critical areas to focus on during the validation. 
  • Establish a communication channel with the model owner: Whether through direct messages or emails sent to and forwarded by the client’s MRM team, it is important to stay in touch with the model owner, as not every detail may be documented. 

Conclusion

Vendor models pose unique risks and challenges for MRM and validation. Taking additional steps to mitigate these risks is vital to ensuring a well-functioning MRM program. An effective model validation approach takes these unique considerations into account and directly applies the guidance on validating vendor models outlined in SR 11-7 (OCC 2011-12). Effectively carrying out this type of testing often requires adjustments to the management of vendor model validation projects. 

References

OCC 2011-12, p.6: The data and other information used to develop a model are of critical importance; there should be rigorous assessment of data quality and relevance, and appropriate documentation. Developers should be able to demonstrate that such data and information are suitable for the model and that they are consistent with the theory behind the approach and with the chosen methodology. If data proxies are used, they should be carefully identified, justified, and documented. If data and information are not representative of the bank’s portfolio or other characteristics, or if assumptions are made to adjust the data and information, these factors should be properly tracked and analyzed so that users are aware of potential limitations. This is particularly important for external data and information (from a vendor or outside party), especially as they relate to new products, instruments, or activities. 

OCC 2011-12, p.9: All model components, including input, processing, and reporting, should be subject to validation; this applies equally to models developed in-house and to those purchased from or developed by vendors or consultants. The rigor and sophistication of validation should be commensurate with the bank’s overall use of models, the complexity and materiality of its models, and the size and complexity of the bank’s operations. 

OCC 2011-12, p.12: Many of the tests employed as part of model development should be included in ongoing monitoring and be conducted on a regular basis to incorporate additional information as it becomes available. New empirical evidence or theoretical research may suggest the need to modify or even replace original methods. Analysis of the integrity and applicability of internal and external information sources, including information provided by third-party vendors, should be performed regularly. 

OCC 2011-12 dedicates an entire section (pp. 15-16) to the validation of vendor models: 

Validation of Vendor and Other Third-Party Products  

The widespread use of vendor and other third-party products—including data, parameter values, and complete models—poses unique challenges for validation and other model risk management activities because the modeling expertise is external to the user and because some components are considered proprietary. Vendor products should nevertheless be incorporated into a bank’s broader model risk management framework following the same principles as applied to in-house models, although the process may be somewhat modified. 

As a first step, banks should ensure that there are appropriate processes in place for selecting vendor models. Banks should require the vendor to provide developmental evidence explaining the product components, design, and intended use, to determine whether the model is appropriate for the bank’s products, exposures, and risks. Vendors should provide appropriate testing results that show their product works as expected. They should also clearly indicate the model’s limitations and assumptions and where use of the product may be problematic. Banks should expect vendors to conduct ongoing performance monitoring and outcomes analysis, with disclosure to their clients, and to make appropriate modifications and updates over time. Banks are expected to validate their own use of vendor products. External models may not allow full access to computer coding and implementation details, so the bank may have to rely more on sensitivity analysis and benchmarking. Vendor models are often designed to provide a range of capabilities and so may need to be customized by a bank for its particular circumstances. A bank’s customization choices should be documented and justified as part of validation. If vendors provide input data or assumptions, or use them to build models, their relevance to the bank’s situation should be investigated. Banks should obtain information regarding the data used to develop the model and assess the extent to which that data is representative of the bank’s situation. The bank also should conduct ongoing monitoring and outcomes analysis of vendor model performance using the bank’s own outcomes. Systematic procedures for validation help the bank to understand the vendor product and its capabilities, applicability, and limitations. Such detailed knowledge is necessary for basic controls of bank operations. It is also very important for the bank to have as much knowledge in-house as possible, in case the vendor or the bank terminates the contract for any reason, or the vendor is no longer in business. Banks should have contingency plans for instances when the vendor model is no longer available or cannot be supported by the vendor. 

OCC 2011-12, p.17: Policies should emphasize testing and analysis, and promote the development of targets for model accuracy, standards for acceptable levels of discrepancies, and procedures for review of and response to unacceptable discrepancies. They should include a description of the processes used to select and retain vendor models, including the people who should be involved in such decisions. 

OCC 2011-12, p.21, Documentation: For cases in which a bank uses models from a vendor or other third party, it should ensure that appropriate documentation of the third-party approach is available so that the model can be appropriately validated. 

AB 2022-03, p.3: Since the publication of AB 2013-07, FHFA has observed a wider adoption of technologies in the mortgage industry. Many of these technologies reside externally to the regulated entities and are largely outside of the regulated entities’ control. Examples of such technologies are cloud servers, vendor models, and external data used by the regulated entities as inputs for their models. Although FHFA has published guidance related to externally sourced technologies, such as AB 2018-04: Cloud Computing Risk Management (Aug. 14, 2018) and AB 2018-08: Oversight of Third-Party Provider Relationships (Sept. 28, 2018), FHFA expects the regulated entities to take a macro-prudential view of the risks posed by externally sourced data and technologies. The regulated entities should map their external dependencies to significant internal systems and processes to determine their systemic dependencies and interconnections. In particular, the regulated entities should have an inventory of key dependencies on externally sourced models, data, software, and cloud providers. This inventory should be regularly updated and reviewed by senior management and presented to the board of directors, as deemed appropriate. 

AB 2022-03, p.5: When using an external vendor to complete an independent model validation, the regulated entity’s model validation group is accountable for the quality, recommendations, and opinions of any third-party review. When evaluating a third-party model validation, a regulated entity should implement model risk management policies and practices that align the vendor-completed specific standards for an independent validation with the specific standards included in AB 2013-07. 


“Reject Inference” Methods in Credit Modeling: What are the Challenges?

Reject inference is a popular concept that has been used in credit modeling for decades. Yet we observe in our work validating credit models that the concept is still evolving. The appeal of reject inference, which aims to develop a credit scoring model using all available data, including that of rejected applicants, is easy enough to grasp. But the technique also introduces a number of fairly vexing challenges.

The technique seeks to rectify a fundamental shortcoming in traditional credit modeling: models predicting the probability that a loan applicant will repay can be trained on historical loan application data with a binary variable representing whether a loan was repaid or charged off. This information, however, is only available for accepted applications, and many of those applications are not particularly recent. This limitation results in a training dataset that may not be representative of the broader loan application universe.

Credit modelers have devised several techniques for getting around this representativeness problem and increasing the number of observations by inferring the repayment status of rejected loan applications. These techniques, while well intentioned, are often applied empirically and lack a deeper theoretical basis. They often rest on “hidden” modeling assumptions whose reasonableness is not fully investigated. Additionally, no theoretical properties of the coefficient estimates or predictions are guaranteed.

This article summarizes the main challenges of reject inference that we have encountered in our model validation practice.

Selecting the Right Reject Inference Method

Many approaches exist for reject inference, none of which is clearly and universally superior to the others. Empirical studies have been conducted to compare methods and pick a winner, but their conclusions are often contradictory. Some authors argue that reject inference cannot improve scorecard models [1] and flatly recommend against its use. Others posit that certain techniques can outperform others [2] based on empirical experiments. The results of these experiments, however, tend to be data dependent. Some of the most popular approaches include the following:

  • Ignoring rejected applications: The simplest approach is to develop a credit scoring model based only on accepted applications. The underlying assumption is that rejected applications can be ignored and that the “missingness” of this data from the training dataset can be classified as missing at random. Supporters of this method point to the simplicity of the implementation, clear assumptions, and good empirical results. Others argue that the rejected applications cannot be dismissed simply as random missing data and thus should not be ignored.
  • Hard cut-off method: In this method, a model is first trained using only accepted application data. This trained model is then used to predict the probabilities of charge-off for the rejected applications. A cut-off value is then chosen: hypothetical loans from rejected applications with predicted probabilities above the cut-off are treated as charged off, and the remaining rejected applications are treated as repaid. The specified model is then re-trained using a dataset that includes both accepted and rejected applications (a minimal sketch of this method follows this list).
  • Fuzzy augmentation: Similar to the hard cut-off method, fuzzy augmentation begins by training the model on accepted applications only. The resulting model with estimated coefficients is then used to predict charge-off probabilities for rejected applications. Data from each rejected application is then duplicated, with one copy assigned a repaid status and the other a charged-off status. The specified model is then retrained on the augmented dataset—including accepted applications and the duplicated rejects. Each rejected record is weighted by either a) the predicted probability of charge-off if its assigned status is “charged-off,” or b) the predicted probability of repayment if its assigned status is “repaid.”
  • Parceling: The parceling method resembles the hard cut-off method. However, rather than classifying all rejects above a certain threshold as charged off, it assigns repayment status in proportion to the expected “bad” rate (charge-off frequency) at each score. The predicted charge-off probabilities are partitioned into k intervals; for each interval, an assumption is made about the bad rate, and applications in that interval are randomly assigned a repayment status according to it. Bad rates are assumed to be higher among rejects than among accepted loans. This method treats the missingness as not at random (MNAR), which requires the modeler to supply additional information about the distribution of charge-offs among rejects.
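
As a minimal sketch of the hard cut-off method referenced above (assuming a logistic-regression scorecard, an illustrative `charged_off` target column, and a 0.5 cut-off, none of which are prescribed):

```python
# Minimal sketch of the hard cut-off method (illustrative column names/cut-off).
import pandas as pd
from sklearn.linear_model import LogisticRegression

def hard_cutoff_reject_inference(accepted: pd.DataFrame,
                                 rejected: pd.DataFrame,
                                 features: list,
                                 target: str = "charged_off",
                                 cutoff: float = 0.5) -> LogisticRegression:
    # Step 1: train a preliminary scorecard on accepted applications only.
    prelim = LogisticRegression(max_iter=1000)
    prelim.fit(accepted[features], accepted[target])

    # Step 2: score the rejects and assign hard labels at the chosen cut-off.
    p_bad = prelim.predict_proba(rejected[features])[:, 1]
    rejected = rejected.copy()
    rejected[target] = (p_bad > cutoff).astype(int)

    # Step 3: re-train the same specification on the augmented dataset.
    augmented = pd.concat([accepted, rejected], ignore_index=True)
    final = LogisticRegression(max_iter=1000)
    final.fit(augmented[features], augmented[target])
    return final
```

Fuzzy augmentation and parceling differ mainly in how the labels (and observation weights) for the rejects are generated in Step 2.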

Proportion of Accepted Applications to Rejects

An institution with a relatively high percentage of rejected applications will necessarily end up with an augmented training dataset whose quality depends heavily on the quality of the selected reject inference method and its implementation. One might therefore argue that it is best to cap the ratio of rejected to accepted applications in the training data. The level at which such a cap is set should reflect the “confidence” in the method used. Estimating such a confidence level, however, is a highly subjective endeavor.

The Proportion of Bad Rates for Accepts and Rejects

It is reasonable to assume that the “bad rate,” i.e., the proportion of loans that are charged off rather than repaid, will be higher among rejected applications. Some modelers set a threshold based on their a priori belief that the bad rate among rejects is at least p times the bad rate among acceptances. If the selected reject inference method produces a dataset with a bad rate that appears artificially low, actions are taken to raise the bad rate above the threshold. Where to set this threshold, however, is notoriously difficult to justify.
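
As a purely illustrative example of such a check (the multiplier p and the 4%/10% figures are hypothetical): if the bad rate among accepted loans is 4% and p = 2, the inferred labels on the rejects should imply a bad rate of at least 8%.

```python
# Illustrative check that the inferred reject labels imply a bad rate at least
# p times the accepted bad rate (p is a subjective, institution-specific choice).
import pandas as pd

def meets_bad_rate_floor(accepted_labels: pd.Series,
                         inferred_reject_labels: pd.Series,
                         p: float = 2.0) -> bool:
    """Labels are 1 for charged off, 0 for repaid."""
    return inferred_reject_labels.mean() >= p * accepted_labels.mean()

# Example: accepted bad rate 4%, inferred reject bad rate 10%, floor = 2 x 4% = 8%.
accepted = pd.Series([1] * 4 + [0] * 96)   # 4% bad
rejects = pd.Series([1] * 10 + [0] * 90)   # 10% bad
assert meets_bad_rate_floor(accepted, rejects, p=2.0)
```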

Variable Selection

As outlined above, most approaches begin by estimating a preliminary model based on accepted applications only. This model is then used to infer how rejected loans would have performed. The preliminary model is then retrained on a dataset consisting both of actual data from accepted applications and of the inferred data from rejects. This means that the underlying variables themselves are selected based only on the actual loan performance data from accepted applications. The statistical significance of the selected variables might change, however, when moving to the complete dataset. Variable selection is sometimes redone using the complete data. This, however, can lead to overfitting.

Measuring Model Performance

From a model validator’s perspective, an ideal solution would involve creating a control group in which applications would not be scored and filtered and every application would be accepted. Then the discriminating power of a credit model could be assessed by comparing the charge-off rate of the control group with the charge-off rate of the loans accepted by the model. This approach of extending credit indiscriminately is impractical, however, as it would require the lender to engage in some degree of irresponsible lending.

Another approach is to create a test set. The dilemma here is whether to include only accepted applications. A test set that includes only accepted applications will not necessarily reflect the population for which the model will be used. Including rejected applications, however, obviously necessitates the use of reject inference. For all the reasons laid out above, this approach risks overstating the model’s performance because a similar model (trained only on the accepted cases) was used for reject inference.

A third approach that avoids both of these problems involves using information criteria such as AIC and BIC. This, however, is useful only when comparing different models (for model or variable selection). The values of information criteria cannot be interpreted as an absolute measure of performance.
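
A minimal sketch of such a comparison, assuming a logistic scorecard fit with statsmodels and placeholder column names:

```python
# Illustrative AIC/BIC comparison of two candidate specifications fit to the
# same data; variable names are placeholders, not a prescribed feature set.
import statsmodels.api as sm

def information_criteria(y, X):
    """Fit a logistic scorecard and return its AIC and BIC."""
    result = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
    return result.aic, result.bic

# Usage (hypothetical DataFrame `df`): compare a short and a long feature list.
# aic_a, bic_a = information_criteria(df["charged_off"], df[["fico", "dti"]])
# aic_b, bic_b = information_criteria(df["charged_off"], df[["fico", "dti", "ltv"]])
```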

A final option is to run several models in production (the main model and one or more challenger models). Under this scenario, each application is evaluated by a model selected at random. The models can then be compared retroactively by calculating their bad rates on accepted applications after the financed loans mature. Provided that the accept rates are similar, the model with the lowest bad rate is the best.
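
A minimal sketch of this champion/challenger routing, with stand-in scorers and an assumed approval cut-off (both purely illustrative):

```python
# Minimal champion/challenger sketch: each application is scored by a randomly
# chosen model; bad rates are compared on accepted loans once they mature.
import random
import pandas as pd

def route_application(features: dict, models: dict, cutoff: float = 0.10):
    """Pick a model at random and accept if its predicted charge-off
    probability is below the (illustrative) cut-off."""
    name = random.choice(list(models))
    return name, models[name](features) < cutoff

def compare_bad_rates(matured: pd.DataFrame) -> pd.DataFrame:
    """`matured`: one row per accepted, matured loan with columns
    ['model', 'charged_off'] (1 = charged off)."""
    return matured.groupby("model")["charged_off"].agg(bad_rate="mean", loans="size")

if __name__ == "__main__":
    # Stand-in scorers; replace with the production and challenger models.
    models = {"champion": lambda f: 0.05, "challenger": lambda f: 0.04}
    print(route_application({"fico": 720, "dti": 0.35}, models))
```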

Conclusion

Reject inference remains an evolving field in credit modeling. Its ability to improve model performance is still the subject of intense debate. Current results suggest that while reject inference can improve model performance, its application can also lead to overfitting, thereby worsening the model’s ability to generalize. The lack of a strong theoretical basis for reject inference methods means that applications of reject inference must rely on empirical results. Thus, if reject inference is used, key model stakeholders need to possess a deep understanding of the modeled population, have strong domain knowledge, emphasize conducting experiments to justify the applied modeling techniques, and, above all, adopt and follow a solid ongoing monitoring plan.

Doing this will result in a modeling methodology that is most likely to produce reliable outputs for the institution while also satisfying MRM and validator requirements.

[1] https://www.sciencedirect.com/science/article/abs/pii/S0378426603002036

[2] https://economix.fr/pdf/dt/2016/WP_EcoX_2016-10.pdf

