Rethink Analytics Computational Processing – Solving Yesterday’s Problems with Today’s Technology and Access
We sat down with RiskSpan’s co-founder and chief technology officer, Suhrud Dagli, to learn more about how one mortgage investor successfully overhauled its analytics computational processing. The investor migrated from a daily pricing and risk process that relied on tens of thousands of rep lines to one capable of evaluating each of the portfolio’s more than three-and-a-half million loans individually, and it actually saved money in the process.
Here is what we learned.
Could you start by talking a little about this portfolio — what asset class and what kind of analytics the investor was running?
SD: Our client was managing a large investment portfolio of mortgage servicing rights (MSR) assets, residential loans and securities.
The investor runs a battery of sophisticated risk management analytics that rely on stochastic modeling. Option-adjusted spread, duration, convexity, and key rate durations are calculated based on more than 200 interest rate simulations.
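For readers less familiar with these measures, the sketch below shows one common way duration and convexity are estimated: reprice the asset after parallel rate shocks down and up, then take finite differences. This is an illustration under stated assumptions, not the investor's or RiskSpan's implementation. The function names and the sample prices are hypothetical, and in a stochastic setting each of the three prices would itself be an average over the simulated rate paths.

```python
def effective_duration(p_down, p_up, p_base, shift):
    """Central-difference estimate of price sensitivity to a parallel rate shift.

    p_down / p_up are prices after shifting rates down / up by `shift`
    (a decimal, e.g. 0.005 for 50 basis points). All inputs are hypothetical.
    """
    return (p_down - p_up) / (2.0 * p_base * shift)


def effective_convexity(p_down, p_up, p_base, shift):
    """Second-difference estimate of how duration changes as rates move."""
    return (p_down + p_up - 2.0 * p_base) / (p_base * shift ** 2)


# Illustrative prices only (not from the interview): base, rates -50bp, rates +50bp.
base, down, up = 100.0, 101.5, 98.0
duration = effective_duration(down, up, base, 0.005)
convexity = effective_convexity(down, up, base, 0.005)
```

Key rate durations follow the same pattern, except that the shock is applied to one point on the curve at a time rather than to the whole curve.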
Why was the investor running their analytics computational processing using a rep line approach?
SD: They used rep lines for one main reason: They needed a way to manage computational loads on the server and improve calculation speeds. Secondarily, organizing the loans in this way simplified their reporting and accounting requirements to a degree (loans financed by the same facility were grouped into the same rep line).
This approach had some downsides. Pooling loans by finance facility sometimes caused loans with different balances, LTVs, credit scores, etc., to be grouped into the same rep line. As a result, every loan in a rep line received the same prepayment and default assumptions, and these differed from the assumptions that likely would have been applied had the loans been evaluated individually.
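A toy example can make the rep line problem concrete. The sketch below assumes a made-up, convex relationship between LTV and default rate (`toy_default_rate` is hypothetical and not calibrated to anything) and compares two calculations: applying the curve once to the rep line's balance-weighted average LTV, versus applying it to each loan and then averaging.

```python
import math


def toy_default_rate(ltv):
    """Hypothetical convex curve: default rate rises sharply with LTV.

    Purely illustrative; not a calibrated credit model.
    """
    return 0.002 * math.exp(4.0 * (ltv - 0.6))


# Two made-up loans financed by the same facility, so they share a rep line.
loans = [
    {"balance": 200_000, "ltv": 0.60},
    {"balance": 200_000, "ltv": 0.95},
]

total_balance = sum(loan["balance"] for loan in loans)

# Rep line approach: collapse to one average LTV, then apply the model once.
avg_ltv = sum(loan["balance"] * loan["ltv"] for loan in loans) / total_balance
rep_line_rate = toy_default_rate(avg_ltv)

# Loan-level approach: apply the model per loan, then balance-weight the results.
loan_level_rate = (
    sum(loan["balance"] * toy_default_rate(loan["ltv"]) for loan in loans)
    / total_balance
)
```

Under this assumed curve the loan-level figure comes out higher than the rep line figure, because averaging the inputs first hides the high-LTV loan. The direction and size of the gap in practice depend entirely on the real models and the mix of loans in each rep line.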
The most obvious solution to this would seem to be one that disassembles the finance facility groups into their individual loans, runs all those analytics at the loan level, and then re-aggregates the results into the original rep lines. Is this sort of analytics computational processing possible without taking all day and blowing up the server?
SD: That is effectively what we are doing. The process is not as speedy as we’d like it to be (and we are working on that). But we have worked out a solution that does not overly tax computational resources.
The analytics computational processing we are implementing ignores the rep line concept entirely and just runs the loans. The scalability of our cloud-native infrastructure enables us to take the three-and-a-half million loans and bucket them equally for computation purposes. We run a hundred loans on each processor and get back loan-level cash flows and then generate the output separately, which brings the processing time down considerably.
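The batching idea described above can be sketched in a few lines. This is a minimal illustration, not RiskSpan's implementation: a local thread pool stands in for cloud workers, `project_cash_flows` is a hypothetical placeholder that only does level-pay amortization (no prepayment or default modeling), and the loan fields (`loan_id`, `balance`, `rate`, `term`) are assumed names.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

BATCH_SIZE = 100  # "a hundred loans on each processor", per the interview


def batched(loans, size=BATCH_SIZE):
    """Yield successive lists of at most `size` loans."""
    it = iter(loans)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def project_cash_flows(loan):
    """Hypothetical stand-in for a loan-level model (assumes rate > 0)."""
    balance, rate, term = loan["balance"], loan["rate"] / 12, loan["term"]
    payment = balance * rate / (1 - (1 + rate) ** -term)
    flows = []
    for _ in range(term):
        interest = balance * rate
        principal = payment - interest
        balance -= principal
        flows.append((interest, principal))
    return loan["loan_id"], flows


def run_batch(batch):
    """Work done by one worker: project every loan in its batch."""
    return [project_cash_flows(loan) for loan in batch]


def run_portfolio(loans, max_workers=4):
    """Fan batches out to workers and collect loan-level cash flows by loan id."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for batch_result in pool.map(run_batch, batched(loans)):
            results.update(batch_result)
    return results
```

Because each batch is independent, the same structure scales by adding workers rather than by making any single machine faster, which is the property the cloud setup relies on. Output generation (reports, aggregates) then runs as a separate step over the collected results.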
So we have a proof of concept that this approach to analytics computational processing works in practice for running pricing and risk on MSR portfolios. Is it applicable to any other asset classes?
SD: The underlying principles that make analytics computational processing possible at the loan level for MSR portfolios apply equally well to whole loan investors and MBS investors. In fact, the investor in this example has a large whole-loan portfolio alongside its MSR portfolio. And it is successfully applying these same tactics on that portfolio.
An investor in any mortgage asset benefits from the ability to look at and evaluate loan characteristics individually. The results may need to be rolled up and grouped for reporting purposes. But being able to run the cash flows at the loan level ultimately makes the aggregated results vastly more meaningful and reliable.
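Rolling loan-level results back up for reporting is the simpler half of the problem. The sketch below groups hypothetical loan-level outputs by financing facility and reports total balance, total value, and a value-weighted duration. The field names (`facility`, `balance`, `value`, `duration`) and the sample figures are assumptions for illustration, and the weighting scheme is one common convention rather than the investor's stated method.

```python
from collections import defaultdict


def roll_up(loan_results):
    """Aggregate loan-level results into per-facility totals.

    Duration is weighted by value; assumes each group's total value is nonzero.
    """
    totals = defaultdict(lambda: {"balance": 0.0, "value": 0.0, "dur_x_value": 0.0})
    for r in loan_results:
        t = totals[r["facility"]]
        t["balance"] += r["balance"]
        t["value"] += r["value"]
        t["dur_x_value"] += r["duration"] * r["value"]
    return {
        facility: {
            "balance": t["balance"],
            "value": t["value"],
            "duration": t["dur_x_value"] / t["value"],
        }
        for facility, t in totals.items()
    }


# Made-up loan-level outputs for two facilities.
sample = [
    {"facility": "A", "balance": 150_000.0, "value": 100.0, "duration": 2.0},
    {"facility": "A", "balance": 350_000.0, "value": 300.0, "duration": 4.0},
    {"facility": "B", "balance": 200_000.0, "value": 250.0, "duration": 3.0},
]
report = roll_up(sample)
```

The grouping key is just a field on the loan, so the same results can be re-cut by geography, vintage, or any other attribute without rerunning the cash flows.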
A loan-level framework also affords whole-loan and securities investors the ability to be sure they are capturing the most important loan characteristics and are staying on top of how the composition of the portfolio evolves with each day’s payoffs.
ESG factors are an important consideration for a growing number of investors. Only a loan-level approach makes it possible for these investors to conduct the kind of property- and borrower-level analyses needed to know whether they are working toward meeting their ESG goals. It also makes it easier to spot areas of geographic concentration risk, which simplifies climate risk management to some degree.
Say I am a mortgage investor who is interested in moving to loan-level pricing and risk analytics. How do I begin?
SD: Three things:
- It begins with having the data. Most investors have access to loan-level data. But it’s not always clean. This is especially true of origination data. If you’re acquiring a pool – be it a seasoned pool or a pool right after origination – you don’t have the best origination data to drive your model. You also need a data store that can generate loan-level output to drive your analytics and models.
- The second factor is having models that work at the loan level – models that have been calibrated using loan-level performance and that are capable of generating loan-level output. One of the constraints of several existing modeling frameworks developed by vendors is they were created to run at a rep line level and don’t necessarily work very well for loan-level projections.
- The third thing you need is a compute farm. It is virtually impossible to run loan-level analytics if you’re not on the cloud because you need to distribute the computational load. And your computational distribution requirements will change from portfolio to portfolio based on the type of analytics that you are running, based on the types of scenarios that you are running, and based on the models you are using.
The cloud is needed not just for CPU power but also for storage. This is because once you go to the loan level, every loan’s data must be made available to every processor that’s performing the calculation. This is where having the kind of shared databases that are native to cloud infrastructure becomes vital. You simply can’t replicate it using an on-premises setup of computers in your office or in your own data center.
So, 1) get your data squared away, 2) make sure you’re using models that are optimized for loan-level, and 3) max out your analytics computational processing power by migrating to cloud-native infrastructure. Thank you, Suhrud, for taking the time to speak with us.