Articles Tagged with: Technology

GenAI Applications for Loans and Mapping Data

RiskSpan is actively developing several GenAI applications aimed at transforming how mortgage loan and private credit investors work and at maximizing their efficiency and performance. These include:

1. Tape-Cracking 3.0: Making RiskSpan’s Smart Mapper Even Smarter

RiskSpan’s Edge Platform currently uses machine learning techniques as part of its Smart Mapper ETL Tool. When a new portfolio is loaded in a new format, the fuzzy logic that powers the Platform’s recommended mappings gets continually refined based on user activity.

In the coming months, the Platform’s existing ML-driven ETL process will be further refined to leverage the latest GenAI technology.

GenAI lends additional context to the automated mapping process by incorporating an understanding not only of the data in an individual column, but also of surrounding data as well as learned characteristics of the asset class in question. The resulting evolution, from simply trying to ensure the headers match up to a more holistic understanding of what the data actually is and the meaning it seeks to convey, will be a game changer for downstream analysts seeking to make reliable data-driven investment decisions.
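To make the idea concrete, below is a minimal sketch of the kind of fuzzy header matching such an ETL step starts from; the canonical field names and similarity cutoff are illustrative assumptions, not RiskSpan’s implementation. The GenAI layer described above would add column-content and asset-class context on top of this kind of matching.

    # Minimal sketch of fuzzy header matching. Canonical fields and the
    # 0.6 cutoff are illustrative assumptions, not RiskSpan's actual logic.
    from difflib import SequenceMatcher

    CANONICAL_FIELDS = ["original_balance", "current_balance", "note_rate", "fico", "ltv"]

    def suggest_mapping(tape_headers, cutoff=0.6):
        """Suggest a canonical field for each incoming header by string similarity."""
        mapping = {}
        for header in tape_headers:
            key = header.strip().lower().replace(" ", "_")
            # Score the header against every canonical field and keep the best match
            scored = [(SequenceMatcher(None, key, f).ratio(), f) for f in CANONICAL_FIELDS]
            score, best = max(scored)
            mapping[header] = best if score >= cutoff else None  # None -> ask the user
        return mapping

    print(suggest_mapping(["Orig Bal", "Curr Balance", "Note Rate", "FICO Score"]))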

RiskSpan made several updates in 2023 to help users automate the end-to-end workflow for loan valuation and surveillance. AI-based data loading combined with the Platform’s loan risk assumptions and flexible data model will enable users to obtain valuation and risk metrics simply by dragging and dropping a loan file into the application.

2. Modeling Private Credit Transactions

Many financial institutions and legal advisors still spend an extraordinary amount of time reading and extracting relevant information from legal documents that accompany structured private credit transactions.

RiskSpan has partnered with clients to develop a solution that extracts key terms from private credit and funding transactions. Trained multimodal AI models are further extended to generate executable valuation code, which will be fully integrated into RiskSpan’s risk and pricing platform.

The application solves a heretofore intractable problem in which the information necessary to generate accurate cash flows for private credit transactions is spread across multiple documents (a frequent occurrence when terms for individual classes can only be obtained from deal amendments).

Execution code for cash flow generation and valuation uses RiskSpan’s validated analytics routines, including day count handling, payment calculations, and discounting.
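For flavor, the sketch below shows what routines of this kind typically look like; the 30/360 day count, level-payment, and discounting formulas are standard market conventions, but the function names and structure are hypothetical rather than RiskSpan’s actual library.

    # Illustrative sketch of standard cash flow routines. The formulas are
    # market conventions; the names and structure are hypothetical.
    from datetime import date

    def days_30_360(start: date, end: date) -> int:
        """Simplified 30/360 (bond basis) day count."""
        d1, d2 = min(start.day, 30), end.day
        if d1 == 30 and d2 == 31:
            d2 = 30
        return 360 * (end.year - start.year) + 30 * (end.month - start.month) + (d2 - d1)

    def level_payment(balance: float, annual_rate: float, months: int) -> float:
        """Standard level-payment (P&I) amortization formula."""
        r = annual_rate / 12.0
        return balance * r / (1.0 - (1.0 + r) ** -months)

    def present_value(cash_flows, annual_yield: float) -> float:
        """Discount monthly cash flows at a monthly-compounded yield."""
        y = annual_yield / 12.0
        return sum(cf / (1.0 + y) ** (t + 1) for t, cf in enumerate(cash_flows))

    pmt = level_payment(250_000, 0.065, 360)  # monthly P&I on a 30-year loan
    print(round(pmt, 2), round(present_value([pmt] * 360, 0.07), 2))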

3. “Insight Support”

Tech support is one of today’s most widely known (and widely experienced) GenAI use cases. Seemingly all-knowing chatbots immediately answer users’ questions, sparing them the inconvenience of having to wait for the next available human agent. Like every other company, RiskSpan is enhancing its traditional tech support processes with GenAI to answer questions faster and to embed user-facing AI help within the Platform itself. But RiskSpan is taking things a step further by also exploring how GenAI can upend and augment its clients’ workflows.

RiskSpan refers to this workflow augmentation as “Insight Support.”

With Insight Support, GenAI evaluates an individual user’s data, dynamically serves up key insights, and automatically completes routine analysis steps without prompting. The resulting application can understand an individual user’s data and recognize what is most important to identify and highlight as part of a loan data analysis workflow.

Insight Support, for example, can leverage insights obtained by the AI-driven “Smarter Mapping” process to identify what specific type of collateral reporting is necessary. It can produce reports that highlight outliers, recognize the typical analytical/valuation run settings a user would want to apply, and then execute the analytical run and summarize the results in management-ready reporting. All in the name of shortening the analysis time needed to evaluate new investment opportunities.
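A toy version of one such step, flagging loans that sit far from the rest of an incoming tape, might look like the sketch below; the fields, data, and threshold are illustrative assumptions, not the product’s actual logic.

    # Toy sketch of one "Insight Support" step: flag loans whose numeric
    # fields sit far from the rest of the tape. Fields, data, and the loose
    # threshold (chosen because the toy tape is tiny) are all illustrative.
    import pandas as pd

    def flag_outliers(tape, fields=("note_rate", "fico", "ltv"), z=2.0):
        """Return rows where any field is more than z standard deviations from its mean."""
        mask = pd.Series(False, index=tape.index)
        for col in fields:
            scores = (tape[col] - tape[col].mean()) / tape[col].std()
            mask |= scores.abs() > z
        return tape[mask]

    tape = pd.DataFrame({
        "note_rate": [6.5, 6.6, 6.4, 6.5, 6.7, 6.6, 6.5, 12.9],  # one outlier
        "fico": [740, 755, 760, 748, 752, 745, 758, 750],
        "ltv": [80, 75, 78, 79, 77, 80, 76, 78],
    })
    print(flag_outliers(tape))  # surfaces the 12.9% note rate for review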

Conclusion

Considered collectively, these three applications are building toward having RiskSpan’s SaaS platform function as a “virtual junior analyst” capable of handling much of the tedious work involved in analyzing loan and structured product investments and freeing up human analysts for higher-order tasks and decision making.

GenAI is the future of data and analytics and is therefore the future of RiskSpan’s Edge Platform. By revolutionizing the way data is analyzed, AI-created and -validated models, dashboards, and sorted data are already allowing experts to redirect their attention away from time-consuming data wrangling tasks and toward more strategic critical thinking. The more complete adoption of fully optimized AI solutions throughout the industry, made possible by a rising generation of “AI-native” data scientists, will only accelerate this phenomenon.

RiskSpan’s commitment to pushing the boundaries of innovation in the Loan and Structured Product Space is underscored by its strategic approach to GenAI. While acknowledging the challenges posed by GenAI, RiskSpan remains poised for the future, leveraging its expertise to navigate the evolving landscape. As the industry anticipates the promised benefits of GenAI, RiskSpan’s vision and applications stand as a testament to its role as a thought leader in shaping the future of data analytics.

Stay tuned for more updates on RiskSpan’s innovative solutions, as we continue to lead the way in harnessing the power of GenAI for the benefit of our clients and the industry at large.


RiskSpan to Launch Usage-based Pricing for its Edge Platform at SFVegas 2024 

New innovative pricing model offers lower costs, transparency, and flexibility for analytics users 

RiskSpan, a top provider of cloud-based analytics solutions for loans, MSRs, structured products and private credit, announced today the launch of a usage-based pricing model for its Edge Platform. The new pricing model gives clients the flexibility to pay only for the compute they use. It also gives clients access to the full platform, including data, models, and analytics, without having to license individual product modules.

Usage-based pricing is a trend that reflects the evolving nature of analytics and the increasing demand for more flexible, transparent, and value-driven pricing models. It is especially suited for the dynamic and diverse needs of analytics users, whose data volumes, usage patterns, and analytical complexity requirements often fluctuate with the markets.

RiskSpan was an early adopter of the Amazon Web Services (AWS) cloud in 2010. Its new usage-based pricing, powered by the AWS cloud, enables RiskSpan to invoice its clients based on user-configured workloads, which can scale up or down as needed. 
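Conceptually, this kind of invoicing reduces to metering compute consumption per workload and applying a rate. The sketch below illustrates the arithmetic with made-up workload names and rates; it is not RiskSpan’s actual billing logic or pricing.

    # Conceptual sketch of usage-based invoicing. All names and rates are
    # made up for illustration; this is not RiskSpan's billing logic.
    RATE_PER_MACHINE_HOUR = 1.50  # illustrative rate

    usage = [  # (workload, machines, hours) metered over the billing period
        ("daily_risk_run", 50, 2.0),
        ("month_end_valuation", 80, 6.5),
    ]

    invoice = {w: m * h * RATE_PER_MACHINE_HOUR for w, m, h in usage}
    print(invoice, "total:", round(sum(invoice.values()), 2))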

“Usage-based pricing is a game-changer for our clients and the industry,” said Bernadette Kogler, CEO of RiskSpan. “It aligns our pricing with the value we deliver and the outcomes we enable for our clients. It also eliminates the waste and inefficiency of paying for unused, fixed-fee compute capacity, year after year in long-term, set price contracts. Now our clients can optimize their spending while experimenting with all the features our platform has to offer.”

“We are excited RiskSpan chose AWS to launch its new pricing model. Our values are aligned in earning trust through transparent variable pricing that allows our customers to innovate and remain agile,” said Ben Schreiner, Head of Business Innovation at Amazon Web Services. “By leveraging the latest in AWS technology, including our generative AI services, RiskSpan is accelerating the value they deliver to their customers, and ultimately, the entire financial services industry.”

Usage-based pricing offers several benefits for RiskSpan clients, including: 

  • Lower Costs: Clients pay only for what they need, rather than being locked into an expensive contract that may not suit their current or future situation. 
  • Cost Sharing: Clients can share costs across the enterprise and better manage expenses based on usage by individual functions and business units. 
  • Transparency: Clients can monitor their usage, directly link their analytics configuration and consumption to their results and goals, and better control spending by seeing how usage affects their bill. 
  • Flexibility: Clients can experiment with different features and options of RiskSpan’s Edge Platform, as they are not restricted by a predefined package or plan. 

For a free demo, visit https://riskspan.com/ubp/.

### 

About RiskSpan, Inc. 

RiskSpan offers cloud-native SaaS analytics for on-demand market risk, credit risk, pricing and trading. With an unparalleled team of data science experts and technologists, RiskSpan is the leader in data as a service and end-to-end solutions for loan-level data management and analytics.

Its mission is to be the most trusted and comprehensive source of data and analytics for loans and structured finance investments. Learn more at www.riskspan.com.


RiskSpan, Dominium Advisors Announce Market Color Dashboard for Mortgage Loan Investors


ARLINGTON, Va., January 24, 2024 – RiskSpan, the leading tech provider of data management and analytics services for loans and structured products, has partnered with tech-enabled asset manager Dominium Advisors to introduce a new whole loan market color dashboard to RiskSpan’s Edge Platform.

This new dashboard combines loan-level market pricing and trading data with risk analytics for GSE-eligible and non-QM loans. It gives loan investors unprecedented visibility into where loans are currently trading and insight into how they can achieve excess risk-adjusted yields.


The dashboard highlights Dominium’s proprietary loan investment and allocation approach, which allows investors to evaluate any set of residential loans available for bid. Leveraging RiskSpan’s collateral models and risk analytics, Dominium’s software helps investors maximize yield or spread subject to investment constraints, such as a risk budget, or management constraints, such as concentration limits.

“Our strategic partnership with RiskSpan is a key component of our residential loan asset management operating platform,” said Peter A. Simon, Founder and CEO of Dominium Advisors. “It has enabled us to provide clients with powerful risk analytics and data management capabilities in unprecedented ways.”

“The dashboard is a perfect complement to our suite of analytical tools,” noted Janet Jozwik, Senior Managing Director and Head of Product for RiskSpan’s Edge Platform. “We are excited to be a conduit for delivering this level of market color to our mortgage investor clients.”

The market color dashboard (and other RiskSpan reporting) can be accessed by registering for a free Edge Platform login at https://riskspan.com/request-access/.

### 

About RiskSpan, Inc. 

RiskSpan offers cloud-native SaaS analytics for on-demand market risk, credit risk, pricing and trading. With an unparalleled team of data science experts and technologists, RiskSpan is the leader in data as a service and end-to-end solutions for loan-level data management and analytics.

Its mission is to be the most trusted and comprehensive source of data and analytics for loans and structured finance investments. Learn more at www.riskspan.com.

About Dominium Advisors

Dominium Advisors is a tech-enabled asset manager specializing in the acquisition and management of residential mortgage loans for insurance companies and other institutional investors. The firm focuses on newly originated residential mortgage loans made to high quality borrowers – GSE eligible, jumbo and non-QM. Its proprietary loan-level software makes possible the construction of loan portfolios that achieve investor-defined objectives such as higher risk-adjusted yields and spreads or limited exposure to tail risk events. Learn more at dominiumadvisors.com.


Snowflake Tutorial Series: Episode 3

Using External Tables Inside Snowflake to work with Freddie Mac public data (13 million loans across 116 fields)

Using Freddie Mac public loan data as an example, this five-minute tutorial succinctly demonstrates how to:

  1. Create a storage integration
  2. Create an external stage
  3. Grant access to the stage to other roles in Snowflake
  4. List objects in a stage
  5. Create a format file
  6. Read/Query data from external stage without having to create a table
  7. Create and use an external table in Snowflake
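For readers who want a feel for the statements involved, the sketch below condenses those seven steps into SQL issued through the snowflake-connector-python package; every object name, the S3 location, and the IAM role ARN are placeholders, so treat this as a template rather than the tutorial’s exact code.

    # Condensed sketch of the seven steps via snowflake-connector-python.
    # All identifiers (integration, stage, roles, bucket, ARN) are placeholders.
    import snowflake.connector

    conn = snowflake.connector.connect(account="<account>", user="<user>", password="<pwd>")
    cur = conn.cursor()

    # 1. Storage integration granting Snowflake access to the S3 bucket
    cur.execute("""CREATE STORAGE INTEGRATION freddie_s3_int
      TYPE = EXTERNAL_STAGE STORAGE_PROVIDER = 'S3' ENABLED = TRUE
      STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::111122223333:role/snowflake-reader'
      STORAGE_ALLOWED_LOCATIONS = ('s3://my-bucket/freddie/')""")

    # 2. External stage on top of the integration
    cur.execute("""CREATE STAGE freddie_stage
      STORAGE_INTEGRATION = freddie_s3_int URL = 's3://my-bucket/freddie/'""")

    # 3. Grant stage access to another role
    cur.execute("GRANT USAGE ON STAGE freddie_stage TO ROLE analyst_role")

    # 4. List the files sitting in the stage
    cur.execute("LIST @freddie_stage")

    # 5. File format describing the pipe-delimited loan files
    cur.execute("CREATE FILE FORMAT freddie_fmt TYPE = 'CSV' FIELD_DELIMITER = '|'")

    # 6. Query staged files directly, no table required
    cur.execute("""SELECT $1, $2, $3
      FROM @freddie_stage (FILE_FORMAT => 'freddie_fmt') LIMIT 10""")

    # 7. External table so staged data can be queried like a regular table
    cur.execute("""CREATE EXTERNAL TABLE freddie_ext
      (loan_id VARCHAR AS (VALUE:c1::VARCHAR))
      LOCATION = @freddie_stage FILE_FORMAT = (FORMAT_NAME = 'freddie_fmt')""")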

This is the third in a 10-part tutorial series demonstrating how RiskSpan’s Snowflake integration makes mortgage and structured finance analytics easier than ever before.

Episode 1, Setting Up a Database and Uploading 28 Million Mortgage Loans, is available here.

Episode 2, Using Python User-Defined Functions in Snowflake SQL, is available here.

Future topics will include:

  • OLAP vs OLTP and hybrid tables in Snowflake
  • Time Travel functionality, clone and data replication
  • Normalizing data and creating a single materialized view
  • Dynamic tables data concepts in Snowflake
  • Data share
  • Data masking
  • Snowpark: Data analysis (pandas) functionality in Snowflake

RiskSpan’s Snowflake Tutorial Series: Ep. 2

Learn how to use Python User-Defined Functions in Snowflake SQL

Using CPR computation for a pool of mortgage loans as an example, this six-minute tutorial succinctly demonstrates how to:

  1. Query Snowflake data using SQL
  2. Write and execute Python user-defined functions inside Snowflake
  3. Compute CDR using Python UDF inside Snowflake SQL
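As a taste of the pattern, the sketch below registers a Python UDF through snowflake-connector-python and calls it from SQL; the table and column names are hypothetical, and the annualization formula CPR = 1 − (1 − SMM)^12 is the standard one.

    # Sketch: define a Python UDF in Snowflake and call it from SQL.
    # Table/column names are hypothetical; the CPR formula is standard.
    import snowflake.connector

    conn = snowflake.connector.connect(account="<account>", user="<user>", password="<pwd>")
    cur = conn.cursor()

    cur.execute("""
    CREATE OR REPLACE FUNCTION smm_to_cpr(smm FLOAT)
    RETURNS FLOAT
    LANGUAGE PYTHON
    RUNTIME_VERSION = '3.8'
    HANDLER = 'smm_to_cpr'
    AS $$
    def smm_to_cpr(smm):
        # Annualize a single-month mortality (SMM) rate into a CPR
        return 1.0 - (1.0 - smm) ** 12
    $$""")

    # Call the UDF from ordinary Snowflake SQL over a hypothetical pool table
    cur.execute("""
    SELECT pool_id,
           smm_to_cpr(SUM(prepaid_prin) / SUM(sched_balance)) AS cpr
    FROM loan_month
    GROUP BY pool_id""")
    print(cur.fetchall())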

This is the second in a 10-part tutorial series demonstrating how RiskSpan’s Snowflake integration makes mortgage and structured finance analytics easier than ever before.

Episode 1, Setting Up a Database and Uploading 28 Million Mortgage Loans, is available here.

Future topics will include:

  • External Tables (accessing data without a database)
  • OLAP vs OLTP and hybrid tables in Snowflake
  • Time Travel functionality, clone and data replication
  • Normalizing data and creating a single materialized view
  • Dynamic tables data concepts in Snowflake
  • Data share
  • Data masking
  • Snowpark: Data analysis (pandas) functionality in Snowflake

RiskSpan Incorporates Flexible Loan Segmentation into Edge Platform

ARLINGTON, Va., March 3, 2023 — RiskSpan, a leading technology company and the most comprehensive source for data management and analytics for residential mortgage and structured products, has announced the incorporation of Flexible Loan Segmentation functionality into its award-winning Edge Platform.

The new functionality makes Edge the only analytical platform offering users the option of alternating between the speed and convenience of rep-line-level analysis and the unmatched precision of loan-level analytics, depending on the purpose of their analysis.

For years, the cloud-native Edge Platform has stood alone in its ability to offer the computational scale necessary to perform loan-level analyses and fully consider each loan’s individual contribution to a mortgage or MSR portfolio’s cash flows. This level of granularity is of paramount importance when pricing new portfolios, taking property-level considerations into account, and managing tail risks from a credit/servicing cost perspective.

Not every analytical use case justifies the computational cost of a full loan-level analysis, however. For situations where speed requirements dictate the use of rep lines (such as for daily or intra-day hedging needs), the Edge Platform’s new Flexible Loan Segmentation affords users the option to perform valuation and risk analysis at the rep line level.

Analysts, traders and investors take advantage of Edge’s flexible calculation specification to run various rate and HPI scenarios, key rate durations, and other calculation-intensive metrics in an efficient and timely manner. Segment-level results run at both loan and rep line level can be easily compared to assess the impacts of each approach. Individual rep lines are easily rolled up to quickly view results on portfolio subcomponents and on the portfolio as a whole.
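In the abstract, the comparison workflow looks like the pandas sketch below; the column names and numbers are invented for illustration, and this is not the Edge Platform’s API.

    # Abstract sketch: roll loan-level results up to rep lines and compare
    # them with results computed directly at the rep-line level.
    # Column names and values are invented for illustration.
    import pandas as pd

    loan_results = pd.DataFrame({
        "rep_line": ["A", "A", "B", "B"],
        "balance":  [200_000, 150_000, 300_000, 250_000],
        "price":    [101.2, 100.8, 99.5, 99.9],  # loan-level model output
    })

    # Balance-weighted roll-up of loan-level prices to the rep-line level
    rollup = loan_results.groupby("rep_line").apply(
        lambda g: (g["price"] * g["balance"]).sum() / g["balance"].sum()
    ).rename("loanlevel_price")

    repline_results = pd.Series({"A": 101.0, "B": 99.7}, name="repline_price")
    print(pd.concat([rollup, repline_results], axis=1))  # side-by-side comparison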

Comprehensive details of this and other new capabilities are available by requesting a no-obligation demo at riskspan.com.

This new functionality is the latest in a series of enhancements that further the Edge Platform’s objective of providing frictionless insight to Agency MBS traders and investors, knocking down barriers to efficient, clear and data-driven valuation and risk assessment.

###

About RiskSpan, Inc. 

RiskSpan offers cloud-native SaaS analytics for on-demand market risk, credit risk, pricing and trading. With our data science experts and technologists, we are the leader in data as a service and end-to-end solutions for loan-level data management and analytics. Our mission is to be the most trusted and comprehensive source of data and analytics for loans and structured finance investments. Learn more at www.riskspan.com.


RiskSpan’s Snowflake Tutorial Series: Ep. 1

Learn how to create a new Snowflake database and upload large loan-level datasets

The first episode of RiskSpan’s Snowflake Tutorial Series has dropped!

This six-minute tutorial succinctly demonstrates how to:

  1. Set up a new Snowflake database
  2. Use SnowSQL to load large datasets (28 million mortgage loans in this example)
  3. Use internal staging (without a cloud provider)
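The load pattern itself fits in a few statements. The sketch below issues them through snowflake-connector-python rather than the SnowSQL CLI the video uses (the SQL is the same either way); the database, table, and file path are placeholders.

    # Sketch of the internal-staging load pattern. Database, table, and file
    # path are placeholders; the video runs the same SQL via SnowSQL.
    import snowflake.connector

    conn = snowflake.connector.connect(account="<account>", user="<user>", password="<pwd>")
    cur = conn.cursor()

    cur.execute("CREATE DATABASE IF NOT EXISTS mortgage_db")
    cur.execute("USE DATABASE mortgage_db")
    cur.execute("CREATE TABLE IF NOT EXISTS loans (loan_id VARCHAR, orig_balance NUMBER)")

    # Upload the local file to the table's internal stage (no cloud bucket needed)
    cur.execute("PUT file:///data/loans.csv @%loans AUTO_COMPRESS = TRUE")

    # Bulk-load the staged file into the table
    cur.execute("""COPY INTO loans FROM @%loans
      FILE_FORMAT = (TYPE = 'CSV' FIELD_DELIMITER = ',' SKIP_HEADER = 1)""")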

This is the first in what is expected to be a 10-part tutorial series demonstrating how RiskSpan’s Snowflake integration makes mortgage and structured finance analytics easier than ever before.

Future topics will include:

  • Executing complex queries using python functions in Snowflake’s SQL
  • External Tables (accessing data without a database)
  • OLAP vs OLTP and hybrid tables in Snowflake
  • Time Travel functionality, clone and data replication
  • Normalizing data and creating a single materialized view
  • Dynamic tables data concepts in Snowflake
  • Data share
  • Data masking
  • Snowpark: Data analysis (pandas) functionality in Snowflake

RiskSpan Unveils New “Reverse ETL” Mortgage Data Mapping and Extract Functionality

ARLINGTON, Va., October 19, 2022 – Subscribers to RiskSpan’s Mortgage Data Management product can now not only leverage machine learning to streamline the intake of loan data from any format, but also define any target format for data extraction and sharing.

A recent enhancement to RiskSpan’s award-winning Edge Platform enables users to take in unformatted datasets from mortgage servicers, sellers and other counterparties and convert them into their preferred data format on the fly for sharing with accounting, client, and other downstream systems.

Analysts, traders, and portfolio managers have long used Edge to take in and store datasets, enabling them to analyze historical performance of custom cohorts using limitless combinations of mortgage loan characteristics and run predictive analytics on segments defined on the fly. With Edge’s novel “Reverse ETL” data extract functionality, these Platform users can now also easily design a format for exporting their data, creating the functional equivalent of a full integration node for sharing data with literally any system on or off the Edge Platform.

Market participants tout the revolutionary technology as the end of having to share cumbersome and unformatted CSV files with counterparties. Now, the same smart mapping technology that for years has facilitated the ingestion of mortgage data onto the Edge Platform makes extracting and sharing mortgage data with downstream users just as easy.   
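Conceptually, the “Reverse ETL” step is a mapping from the Platform’s internal schema to a user-defined target layout. The pandas sketch below illustrates the idea with invented field names; it is not the Edge Platform’s actual mechanism.

    # Conceptual sketch of a user-defined export format: map internal fields
    # to a downstream system's layout and write the file. Names are invented.
    import pandas as pd

    EXPORT_SPEC = {  # internal field -> downstream system's column name
        "loan_id": "LoanNumber",
        "current_balance": "CurrentUPB",
        "note_rate": "InterestRate",
    }

    internal = pd.DataFrame({
        "loan_id": ["L1", "L2"],
        "current_balance": [180_000.0, 240_500.0],
        "note_rate": [6.25, 6.875],
        "internal_flag": [1, 0],  # not part of the export spec, so not exported
    })

    export = internal[list(EXPORT_SPEC)].rename(columns=EXPORT_SPEC)
    export.to_csv("accounting_feed.csv", index=False)  # target format on the fly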

Comprehensive details of this and other new capabilities using RiskSpan’s Edge Platform are available by requesting a no-obligation live demo at riskspan.com.


This new functionality is the latest in a series of enhancements that is making the Edge Platform’s Data as a Service increasingly indispensable for mortgage loan and MSR traders and investors.

### 

About RiskSpan, Inc. 

RiskSpan is a leading technology company and the most comprehensive source for data management and analytics for residential mortgage and structured products. The company offers cloud-native SaaS analytics for on-demand market risk, credit risk, pricing and trading. With our data science experts and technologists, we are the leader in data as a service and end-to-end solutions for loan-level data management and analytics.

Our mission is to be the most trusted and comprehensive source of data and analytics for loans and structured finance investments.

Rethink loan and structured finance data. Rethink your analytics. Learn more at www.riskspan.com.

Media contact: Timothy Willis



Optimizing Analytics Computational Processing 

We met with RiskSpan’s Head of Engineering and Development, Praveen Vairavan, to understand how his team set about optimizing analytics computational processing for a portfolio of 4 million mortgage loans using a cloud-based compute farm.

This interview dives deeper into a case study we discussed in a recent interview with RiskSpan’s co-founder, Suhrud Dagli.

Here is what we learned from Praveen. 



Could you begin by summarizing for us the technical challenge this optimization was seeking to overcome? 

PV: The main challenge related to an investor’s MSR portfolio, specifically the volume of loans we were trying to run. The client has close to 4 million loans spread across nine different servicers. This presented two related but separate sets of challenges. 

The first set of challenges stemmed from needing to consume data from different servicers whose file formats not only differed from one another but also often lacked internal consistency. By that, I mean even the file formats from a single given servicer tended to change from time to time. This required us to continuously update our data mapping and (because the servicer reporting data is not always clean) modify our QC rules to keep up with evolving file formats.  

The second challenge relates to the sheer volume of compute power necessary to run stochastic paths of Monte Carlo rate simulations on 4 million individual loans and then discount the resulting cash flows based on option adjusted yield across multiple scenarios. 

And so you have 4 million loans times multiple paths times one basic cash flow, one basic option-adjusted case, one up case, and one down case, and you can see how quickly the workload adds up. And all this needed to happen on a daily basis. 

To help minimize the computing workload, our client had been running all these daily analytics at a rep-line level—stratifying and condensing everything down to between 70,000 and 75,000 rep lines. This alleviated the computing burden but at the cost of decreased accuracy because they couldn’t look at the loans individually. 

What technology enabled you to optimize the computational process of running 50 paths and 4 scenarios for 4 million individual loans?

PV: With the cloud, you have the advantage of spawning a bunch of servers on the fly (just long enough to run all the necessary analytics) and then shutting them down once the analytics are done. 

This sounds simple enough. But to use that many compute servers, we needed to figure out how to distribute the 4 million loans across them so they could run in parallel (and then collect the results back for aggregation). We did this using what is known as a MapReduce approach. 

Say we want to run a particular cohort of this dataset with 50,000 loans in it. If we were using a single server, it would run them one after the other – generate all the cash flows for loan 1, then for loan 2, and so on. As you would expect, that is very time-consuming. So, we decided to break down the loans into smaller chunks. We experimented with various chunk sizes. We started with 1,000 – we ran 50 chunks of 1,000 loans each in parallel across the AWS cloud and then aggregated all those results.  

That was an improvement, but the 50 parallel jobs were still taking longer than we wanted. And so, we experimented further before ultimately determining that the “sweet spot” was something closer to 5,000 parallel jobs of 100 loans each. 

Only in the cloud is it practical to run 5,000 servers in parallel. But this of course raises the question: Why not just go all the way and run 50,000 parallel jobs of one loan each? Well, as it happens, running an excessively large number of jobs carries overhead burdens of its own. And we found that the extra time needed to manage that many jobs more than offset the compute time savings. And so, using a fair bit of trial and error, we determined that 100-loan jobs maximized the runtime savings without creating an overly burdensome number of jobs running in parallel.  
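A single-machine stand-in for the pattern Praveen describes looks like the sketch below, with a process pool in place of a fleet of cloud servers and a toy function in place of the real cash flow engine.

    # Single-machine stand-in for the MapReduce pattern described above:
    # split loans into 100-loan chunks, run chunks in parallel, aggregate.
    # The cash flow function is a toy placeholder.
    from concurrent.futures import ProcessPoolExecutor

    CHUNK_SIZE = 100  # the "sweet spot" from the interview

    def run_chunk(loans):
        """Toy stand-in for generating cash flows for one chunk of loans."""
        return [balance * 0.01 for balance in loans]

    def run_portfolio(loans):
        chunks = [loans[i:i + CHUNK_SIZE] for i in range(0, len(loans), CHUNK_SIZE)]
        with ProcessPoolExecutor() as pool:
            results = pool.map(run_chunk, chunks)          # "map" step
        return [cf for chunk in results for cf in chunk]   # aggregate ("reduce")

    if __name__ == "__main__":
        portfolio = [200_000.0] * 50_000  # the 50,000-loan cohort from the example
        print(len(run_portfolio(portfolio)))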


You mentioned the challenge of having to manage a large number of parallel processes. What tools do you employ to work around these and other bottlenecks? 

PV: The most significant bottleneck associated with this process is finding the “sweet spot” number of parallel processes I mentioned above. As I said, we could theoretically break it down into 4 million single-loan processes all running in parallel. But managing this amount of distributed computation, even in the cloud, invariably creates a degree of overhead which ultimately degrades performance. 

And so how do we find that sweet spot – how do we optimize the number of servers on the distributed computation engine? 

As I alluded to earlier, the process involved an element of trial and error. But we also developed some home-grown tools (and leveraged some tools available in AWS) to help us. These tools enable us to visualize computation server performance – how much of a load they can take, how much memory they use, etc. These helped eliminate some of the optimization guesswork.   

Is this optimization primarily hardware based?

PV: AWS provides essentially two “flavors” of machines. One “flavor” offers a large amount of memory, which enables you to keep a whole lot of loans in memory so runs are faster. The other flavor is more processor based (compute intensive). These machines provide a lot of CPU power so that you can run a lot of processes in parallel on a single machine and still get the required performance. 

We have done a lot of R&D on this hardware. We experimented with many different instance types to determine which works best for us and optimizes our output: lots of memory but smaller CPUs vs. CPU-intensive machines with less (but still a reasonable amount of) memory. 

We ultimately landed on a machine with 96 cores and about 240 GB of memory. This was the balance that enabled us to run portfolios at speeds consistent with our SLAs. For us, this translated to a server farm of 50 machines running 70 processes each, which works out to 3,500 workers helping us to process the entire 4-million-loan portfolio (across 50 Monte Carlo simulation paths and 4 different scenarios) within the established SLA.  

What software-based optimization made this possible? 

PV: Even optimized in the cloud, hardware can get pricey – on the order of $4.50 per hour in this example. And so, we supplemented our hardware optimization with some software-based optimization as well. 

We were able to optimize our software to a point where we could use a machine with just 30 cores (rather than 96) and 64 GB of RAM (rather than 240). Using 80 of these machines running 40 processes each gives us 2,400 workers (rather than 3,500). Software optimization enabled us to run the same number of loans in roughly the same amount of time (slightly faster, actually) but using fewer hardware resources. And our cost to use these machines was just one-third what we were paying for the more resource-intensive hardware. 

All this, and our compute time actually declined by 10 percent.  

The software optimization that made this possible has two parts: 

The first part (as we discussed earlier) is using the MapReduce methodology to break down jobs into optimally sized chunks. 

The second part involved optimizing how we read loan-level information into the analytical engine. Reading in loan-level data (especially for 4 million loans) is a huge bottleneck. We got around this by implementing a “pre-processing” procedure. For each individual servicer, we created a set of optimized loan files that can be read and rendered “analytics ready” very quickly. This enables the loan-level data to be quickly consumed and immediately used for analytics without having to read all the loan tapes and convert them into a format that the analytics engine can understand. Because we have “pre-processed” all this loan information, it is immediately available in a format that the engine can easily digest and run analytics on.  
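A generic version of that pre-processing idea appears below; the field names, dtypes, and choice of Parquet as the optimized format are illustrative assumptions, not RiskSpan’s actual file layout.

    # Generic sketch of the pre-processing idea: parse each servicer's raw
    # tape once, normalize the types, and store an "analytics ready" columnar
    # file. Field names, dtypes, and the Parquet choice are assumptions.
    import pandas as pd

    DTYPES = {"loan_id": "string", "balance": "float64", "note_rate": "float64"}

    def preprocess_tape(raw_csv_path: str, out_path: str) -> None:
        tape = pd.read_csv(raw_csv_path, dtype=DTYPES)  # slow parse happens once
        tape.to_parquet(out_path)                       # fast columnar reads later

    def load_for_analytics(path: str) -> pd.DataFrame:
        return pd.read_parquet(path)                    # no re-parsing per run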

This software-based optimization is what ultimately enabled us to optimize our hardware usage (and save time and cost in the process).  

Contact us to learn more about how we can help you optimize your mortgage analytics computational processing.


Rethink Analytics Computational Processing – Solving Yesterday’s Problems with Today’s Technology and Access 

We sat down with RiskSpan’s co-founder and chief technology officer, Suhrud Dagli, to learn more about how one mortgage investor successfully overhauled its analytics computational processing. The investor migrated from a daily pricing and risk process that relied on tens of thousands of rep lines to one capable of evaluating each of the portfolio’s more than three-and-a-half million loans individually, and actually saved money in the process.  

Here is what we learned. 


Could you start by talking a little about this portfolio — what asset class and what kind of analytics the investor was running? 

SD: Our client was managing a large investment portfolio of mortgage servicing rights (MSR) assets, residential loans and securities.  

The investor runs a battery of sophisticated risk management analytics that rely on stochastic modeling. Option-adjusted spread, duration, convexity, and key rate durations are calculated based on more than 200 interest rate simulations. 
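For reference, the standard way these bump-and-reprice metrics fall out of such simulations is sketched below; the formulas are textbook definitions, and the prices would be averages across the simulated paths rather than the single made-up values used here.

    # Textbook effective duration / convexity from bumped valuations.
    # p0, p_up, p_down would each be averages across the ~200 simulated paths;
    # the inputs below are made-up illustrative values.
    def effective_duration(p0, p_up, p_down, dy=0.0025):
        """Price sensitivity per unit yield change (25bp bump here)."""
        return (p_down - p_up) / (2.0 * p0 * dy)

    def effective_convexity(p0, p_up, p_down, dy=0.0025):
        return (p_down + p_up - 2.0 * p0) / (p0 * dy ** 2)

    print(effective_duration(100.0, 98.9, 101.2),
          effective_convexity(100.0, 98.9, 101.2))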


Why was the investor running their analytics computational processing using a rep line approach? 

SD: They used rep lines for one main reason: They needed a way to manage computational loads on the server and improve calculation speeds. Secondarily, organizing the loans in this way simplified their reporting and accounting requirements to a degree (loans financed by the same facility were grouped into the same rep line).  

This approach had some downsides. Pooling loans by finance facility was sometimes causing loans with different balances, LTVs, credit scores, etc., to get grouped into the same rep line. This resulted in prepayment and default assumptions getting applied to every loan in a rep line that differed from the assumptions that likely would have been applied if the loans were being evaluated individually.  

The most obvious solution to this would seem to be one that disassembles the finance facility groups into their individual loans, runs all those analytics at the loan level, and then re-aggregates the results into the original rep lines. Is this sort of analytics computational processing possible without taking all day and blowing up the server? 

SD: That is effectively what we are doing. The process is not as speedy as we’d like it to be (and we are working on that). But we have worked out a solution that does not overly tax computational resources.  

The analytics computational processing we are implementing ignores the rep line concept entirely and just runs the loans. The scalability of our cloud-native infrastructure enables us to take the three-and-a-half million loans and bucket them equally for computation purposes. We run a hundred loans on each processor and get back loan-level cash flows and then generate the output separately, which brings the processing time down considerably. 


So we have a proof of concept that this approach to analytics computational processing works in practice for running pricing and risk on MSR portfolios. Is it applicable to any other asset classes?

SD: The underlying principles that make analytics computational processing possible at the loan level for MSR portfolios apply equally well to whole loan investors and MBS investors. In fact, the investor in this example has a large whole-loan portfolio alongside its MSR portfolio. And it is successfully applying these same tactics on that portfolio.   

An investor in any mortgage asset benefits from the ability to look at and evaluate loan characteristics individually. The results may need to be rolled up and grouped for reporting purposes. But being able to run the cash flows at the loan level ultimately makes the aggregated results vastly more meaningful and reliable. 

A loan-level framework also affords whole-loan and securities investors the ability to be sure they are capturing the most important loan characteristics and are staying on top of how the composition of the portfolio evolves with each day’s payoffs. 

ESG factors are an important consideration for a growing number of investors. Only a loan-level approach makes it possible for these investors to conduct the kind of property- and borrower-level analyses to know whether they are working toward meeting their ESG goals. It also makes it easier to spot areas of geographic concentration risk, which simplifies climate risk management to some degree.  

Say I am a mortgage investor who is interested in moving to loan-level pricing and risk analytics. How do I begin? 

 SD: Three things: 

  1. It begins with having the data. Most investors have access to loan-level data. But it’s not always clean. This is especially true of origination data. If you’re acquiring a pool – be it a seasoned pool or a pool right after origination – you don’t have the best origination data to drive your model. You also need a data store that can generate loan-level output to drive your analytics and models.
  2. The second factor is having models that work at the loan level – models that have been calibrated using loan-level performance and that are capable of generating loan-level output. One of the constraints of several existing modeling frameworks developed by vendors is that they were created to run at a rep line level and don’t necessarily work very well for loan-level projections.  
  3. The third thing you need is a compute farm. It is virtually impossible to run loan-level analytics if you’re not on the cloud because you need to distribute the computational load. And your computational distribution requirements will change from portfolio to portfolio based on the type of analytics that you are running, based on the types of scenarios that you are running, and based on the models you are using. 

The cloud is needed not just for CPU power but also for storage. This is because once you go to the loan level, every loan’s data must be made available to every processor that’s performing the calculation. This is where having the kind of shared databases, which are native to a cloud infrastructure, becomes vital. You simply can’t replicate it using an on-premises setup of computers in your office or in your own data center. 

So, 1) get your data squared away, 2) make sure you’re using models that are optimized for loan-level analysis, and 3) max out your analytics computational processing power by migrating to cloud-native infrastructure.

Thank you, Suhrud, for taking the time to speak with us.

