Suhrud Dagli, Author at RiskSpan

Why AI Won’t Kill Asset-Backed Finance Software — and Why the Last Mile is the Moat

Every wave of financial technology innovation brings the same prediction: software will be commoditized. Today, that prediction is being applied to AI. If AI models can reason, summarize, and generate code, the thinking goes, B2B vertical SaaS becomes unnecessary.

That conclusion is inherently wrong. ABF platforms are not feature layers; they are governed systems.

The last mile of AI deployment isn’t friction—it’s the moat.

ABF Is Not a “Promptable” Problem

ABF platforms sit directly in the flow of capital allocation, risk management, and regulation. For asset managers deploying institutional capital, this creates a very high bar for reliable data, validated models and domain-specific workflow.

The real question isn’t whether a system can produce answers. It’s whether it can produce results that are:

Consistent over reporting periods and market cycles

Explainable under stress and investor scrutiny

Defensible and robust enough for LPs, investment committees, and regulators

That high bar changes everything. It explains why technology adoption in financial markets moves cautiously and why legacy systems persist. These systems embed decision rights, controls, and institutional logic that can’t simply be recreated with better prompts. Any platform that ignores this reality will struggle to scale beyond pilots.

Which leads to the obvious question: if AI is so powerful, where does it actually help?

AI Accelerates Workflow — Not Accountability

Applied correctly, AI can materially improve ABF workflows. It can ingest complex credit agreements faster, reconcile data across counterparties, flag covenant breaches, and reduce manual reporting work. In other words, AI increases operational leverage.

But AI does not remove the need for explicit deployment configuration and governance. Institutions still must define who owns key assumptions, which decisions can be automated, and where accountability sits when outcomes affect capital. These embedded design choices (not prompts) ultimately determine whether a platform is trusted.

AI compresses timelines, but responsibility remains fixed. Once this distinction is recognized, the broader implication becomes clear: AI does not eliminate the need for software. It raises the bar for it.

Software Remains the System of Record

The idea that AI replaces SaaS also misunderstands where SaaS enterprise value lives. Enterprise value in ABF doesn’t live in isolated insights. It lives in controlled systems of record and durable platforms that provide:

Governed data and the system of record

Embedded domain expertise

Repeatable processes that survive personnel turnover

A shared source of truth across counterparties, investment, risk, accounting, and investor relations

AI without software discipline creates speed without stability. With it, AI becomes force-multiplying. The question, then, is what separates platforms that successfully integrate AI from those that don’t.

The Real Differentiator: Deployment Intelligence at Scale

What separates enduring platforms from feature-rich tools is not model sophistication but deployment intelligence: the ability to integrate AI into live production environments without weakening controls. That requires:

Controlled data pipelines designed for real-world imperfections

Configuration layers that adapt to fund-specific structures without breaking controls

AI outputs that are transparent and auditable

Implementation treated as a repeatable product, not bespoke services

This is where defensibility emerges. Deployment intelligence compounds with each client rollout. Each successful implementation strengthens the next, deepening institutional trust and operational resilience. AI amplifies this flywheel but cannot replace it.

The Mispriced Risk of “AI-Only” Narratives

In private credit, trust is earned slowly and lost quickly. It is built through consistent valuations, defensible reporting, and reliability during market dislocations.

A system that produces faster answers but weaker confidence does not displace incumbents. It increases operational and reputational risk. Investors should be wary of platforms that promise instant replacement without acknowledging the institutional reality of fiduciary-grade infrastructure.

The Investment Takeaway

AI is not commoditizing ABF software solutions. It is widening the moat for platforms that integrate AI responsibly into governed systems.

The next phase of growth for category leaders such as RiskSpan will be driven by combining deep domain knowledge with AI-native architecture. Leaders will treat the last mile (data integration, workflow configuration, and control design) as a core product capability, not an implementation afterthought.

In markets where trillions in capital allocation depend on data integrity and institutional trust, the last mile isn’t an implementation detail.

It’s the moat.

What a Year of Building AI in Structured Finance Actually Taught Us

The lessons nobody puts in the demo.

In 2025, our team built production AI systems that process billions of performance records for tens of millions of mortgages, develop cash flow models for complex private ABF structures directly from documents, and connect large language models directly to bond analytics APIs.

We built dashboards, connectors, and credit analytics. Some of them worked. Some of them taught us more by failing.

This is what we learned—not the polished conference talk version, but the notes we’d share with a peer team starting the same journey.

The Value Shift Nobody Prepares You For

A portfolio delinquency analysis that used to take three hours now takes twenty minutes.

That sounds like a win. It is a win. But it also raises a question that’s harder to answer than any technical problem we solved this year:

If AI handles in minutes what took us hours, what are we contributing?

When we started pulling this thread, we realized that a significant portion of what felt like skilled analytical work was actually mechanical labor—data extraction, formatting, applying the same methodology we’d applied dozens of times before. The expertise was real, but it was wrapped in hours of execution that masked how much of the work was routine.

Here’s where we landed:

AI handles the “how.” Humans own the “why” and “so what.”

The value now lives in knowing which questions matter. Understanding what the client really needs versus what they say they need. Recognizing when output is wrong because we understand the domain deeply enough to see the error.

That’s an entirely different skill set. It requires judgment, contextual awareness, and domain intuition that deepens over years—the kind of expertise AI can’t simply replicate, unlike procedural analytical work.

Not everyone will make this transition comfortably. The analysts who built their identity around being fast and thorough at execution face a harder adjustment than those who always saw execution as a means to an end.

We don’t have this all figured out yet. But we’ve stopped pretending the shift isn’t happening.

Stop Asking AI to Write Code—Start Asking It to Think With You

For years, we used Claude as a coding assistant. “Write a function that does X.” “Convert this data from format A to format B.” “Generate a script that calculates Y.”

That works. But it captures maybe 20% of the value.

The shift that changed our results was treating Claude not as a tool to instruct, but as an analyst to think alongside.

The difference looks like this:

Before (instruction mode):

“Write a Python script to calculate delinquency rates from this loan data.”

After (thinking partner mode):

“We need to identify hidden credit risk in this CLO portfolio—issuers that resemble recent defaults but haven’t shown price distress yet. What factors should we consider? What data would we need? Let’s build a scoring model together.”

That second conversation led to identifying hidden exposure across issuers. Claude suggested factors we hadn’t considered—CLO concentration patterns, industry clustering effects, the relationship between coupon levels and distress signals. We debated the weighting. We refined the methodology. The output was genuinely collaborative.

The code that emerged from the second approach was better, but that’s almost beside the point. The thinking was better. The model was better. The insight was better.

This requires a different posture than most of us learned. You have to think out loud. Admit what you don’t know. Explain your reasoning and invite critique. Treat the AI as a colleague who happens to have read every document and doesn’t get tired—not as a sophisticated autocomplete.

The developers and analysts on our team who made this shift produce substantively different work than those who are still in instruction mode. And the gap is widening.

The First Version Will Be Wrong—Plan for It

We built a benchmark analysis comparing a client’s NonQM loan portfolio against the broader market. The analysis looked solid: the portfolio showed a 1.37% delinquency rate advantage versus the universe. Strong results. Ready to present.

Then someone asked about DSCR loans.

In NonQM lending, DSCR (debt service coverage ratio) loans are a category unto themselves, with measurably better performance than other NonQM products. When we segmented the data, we discovered that DSCR loans made up 43% of the universe but only 30% of the client’s portfolio.

This changed everything.

The client’s portfolio had less exposure to DSCR loans (the better-performing segment) yet still outperformed the benchmark. That alone was impressive, but our initial analysis understated the true picture. Once we compared performance within segments (DSCR vs. DSCR, non-DSCR vs. non-DSCR), the client’s edge was even larger than we’d initially observed.

If we had presented the first version, we would have undersold our client’s own performance. The insight that mattered most—superior underwriting across both loan categories—would have been invisible.

Lesson: “Wrong” doesn’t mean broken. It means the output doesn’t fully reflect reality. Have a domain expert review the work before drawing conclusions.
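The mix effect behind this lesson can be sketched in a few lines. Only the DSCR shares (43% for the universe, 30% for the client) come from the analysis above; the delinquency rates are hypothetical placeholders chosen to show the mechanics, and the function name is ours.

```python
# Hypothetical delinquency rates (percent) by segment. Only the DSCR mix
# shares (43% universe, 30% client) come from the analysis described above.
universe = {"dscr": {"share": 0.43, "dq": 2.0}, "non_dscr": {"share": 0.57, "dq": 6.0}}
client = {"dscr": {"share": 0.30, "dq": 1.0}, "non_dscr": {"share": 0.70, "dq": 4.0}}


def blended_rate(book, shares=None):
    """Share-weighted delinquency rate; optionally reweight to another mix."""
    shares = shares or {seg: v["share"] for seg, v in book.items()}
    return sum(shares[seg] * book[seg]["dq"] for seg in book)


universe_mix = {seg: v["share"] for seg, v in universe.items()}

# Headline comparison: each book weighted by its own product mix.
blended_gap = blended_rate(universe) - blended_rate(client)

# Like-for-like comparison: client rates reweighted to the universe mix.
mix_adjusted_gap = blended_rate(universe) - blended_rate(client, universe_mix)

# Within-segment gaps (DSCR vs. DSCR, non-DSCR vs. non-DSCR).
segment_gaps = {seg: universe[seg]["dq"] - client[seg]["dq"] for seg in universe}

print(f"blended gap:      {blended_gap:.2f} pts")
print(f"mix-adjusted gap: {mix_adjusted_gap:.2f} pts")
print(f"segment gaps:     {segment_gaps}")
```

With these placeholder rates the mix-adjusted gap is wider than the headline gap, which is the same direction of correction we found once the data was segmented.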

Deploying AI Agents for End Users Is a Security Project

Building an AI agent that works in a demo is straightforward. Deploying that agent in a production UI where real users interact with real data took us months.

We built an agent that lets users query our bond analytics platform conversationally. The AI worked. Making it production-ready required solving several distinct problems:

Prompt injection: When users can type anything into a text box processed by an LLM, you inherit a new attack surface. We implemented input validation, output filtering, tightly scoped permissions, and logging that captures every agent action for audit.

Rate limiting: A single conversational turn might trigger 50 API calls. We built tiered limits—per-user, per-session, per-token—plus circuit breakers for runaway queries.

Session management: Agent sessions need conversational context across multiple turns, isolated per user, with graceful expiration handling and automatic cleanup.

Audit trails: Regulated industries need to know what the AI did. Every query, tool invocation, and response needs to be logged immutably.
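To give a feel for the rate-limiting piece, here is a minimal sketch of tiered limits plus a per-turn circuit breaker. It is not our production code: the class and function names are hypothetical, the limits are in-memory only, and the per-token tier mentioned above is left out for brevity.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` per `window_seconds` for each key."""

    def __init__(self, max_calls, window_seconds, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._calls = defaultdict(deque)

    def has_capacity(self, key):
        now = self.clock()
        calls = self._calls[key]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window_seconds:
            calls.popleft()
        return len(calls) < self.max_calls

    def record(self, key):
        self._calls[key].append(self.clock())


def allow_call(user_id, session_id, per_user, per_session):
    """Admit a call only if every tier has room, then count it in each tier."""
    if per_user.has_capacity(user_id) and per_session.has_capacity(session_id):
        per_user.record(user_id)
        per_session.record(session_id)
        return True
    return False


class TurnBudget:
    """Circuit breaker: cap the tool calls one conversational turn may make."""

    def __init__(self, max_tool_calls):
        self.max_tool_calls = max_tool_calls
        self.used = 0

    def spend(self):
        if self.used >= self.max_tool_calls:
            raise RuntimeError("tool-call budget exhausted for this turn")
        self.used += 1
```

Checking every tier before recording anything means a request rejected by one tier does not consume quota in another. A real deployment would also need shared state across servers, which this sketch does not attempt.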

The agent itself was 20% of the effort. Authentication, authorization, input validation, rate limiting, session management, and security review were the other 80%.

Lesson: In production, the agent is the easy part. The security wrapper is the product.

Postscript: AgentCore from AWS and Agent Framework from Microsoft aim to address these deployment and security headaches.

AI Is Good at Finding Information But Sometimes Overstates What It Means

While building the credit risk analysis, we asked Claude to research distressed issuers—companies that had defaulted or were showing signs of stress. We wanted to understand patterns we could use to identify similar risks in the portfolio.

Claude surfaced real-time signals we wouldn’t have found efficiently on our own: FTC antitrust actions, rating agency downgrades, refinancing walls, fraud allegations. Information that wouldn’t appear in pricing data for months was available in news coverage and regulatory filings. The research phase that would have taken days was completed in hours.

But we also caught Claude drawing confident conclusions from weak sources. In one case, it attributed claims to “industry reports” that didn’t exist when we followed the links. The search results were real. The sources were ‘real’. But the synthesis drew conclusions the sources didn’t support.

The lesson: use AI-powered search aggressively. It’s the difference between stale knowledge and current intelligence, especially in fast-moving situations. But verify specific claims. Click the links. Read the actual sources.

AI is excellent at finding relevant information across large volumes of text. It is sometimes too confident about what that information means when synthesized. The combination of broad retrieval and skeptical verification is more powerful than either alone.

Your Org Chart Isn’t Ready for This

Our AI strategy deck included projections: reduction in onboarding costs, increased client capacity and margin expansion.

The numbers were defensible. The business case was clear.

What the projections didn’t address: the organizational implications of realizing the promised efficiencies.

If analysts can serve five times more clients, do you need fewer analysts—or do you pursue five times more clients? If the answer is “more clients,” do you have the sales capacity? The support infrastructure? The management bandwidth?

If developers now own adoption metrics for the features they build, then what happens to the product managers who previously owned that? Are product managers freed up for more strategic work, or are they defending territory?

If AI drafts client communications, who reviews them? What error rate are we willing to accept? Who’s accountable when the AI gets something wrong?

These aren’t hypothetical questions. We’re navigating them now, and the answers aren’t obvious.

AI doesn’t just improve workflows. It reshapes roles. And most organizations—including ours—are making it up as they go.

The companies that figure out the organizational design will outperform those that simply purchase better software. The differentiation in 2026 won’t come from adopting AI. It will come from redesigning teams, incentives, and accountability structures around what AI makes newly possible.

What We’re Taking Into Next Year

A year of building AI systems in structured finance clarified a few things:

AI is more powerful than the hype suggests—once you integrate it into real workflows rather than treating it as a research toy.

AI is more frustrating than the demos show—the gap between “works in claude.ai” and “works in production” is where most of the time goes.

AI is more dependent on domain expertise than the automation narrative implies—it generates analyses quickly, but distinguishing plausible from accurate requires human judgement that compounds over years. The “why” and “so what” remain stubbornly human problems.

AI changes more than technology—it changes job descriptions, team structures, and how people understand their own value. The skill isn’t operating the tool; it’s knowing when the output reflects reality.

We don’t have all the answers. We’re still learning what this means for how we build software, how we serve clients, and how we organize ourselves.

But we’re no longer wondering whether AI will change our industry. We’re focused on making sure we’re the ones defining how.

Using LLMs as judges for validating deal cash flow models: A new frontier in securitization modeling

As securitization models become increasingly complex and differentiated, validation becomes a critical challenge. We’ve experimented with an innovative approach that leverages large language models (LLMs) as impartial judges to validate models implemented across different platforms.

The Dual-Implementation Challenge

In cash flow modeling, we often maintain parallel implementations—typically in Python for flexibility and Excel for transparency. How do we ensure both versions produce consistent results?

Enter the “LLM as Judge” approach!

A Real-World Case Study: Residential Transition Loan Funding

Consider a portfolio of residential transition loans with a funding structure including:

100 loans averaging $275,000 each
12-month average terms at 8.75%
A 75% advance rate
2% loss reserve build-up
Performance triggers based on delinquency rates
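For orientation, the headline figures implied by these terms can be worked out directly. The sketch below uses only the inputs listed above; the simple monthly interest calculation is our simplifying assumption and ignores defaults, payoffs, and the reserve and trigger mechanics.

```python
# Inputs taken from the case study terms above.
loan_count = 100
avg_balance = 275_000
annual_rate = 0.0875
advance_rate = 0.75

pool_balance = loan_count * avg_balance        # total collateral balance
advance_amount = pool_balance * advance_rate   # amount funded against the pool

# Assumption: simple monthly interest on the full pool balance.
monthly_interest = pool_balance * annual_rate / 12

print(f"Pool balance:     ${pool_balance:,.0f}")
print(f"Advance amount:   ${advance_amount:,.0f}")
print(f"Monthly interest: ${monthly_interest:,.2f}")
```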

We implemented this structure in both Python and Excel, then submitted both models to an LLM for validation.

The LLM Validation Process

The LLM first analyzed the conceptual alignment between models, confirming both followed the same fundamental approach to cash flow projection, default assumptions, reserve mechanics, and triggers.

Next came a rigorous numerical comparison. The LLM detected a $100,000 investor distribution discrepancy in Month 2:

Python model: $1,790,702
Excel model: $1,690,702

Through logical analysis, the LLM determined this likely stemmed from differently evaluated trigger conditions. This kind of subtle implementation difference could easily go unnoticed in manual validation, potentially leading to significant valuation discrepancies over time.
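A deterministic pre-check can complement the LLM review by flagging exactly where two implementations diverge. The sketch below is an illustration rather than the process we ran: the function name and tolerance are hypothetical, the Month 2 values are the ones reported above, and the other months are made-up placeholders.

```python
def find_discrepancies(python_cf, excel_cf, tolerance=1.0):
    """Return (month, python_value, excel_value, difference) where models disagree."""
    if len(python_cf) != len(excel_cf):
        raise ValueError("models cover different numbers of months")
    breaks = []
    for month, (py_val, xl_val) in enumerate(zip(python_cf, excel_cf), start=1):
        diff = py_val - xl_val
        if abs(diff) > tolerance:
            breaks.append((month, py_val, xl_val, diff))
    return breaks


# Month 2 uses the figures reported above; months 1 and 3 are placeholders.
python_distributions = [1_650_000, 1_790_702, 1_810_000]
excel_distributions = [1_650_000, 1_690_702, 1_810_000]

for month, py_val, xl_val, diff in find_discrepancies(python_distributions, excel_distributions):
    print(f"Month {month}: Python ${py_val:,} vs Excel ${xl_val:,} (diff ${diff:,})")
```

A list of breaks like this gives the LLM, or a human reviewer, a concrete starting point for explaining why the models disagree.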

Beyond Discrepancy Detection

The true power of this approach extends beyond finding differences. The LLM also provided:

Stress testing recommendations tailored to our specific product, including scenarios for rapid defaults, extension waves, and interest rate shocks
Model risk management insights highlighting documentation needs and suggesting a formal reconciliation process
Code quality assessment noting strengths and weaknesses in both implementations

Why This Matters

For securitization professionals, this approach offers several advantages:

Efficiency: Automation of tedious line-by-line comparisons
Comprehensiveness: Identification of conceptual differences, not just numerical ones
Regulatory compliance: Better documentation for model risk management requirements
Objectivity: Unbiased third-party perspective

How Are Ginnie’s New RG Pools Performing?

In February of this year, the Ginnie Mae II program began guaranteeing securities backed by pools of mortgages previously bought out of Ginnie Mae securities because of delinquency. To qualify for these new re-performing pools (known as “RG pools”), a loan must meet two (related) conditions:

Borrower has made at least six months of timely payments prior to pool issuance.
Pool issue date is at least 210 days from when the mortgage was last delinquent.
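As a rough illustration, the two conditions can be expressed as a simple screen. The function and argument names are hypothetical, and the sketch encodes only the two conditions as stated here, not the full set of program requirements.

```python
from datetime import date


def is_rg_eligible(timely_payments, last_delinquent, pool_issue):
    """Apply the two conditions listed above to a single loan."""
    enough_payments = timely_payments >= 6
    enough_seasoning = (pool_issue - last_delinquent).days >= 210
    return enough_payments and enough_seasoning


# Hypothetical loan: last delinquent on July 1, 2020, pooled on February 1, 2021.
print(is_rg_eligible(6, date(2020, 7, 1), date(2021, 2, 1)))
```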

The novelty of RG pools raises questions about their composition and performance relative to other Ginnie Mae pools. While it remains too early to make many conclusive statements, a preliminary look at the prepayment data indicates speeds somewhere between those of similar vintage Ginnie Mae multi and custom pools, with typical variability from servicer to servicer.

In this post, we discuss the prepayment behaviors we have observed over the first seven months of RG pool securitization, issuance patterns, and collateral characteristics.

Prepayments

The latest September prepayment prints show that RG pools’ speeds generally fell between those of similar coupon/vintage multi and custom pools. The chart below shows that 2015/2016 3.5% RG pools prepaid at around 37-38 CPR in September, a couple of CPR slower than similarly aged multi pools and almost 10 CPR faster than custom pools.

Prepayments for G2 3.5% RG, Custom and Multi Pools by Vintage, September Factor Month
Note: Loan level data

Below, we plot S-curves for 49 to 72 WALA RG loans against S-curves for similarly aged multi and other custom loans from the April to September factor months. RG loans with 25 to 100 bp of rate incentive have prepaid in the mid-30s CPR (green line in the figure below). During the same period, similar multi pools have prepaid 5 to 8 CPR faster than RG pools (blue line), while similar custom pools have prepaid around 5 CPR slower (black line). We also overlaid an S-curve for 7 to 18 WALA G2 multi pools as a comparison (orange line).

S-curves for RG, Custom and Multi Pools (49 to 72 WALA) April to September Factor Months
Note: Loan level data, orange line is the s-curve for 7-18 wala G2 multi pools with a one-year lookback period
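For readers less familiar with how an S-curve is assembled from loan-level data, the sketch below buckets loans by rate incentive and annualizes each bucket's monthly prepayment rate using the standard conversion CPR = 1 - (1 - SMM)^12. The loan records are hypothetical, the function names are ours, and scheduled principal is ignored for simplicity, so this is an illustration rather than the Edge methodology.

```python
from collections import defaultdict


def smm_to_cpr(smm):
    """Annualize a single monthly mortality rate into a CPR."""
    return 1.0 - (1.0 - smm) ** 12


def s_curve(loans, bucket_bp=25):
    """Balance-weighted CPR by rate-incentive bucket.

    Each loan is (incentive_bp, beginning_balance, prepaid_principal).
    Simplification: scheduled principal is ignored when computing SMM.
    """
    balance = defaultdict(float)
    prepaid = defaultdict(float)
    for incentive_bp, begin_bal, prepaid_amt in loans:
        bucket = (incentive_bp // bucket_bp) * bucket_bp  # floor to bucket start
        balance[bucket] += begin_bal
        prepaid[bucket] += prepaid_amt
    return {b: smm_to_cpr(prepaid[b] / balance[b]) for b in sorted(balance)}


# Hypothetical loan-month records, for illustration only.
loans = [(30, 200_000, 7_000), (45, 300_000, 10_500), (80, 250_000, 12_500)]
for bucket, cpr in s_curve(loans).items():
    print(f"{bucket:>4} bp incentive: {100 * cpr:.1f} CPR")
```

In this toy data a 3.5% monthly prepayment rate annualizes to a CPR in the mid-30s, the same order of magnitude as the RG speeds discussed above.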

Not surprisingly, prepayment behavior differs by servicer. Wells-serviced RG pools that are seasoned 49 to 72 months with 25 to 100 bp of rate incentive appear to be prepaying in the low 30s CPR (black line in the figure below). Similar loans from PennyMac are prepaying 5 to 10 CPR faster, which tends to be the case for non-RG loans as well.

S-curves for RG loans by servicer, 49 to 72 WALA, April to September Factor Months
Note: Loan level data

While the re-performing loans being securitized into RG pools are already seasoned, prepayments have been increasing as the pools season. For example, one-month-old RG 3.5% pools have prepaid at 27 CPR, while 6- and 7-month-old 3.5% pools have prepaid at 45-50 CPR (black line below). In addition, overall prepayment speeds for same-pool-age 3.0%, 3.5%, and 4.0% pools have been on top of each other.

Prepayments for RG 3.0%, 3.5% and 4.0% Pools by Pool Age, March to September 2021
Note: Only showing data points for cohorts with more than 50 loans

Issuance Volume

Following a brief ramp-up period in February and March, issuance of RG pools has averaged around $2 billion (and roughly 300 pools) per month for the past five months (see Issuance chart below). The outstanding UPB of these pools stands at nearly $11 billion as of the September factor month.

Note: RiskSpan uses reporting month as a factor month. For this chart, we adjust our factor date by one month to match the collection period.

RG pools already account for a sizable share of Ginnie II custom issuance, as illustrated in the following chart, making up 18% of G2 custom issuance and 3% of all G2 issuance since April.

Note: RiskSpan uses reporting month as a factor month. For this chart, we adjust our factor date by one month to match the collection period.

RG Pool Characteristics

Nearly all RG pool issuance has been in 3.0% to 4.5% coupons, with a plurality at 3.5%. As of the September factor month, almost $4 billion (37%) of the outstanding RG pools are in 3.5% coupons. The 4% coupon accounted for the next-largest share, $2.5 billion (23%), followed by $2.3 billion in 3.0% (20.9%) and $1.3 billion in 4.5% (11.8%).

RG Pool Outstanding Amount by Coupon — September Factor Month

The following table compares the characteristics of RG pools issued since February with those of G2 single-family custom and multi pools issued during the same period. The table highlights some interesting differences:

Issuance of RG pools seems to be concentrated in higher coupons (3% to 4%) compared to issuances for G2 custom pools (concentrated on 2.5% and 3.0%) and G2 multi-lender pools (concentrated on 2.0% and 2.5%).
Loan sizes in RG pools tend to fall between those of G2 customs and G2 multis: larger than customs and smaller than multis. For example, the WAOLS for 3.5% RG pools is around 245k, roughly 50k smaller than for multi pools and 30k larger than for other custom pools.

RG pools consist almost exclusively of FHA loans while G2 multis have a much higher share of VA loans. Almost 98% of 3.5% RG loans are FHA loans.

G2 RG vs. G2 Custom and G2 Multi (pools issued since February), Stat as of September Factor Month

Wells Fargo and PennyMac are far and away the leaders in RG issuance, accounting collectively for 62% of outstanding RG pools.

RG Pools by Servicer, September Factor Month

How to Run RG Pools in Edge Perspective

Subscribers to Edge Perspective can run these comparisons (and countless others) themselves using the “GN RG” pool type filter. The “Custom/Multi-lender” filter can likewise be applied to separate those pools in G2SF.

Contact Us

Contact us if you are interested in seeing variations on this theme. Using Edge, we can examine any loan characteristic and generate an S-curve, aging curve, or time series.

Managing Market Risk for Crypto Currencies

Overview

Asset Volatility vs Asset Sensitivity to Benchmark (Beta)

Portfolio Asset Covariance

Value at Risk (VaR)

Bitcoin Futures: Basis and Proxies

Intraday Value at Risk (VaR)

Risk-Based Limits

VaR Validation (Bayesian Approach)

Scenario Analysis

Conclusion

Overview

Crypto currencies have now become part of institutional investment strategies. According to CoinShares, assets held under management by crypto managers reached $57B at the end of Q1 2021.

Like any other financial asset, crypto investments are subject to market risk monitoring, with several approaches evolving. Crypto currencies exhibit no obvious correlation to other asset classes, risk factors or economic variables. However, crypto currencies have exhibited high price volatility and have enough historical data to implement a robust market risk process.

In this paper we discuss approaches to implementing market risk analytics for a portfolio of crypto assets. We will look at betas to benchmarks, correlations, Value at Risk (VaR) and historical event scenarios.

Value at Risk allows risk managers to implement risk-based limits structures, instead of relying on traditional notional measures. The methodology we propose enables consolidation of risk for crypto assets with the rest of the portfolio. We will also discuss the use of granular time horizons for intraday limit monitoring.

Asset Volatility vs Asset Sensitivity to Benchmark (Beta)

For exchange-traded instruments, beta measures the sensitivity of asset price returns relative to a benchmark. For US-listed large cap stocks, beta is generally computed relative to the S&P 500 index. For crypto currencies, several eligible benchmark indices have emerged that represent the performance of the overall crypto currency market.

We analyzed several currencies against S&P’s Bitcoin Index (SPBTC). SPBTC is designed to track the performance of the original crypto asset, Bitcoin. As market capitalization for other currencies grows, it would be more appropriate to switch to a dynamic multi-currency index such as Nasdaq’s NCI. At the time of this paper, Bitcoin constituted 62.4% of NCI.

Traditionally, beta is calculated over a variable time frame using least squares fit on a linear regression of benchmark return and asset return. One of the issues with calculating betas is the variability of the beta itself. In order to overcome that, especially given the volatility of crypto currencies, we recommend using a rolling beta.

Due to the varying levels of volatility and liquidity of various crypto currencies, a regression model may not always be a good fit. In addition to tracking fit through R-squared, it is important to track confidence intervals for the computed betas.
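A rolling beta with a fit measure and an interval around each estimate can be sketched with ordinary least squares on trailing windows. This is a minimal illustration, not the paper's implementation: the function names are ours, the window length is left to the caller (at least three observations with non-constant benchmark returns), and the interval uses a normal approximation (z = 1.96) where a t quantile would be more appropriate for short windows.

```python
import math


def ols_beta(bench, asset):
    """Slope, standard error, and R-squared of asset returns on benchmark returns."""
    n = len(bench)
    mean_x = sum(bench) / n
    mean_y = sum(asset) / n
    sxx = sum((x - mean_x) ** 2 for x in bench)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bench, asset))
    syy = sum((y - mean_y) ** 2 for y in asset)
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    # Sum of squared residuals around the fitted line.
    sse = sum((y - alpha - beta * x) ** 2 for x, y in zip(bench, asset))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    r_squared = 1.0 - sse / syy if syy > 0 else float("nan")
    return beta, std_err, r_squared


def rolling_beta(bench, asset, window, z=1.96):
    """Beta with an approximate 95% interval for each trailing window."""
    rows = []
    for end in range(window, len(bench) + 1):
        beta, se, r2 = ols_beta(bench[end - window:end], asset[end - window:end])
        rows.append({"beta": beta, "low": beta - z * se, "high": beta + z * se, "r2": r2})
    return rows
```

Tracking the width of the interval alongside R-squared makes it easier to see when a beta estimate has become unreliable, for example during stress periods.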

Figure 1 History of Beta to S&P Bitcoin Index with Confidence Intervals

The chart above shows rolling betas and confidence intervals for four crypto currencies between January 2019 and July 2021. Beta and confidence interval both vary over time and periods of high volatility (stress) cause a larger dislocation in the value of beta.

Rolling betas can be used to generate a hierarchical distribution of expected asset values.

Portfolio Asset Covariance

Beta is a useful measure to track an asset’s volatility relative to a single benchmark. In order to numerically analyze the risk exposure (variance) of a portfolio with multiple crypto assets, we need to compute a covariance matrix. Portfolio risk is a function not only of each asset’s volatility but also of the cross-correlation among them.

Figure 2 Correlations for 11 currencies (calculated using observations from 2021)

The table above shows a correlation matrix across 11 crypto assets, including Bitcoin.

Like betas, correlations among assets change over time. But correlation matrices are more unwieldy to track over time than betas are. For this reason, hierarchical models provide a good, practical framework for time-varying covariance matrices.
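The portfolio variance calculation described in this section is the quadratic form w'Σw, where w holds the portfolio weights and Σ is the covariance matrix. The sketch below builds Σ from volatilities and a correlation matrix; the two-asset inputs are hypothetical and the function names are ours.

```python
import math


def covariance_matrix(vols, corr):
    """Build covariances from volatilities and a correlation matrix."""
    n = len(vols)
    return [[corr[i][j] * vols[i] * vols[j] for j in range(n)] for i in range(n)]


def portfolio_variance(weights, cov):
    """Quadratic form w' * Sigma * w."""
    n = len(weights)
    return sum(weights[i] * cov[i][j] * weights[j] for i in range(n) for j in range(n))


# Hypothetical two-asset example: annualized vols of 80% and 100%, correlation 0.5.
vols = [0.80, 1.00]
corr = [[1.0, 0.5], [0.5, 1.0]]
weights = [0.5, 0.5]

variance = portfolio_variance(weights, covariance_matrix(vols, corr))
print(f"Portfolio volatility: {math.sqrt(variance):.1%}")
```

Because the correlation is below 1, the portfolio volatility comes out lower than the weighted average of the two asset volatilities.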

Value at Risk (VaR)

The VaR for a position or portfolio can be defined as some threshold Τ (in dollars) where the existing position, when faced with market conditions resembling some given historical period, will have P/L greater than Τ with probability k. Typically, k is chosen to be 99% or 95%.

To compute this threshold Τ, we need to:

Set a significance percentile k, a market observation period, and holding period n.
Generate a set of future market conditions (scenarios) from today to period n.
Compute a P/L on the position for each scenario.

After computing each position’s P/L, we sum the P/L for each scenario and then rank the scenarios’ P/Ls to find the kth percentile (worst) loss. This loss defines our VaR Τ at the kth percentile for holding period n.
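The ranking procedure just described can be sketched as follows. This is an illustration under stated assumptions: P/L is linear in returns (market value times return), the percentile is read off by counting scenarios from the worst, and the function names, positions, and scenario returns are all hypothetical.

```python
def scenario_pnl(positions, scenarios):
    """Revalue every position under each scenario and sum the P/L.

    positions: {asset: market value in dollars}
    scenarios: list of {asset: n-period return} drawn from the lookback window
    """
    return [sum(mv * scen[asset] for asset, mv in positions.items()) for scen in scenarios]


def historical_var(pnl, confidence=0.99):
    """Rank scenario P/Ls and read off the loss at the chosen percentile."""
    ranked = sorted(pnl)  # most negative (worst) first
    n_tail = int(len(ranked) * (1.0 - confidence) + 1e-9)  # scenarios worse than VaR
    return -ranked[n_tail]  # reported as a positive loss


# Hypothetical positions and a toy set of 100 one-day return scenarios.
positions = {"BTC": 1_000_000, "ETH": 500_000}
scenarios = [{"BTC": (i - 50) / 1000, "ETH": (i - 50) / 500} for i in range(100)]

var_95 = historical_var(scenario_pnl(positions, scenarios), confidence=0.95)
print(f"95% 1-day VaR: ${var_95:,.0f}")
```

Summing P/L within each scenario before ranking is what lets the result reflect whatever co-movement among assets was actually observed on that date.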

Determining what significance percentile k and holding period n to use is straightforward and often dictated by regulatory rules. For example, 99th percentile 10-day VaR is used for risk-based capital under the Market Risk Rule. Generating the scenarios and computing P/L under these scenarios is open to interpretation. We cover each of these, along with the advantages and drawbacks of each, in the next two sections.

To compute VaR, we first need to generate projective scenarios of market conditions. Broadly speaking, there are two ways to derive this set of scenarios:

Project future market conditions using historical (actual) changes in market conditions
Project future market conditions using a Monte Carlo simulation framework

In this paper, we consider a historical simulation approach.

RiskSpan projects future market conditions using actual (observed) n-period changes in market conditions over the lookback period. For example, if we are computing 1-day VaR for regulatory capital usage under the Market Risk Rule, RiskSpan takes actual daily changes in risk factors. This approach allows our VaR scenarios to account for natural changes in correlation under extreme market moves. RiskSpan finds this to be a more natural way of capturing changing correlations without the arbitrary overlay of how to change correlations in extreme market moves. This, in turn, will more accurately capture VaR. Please note that newer crypto currencies may not have enough data to generate a meaningful set of historical scenarios. In these cases, a benchmark adjusted by a short-term beta may be used as an alternative.

One key consideration for the historical simulation approach is the selection of the observation window or lookback period. Most regulatory guidelines require at least a one-year window. However, practitioners also recommend a shorter lookback period for highly volatile assets. In the chart below we illustrate how VaR for our portfolio of crypto currencies changes for a range of lookback periods and confidence intervals. Please note that VaR is expressed as a percentage of portfolio market value.

An exponentially weighted moving average (EWMA) methodology can be used to overcome the challenges associated with using a shorter lookback period. This approach emphasizes recent observations by using exponentially weighted moving averages of squared deviations. In contrast to equally weighted approaches, these approaches attach different weights to the past observations contained in the observation period. Because the weights decline exponentially, the most recent observations receive much more weight than earlier observations.
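The exponential weighting can be written as the recursion variance_t = λ · variance_(t-1) + (1 - λ) · r_t². The sketch below assumes zero-mean returns, seeds the recursion with the first squared return, and uses a decay factor of 0.94, a common convention for daily data; all three are our assumptions rather than choices stated in this paper.

```python
import math


def ewma_volatility(returns, decay=0.94):
    """EWMA volatility from a chronological list of returns (oldest first)."""
    variance = returns[0] ** 2  # seed with the first squared return
    for r in returns[1:]:
        # Older information decays by `decay`; the newest return gets (1 - decay).
        variance = decay * variance + (1.0 - decay) * r ** 2
    return math.sqrt(variance)


# Hypothetical daily returns: 99 calm days and one 10% move, placed last or first.
shock_last = [0.01] * 99 + [0.10]
shock_first = [0.10] + [0.01] * 99

print(f"shock on the latest day:   {ewma_volatility(shock_last):.4f}")
print(f"shock 99 days in the past: {ewma_volatility(shock_first):.4f}")
```

An equally weighted estimate would return the same volatility for both series. The EWMA estimate is noticeably higher when the shock is recent.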

Figure 3 Daily VaR as % of Market Value calculated using various historical observation periods

VaR as a single number does not represent the distribution of P/L outcomes. In addition to computing VaR under various confidence intervals, we also compute expected shortfall, worst loss, and standard deviation of simulated P/L vectors. Worst loss and standard deviation are self-explanatory while the calculation of expected shortfall is described below.

Expected shortfall is the average of all the P/L figures to the left of the VaR figure. If we have 1,000 simulated P/L vectors ranked from best to worst, and the 95% VaR is the 950th observation, the expected shortfall is the average of the P/Ls ranked 951 to 1,000.
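The rank-based calculation in the 1,000-scenario example can be written out directly. This is a sketch under the assumption that P/Ls are ranked from best to worst and that VaR is read off at a whole-number rank. The function name is illustrative.

```python
import numpy as np

def var_and_expected_shortfall(pnl, confidence=0.95):
    """Rank-based VaR and expected shortfall, both reported as positive losses."""
    ranked = np.sort(np.asarray(pnl, dtype=float))[::-1]  # best to worst
    var_rank = int(round(confidence * len(ranked)))       # e.g. 950 of 1,000
    if not 0 < var_rank < len(ranked):
        raise ValueError("not enough scenarios for this confidence level")
    var = -ranked[var_rank - 1]           # the 950th ranked observation
    tail = ranked[var_rank:]              # observations 951 through 1,000
    expected_shortfall = -tail.mean()
    return var, expected_shortfall
```

Other quantile conventions (for example, interpolating between neighboring ranks) give slightly different VaR figures on small samples.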

The table below presents VaR-related metrics as a percentage of portfolio market value under various lookback periods.

Figure 4 VaR for a portfolio of crypto assets computed for various lookback periods and confidence intervals

Bitcoin Futures: Basis and Proxies

One of the most popular trades for commodity futures is the basis trade. This is when traders build a strategy around the difference between the spot price and futures contract price of a commodity. The trade exists in corn, soybeans, oil and, of course, Bitcoin. For the purpose of calculating VaR, specific contracts may not provide enough history, so risk systems use continuous contracts instead. Continuous contracts introduce additional basis as seen in the chart below. Risk managers need to work with the front office to align risk factor selection with trading strategies, without compromising independence of the risk process.

Figure 5 BTC/Futures basis difference between generic and active contracts

Intraday Value

The highly volatile nature of crypto currencies requires another consideration for VaR calculations. A typical risk process is run at the end of the day and VaR is calculated for a one-day or longer forecasting period. But crypto currencies, especially Bitcoin, can also show significant intraday price movements.

We obtained intraday prices for Bitcoin (BTC) from Gemini, which is ranked third by volume. This data was normalized to create time series to generate historical simulations. The chart below shows VaR as a percentage of market value for Bitcoin (BTC) for one-minute, one-hour and one-day forecasting periods. Our analysis shows that a Bitcoin position can lose as much as 3.5% of its value in one hour (99th percentile VaR).
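A minimal sketch of computing VaR for several forecasting periods from one intraday series follows. It assumes evenly spaced one-minute prices and overlapping n-period changes. The function name and the horizon labels are hypothetical, and the exact normalization applied to the Gemini data is not described in this article.

```python
import numpy as np

def var_pct_by_horizon(prices, horizons, confidence=0.99):
    """VaR as a percentage of market value for each forecasting horizon.

    prices:   1-D array of evenly spaced one-minute prices, oldest first
    horizons: mapping of label -> horizon length in minutes
    """
    prices = np.asarray(prices, dtype=float)
    result = {}
    for label, n in horizons.items():
        if n >= len(prices):
            raise ValueError(f"not enough history for horizon {label!r}")
        # Overlapping n-minute relative changes across the whole sample
        changes = prices[n:] / prices[:-n] - 1.0
        result[label] = -np.quantile(changes, 1.0 - confidence) * 100.0
    return result
```

Because crypto markets trade around the clock, a one-day horizon on minute data would be 1,440 periods, for example `{"1-minute": 1, "1-hour": 60, "1-day": 1440}`.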

Risk-Based Limits

Right from the inception of Value at Risk as a concept it has been used by companies to manage limits for a trading unit. VaR serves as a single risk-based limit metric with several advantages and a few challenges:

Pros of using VaR for risk-based limit:

VaR can be applied across all levels of portfolio aggregation.
Aggregations can be applied across varying exposures and strategies.
Today’s cloud scale makes it easy to calculate VaR using granular risk factor data.

Cons of using VaR for risk-based limit:

VaR can be subject to model risk and manipulation. Transparency and use of market risk factors can avoid this pitfall.

The ability to calculate intraday VaR is key for a risk-based limit implementation for crypto assets. Risk managers should consider at least an hourly VaR limit in addition to the traditional daily limits.

VaR Validation (Bayesian Approach)

Standard approaches for back-testing VaR are applicable to portfolios of crypto assets as well.

Given the volatile nature of this asset class, we also explored an approach to validating the confidence interval and percentiles implied from historical simulations. Although this is a topic that deserves its own document, we present a high-level explanation and results of our analysis.

Building on an approach first proposed in the Pyfolio library, we generated a posterior distribution using the PyMC3 package from our historically observed VaR simulations.

Sampling routines from PyMC3 were used to generate 10,000 simulations of the 3-year lookback case. A distribution of percentiles (VaR) was then computed across these simulations.

The distribution shows that the mean 95th percentile VaR would be 7.3% vs 8.9% calculated using the historical simulation approach. However, the tail of the distribution indicates a VaR closer to the historical simulation approach. One could conclude that the test indicates that the original calculation still represents the extreme case, which is the motivation behind VaR.

Figure 6 Distribution of percentiles generated from posterior simulations
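To make the idea of a "distribution of percentiles" concrete, the sketch below resamples a P/L history many times and computes VaR on each resample. Note the substitution: this is a plain bootstrap, not the PyMC3 posterior sampling used for the analysis above, and it is included only to illustrate how a spread of VaR estimates arises. The function name and defaults are assumptions.

```python
import numpy as np

def percentile_distribution(pnl, confidence=0.95, n_sims=10_000, seed=0):
    """Bootstrap distribution of VaR (positive loss) across resampled histories."""
    rng = np.random.default_rng(seed)
    pnl = np.asarray(pnl, dtype=float)
    # Each row is one simulated history drawn with replacement from the data
    samples = rng.choice(pnl, size=(n_sims, len(pnl)), replace=True)
    return -np.quantile(samples, 1.0 - confidence, axis=1)
```

Comparing the mean and the upper tail of the returned array against the single historical-simulation VaR mirrors the comparison made above, though the numbers will differ from a Bayesian posterior.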

Scenario Analysis

In addition to standard shock scenarios, we recommend using the volatility of Bitcoin to construct a simulation of outcomes. The chart below shows the change in Bitcoin (BTC) volatility for select events in the last two years. Outside of standard macro events, crypto assets respond to cyber security events and media effects, including social media.

Figure 7 Weekly observed volatility for Bitcoin

Conclusion

Given the volatility of crypto assets, we recommend, to the extent possible, a probability distribution approach. At the very least, risk managers should monitor changes in relationship (beta) of assets.
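One simple way to monitor the changing relationship is a rolling beta of each asset against a benchmark. The sketch below is an illustration only: the 30-observation window and the function name are assumptions, not figures taken from this article.

```python
import numpy as np

def rolling_beta(asset_returns, benchmark_returns, window=30):
    """Beta of an asset to a benchmark over each trailing window of returns."""
    asset = np.asarray(asset_returns, dtype=float)
    bench = np.asarray(benchmark_returns, dtype=float)
    if asset.shape != bench.shape:
        raise ValueError("return series must have the same length")
    betas = []
    for end in range(window, len(asset) + 1):
        # 2x2 sample covariance matrix: row/column 0 is the asset, 1 the benchmark
        cov = np.cov(asset[end - window:end], bench[end - window:end])
        betas.append(cov[0, 1] / cov[1, 1])  # cov(asset, bench) / var(bench)
    return np.array(betas)
```

A sharp move in the most recent values relative to the longer history is the kind of change a risk manager would want flagged.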

For most financial institutions, crypto assets are part of portfolios that include other traditional asset classes. A standard approach must be used across all asset classes, which may make it challenging to apply shorter lookback windows for computing VaR. Use of the exponentially weighted moving average approach (described above) may be considered.

Intraday VaR for this asset class can be significant and risk managers should set appropriate limits to manage downside risk.

Idiosyncratic risks associated with this asset class have created a need for monitoring scenarios not necessarily applicable to other asset classes. For this reason, more scenarios pertaining to cyber risk are beginning to be applied across other asset classes.

Calculating VaR: A Review of Methods

COVID-19 and the Cloud

COVID-19 creates a need for analytics in real time

Regarding the COVID-19 pandemic, Warren Buffett has observed that “we haven’t faced anything that quite resembles this problem” and the fallout is “still hard to evaluate.”

The pandemic has created an unprecedented shock to economies and asset performance. The recent unemployment data, although encouraging, has only added to the uncertainty. Furthermore, impact and recovery are uneven, often varying considerably from county to county and city to city. Consider:

COVID-19 cases and fatalities were initially concentrated in just a few cities and counties resulting in almost a total shutdown of these regions.
Certain sectors, such as travel and leisure, have been affected worse than others while other sectors such as oil and gas have additional issues. Regions with exposure to these sectors have higher unemployment rates even with fewer COVID-19 cases.
Timing of reopening and recoveries has also varied due to regional and political factors.

Regional employment, business activity, consumer spending and several other macro factors are changing in real time. This information is available through several non-traditional data sources.

Legacy models are not working, and several known correlations are broken.

Determining value and risk in this environment is requiring unprecedented quantities of analytics and on-demand computational bandwidth.

Need for on-demand computation and storage across the organization

“I don’t need a hard disk in my computer if I can get to the server faster… carrying around these non-connected computers is byzantine by comparison.” ~ Steve Jobs

Front office, risk management, quants and model risk management: every aspect of the analytics ecosystem requires the ability to run a large number of scenarios quickly.

Portfolio managers need to recalibrate asset valuation, manage hedges and answer questions from senior management, all while looking for opportunities to find cheap assets. Risk managers are working closely with quants and portfolio managers to better understand the impact of this unprecedented environment on assets. Quants must not only support existing risk and valuation processes but also be able to run new estimations and explain model behavior as data streams in from a variety of sources.

These activities require several processors and large storage units to be stood up on demand. Even in normal times infrastructure teams require at least 10 to 12 weeks to procure and deploy additional hardware. With most of the financial services world now working remotely, this time lag is further extended.

No individual firm maintains enough excess capacity to accommodate such a large and urgent need for data and computation.

The work-from-home model has proven that we have sufficient internet bandwidth to enable the fast access required to host and use data on the cloud.

Cloud is about how you do computing

“Cloud is about how you do computing, not where you do computing.” ~ Paul Maritz, CEO of VMware

Cloud computing is now part of everyday vocabulary and powers even the most common consumer devices. However, financial services firms are still in early stages of evaluating and transitioning to a cloud-based computing environment.

Cloud is the only way to procure the level of surge capacity required today. At RiskSpan we are computing an average of a half-million additional scenarios per client on demand. Users don’t have the luxury to wait for an overnight batch process to react to changing market conditions. End users fire off a new scenario assuming that the hardware will scale up automagically.

When searching Google’s large dataset or using Salesforce to run analytics we expect the hardware scaling to be limitless. Unfortunately, valuation and risk management software are typically built to run on a pre-defined hardware configuration.

Cloud native applications, in contrast, are designed and built to leverage the on-demand scaling of a cloud platform. Valuation and risk management products offered as SaaS scale on-demand, managing the integration with cloud platforms.

Financial services firms don’t need to take on the burden of rewriting their software to work on the cloud. Platforms such as RS Edge enable clients to plug their existing data, assumptions and models into a cloud-native platform. This enables them to get all the analytics they’ve always had, just faster and cheaper.

Serverless access can also help companies provide access to their quant groups without incurring additional IT resource expense.

A recent survey from Flexera shows that 30% of enterprises have increased their cloud usage significantly due to COVID-19.

Cloud is cost effective

“In 2000, when my partner Ben Horowitz was CEO of the first cloud computing company, Loudcloud, the cost of a customer running a basic Internet application was approximately $150,000 a month.”  ~ Marc Andreessen, Co-founder of Netscape, Board Member of Facebook

Cloud hardware is cost effective, primarily due to the on-demand nature of the pricing model. A $250B asset manager uses RS Edge to run millions of scenarios for a 45-minute period every day. Analysis is performed over a thousand servers at a cost of $500 per month. The same hardware if deployed for 24 hours would cost $27,000 per month.

Cloud is not free and can be a two-edged sword. The same on-demand aspect that enables end users to spin up servers as needed, if not monitored, can cause the cost of such servers to accumulate to undesirable levels. One of the benefits of a cloud-native platform is built-in procedures to drop unused servers, which minimizes the risk of paying for unused capacity.

And yes, Mr. Andreessen’s basic application can be hosted today for less than $100 per month.

The same survey from Flexera shows that organizations plan to increase public cloud spending by 47% over the next 12 months.

Alternate data analysis

“The temptation to form premature theories upon insufficient data is the bane of our profession.” ~ Sir Arthur Conan Doyle, Sherlock Holmes.

Alternate data sources are not always easily accessible and available within analytic applications. The effort and time required to integrate them can be wasted if the usefulness of the information cannot be determined upfront. Timing of analyzing and applying the data is key.

Machine learning techniques offer quick and robust ways of analyzing data. Tools to run these algorithms are not readily available on a desktop computer.

Every major cloud platform provides a wealth of tools, algorithms and pre-trained models to integrate and analyze large and messy alternate datasets.

Join fintova’s Gary Maier and me at 1 p.m. EDT on June 24th as we discuss other important factors to consider when performing analytics in the cloud.

Chart of the Month: Tracking Mortgage Delinquency Against Non-traditional Economic Indicators by MSA

Traditional economic indicators lack the timeliness and regional granularity necessary to track the impact of the COVID-19 pandemic on communities across the country. Unemployment reports published by the Bureau of Labor Statistics, for example, tend to have latency issues and don’t cover all workers. As regional economies attempt to get back to a new “normal,” RiskSpan has begun compiling non-traditional “alternative” data that can provide a more granular and real-time view of issues and trends. In past crises, traditional macro indicators such as home price indices and unemployment rates were sufficient to explain the trajectory of consumer credit. However, in the current crisis, mortgage delinquencies are deteriorating more rapidly with significant regional dispersion. Serious mortgage delinquencies in the New York metro region were around 1.1% by April 2009 vs 30-day delinquencies at 9.9% of UPB in April 2020.

STACR loan-level data shows that 30-day delinquencies increased from 0.8% to 4.2% nationwide. In this chart we track the performance and state of employment of five large metropolitan statistical areas (MSAs).

Indicators included in our Chart of the Month:

Change in unemployment is the BLS measure computed from unemployment claims. Traditionally this indicator has been used to measure the economic health of a region. BLS reporting typically lags by weeks to months.

Air quality index is a measure we calculate using the level of PM2.5 reported daily by the EPA’s AirNow database. This metric is a proxy for increased vehicular traffic in different regions. Using a nationwide network of monitoring sites, the EPA has developed ambient air quality trends for particle pollution, also called Particulate Matter (PM). We compute the index as the daily level of PM2.5 relative to its average over the last 5 years, expressed as a percentage. For regions that are still under a shutdown, the air quality index should be less than 100% (e.g. New York at 75% vs Houston at 105%).
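The index calculation is a simple ratio. The sketch below assumes the five-year average has already been computed for the region, and the function and parameter names are illustrative only.

```python
def air_quality_index(daily_pm25, five_year_avg_pm25):
    """Daily PM2.5 level as a percentage of its five-year average."""
    if five_year_avg_pm25 <= 0:
        raise ValueError("five-year average must be positive")
    return 100.0 * daily_pm25 / five_year_avg_pm25
```

A reading below 100 means particle pollution is running below its five-year norm, which is the pattern expected in regions still under a shutdown.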

Air pollution from traffic has increased in regions where businesses have opened in May ’20 (e.g. LA went up from 69% in April to 98% in May). However, consumer spending has not always increased at the same level. We look to proxies for hourly employment levels.

New Daily COVID-19 Cases: This is a health crisis, and managing the rate of new COVID-19 cases will drive decisions to open or close businesses. The chart reports the monthly peak in new cases using daily data from Opportunity Insights.

Hourly Employment and Hours Worked at small businesses is provided by Opportunity Insights using data from Homebase. Homebase is a company that provides virtual scheduling and time-tracking tools, focused on small businesses in sectors such as retail, restaurant, and leisure/accommodation. The chart shows the change in the level of hourly employment as compared to January 2020. We expect this to be a leading indicator of employment levels for this sector of consumers.

Sources of data:

Freddie Mac’s (STACR) transaction database

Opportunity Insights’ Recovery Tracker

Bureau of Labor Statistics (BLS) MSA-level economic reports

Environmental Protection Agency (EPA) AirNow database
