In our most recent workshop on Anomaly Detection and Quality Control (Part I), we discussed how clean market data is an integral part of producing accurate market risk results. Because incorrect and inconsistent market data is so prevalent in the industry, it is not surprising that the U.S. spends an estimated $3 trillion a year on processes to identify and correct bad data.

Stepping back, it is worth noting what drives accurate market risk analytics. Accurate portfolio holdings, with correct terms and conditions for over-the-counter trades, are central to calculating consistent risk measures scaled to the market value of the portfolio. The use of well-tested, integrated, industry-standard pricing models is another key factor in producing reliable analytics. Compared with these two categories, however, unclean and inconsistent market data is the largest contributor to poor market risk analytics. The key driver behind detecting and correcting (or transforming) market data is risk and portfolio managers' expectation that risk results are accurate at the start of the business day, with no need for time-consuming intraday re-runs to correct issues found.

Broadly defined, market data is any data used as input to the re-valuation models. This includes equity prices, interest rates, credit spreads, FX rates, volatility surfaces, etc. Market data needs to be:

- Complete – no true gaps when looking back historically.
- Accurate.
- Consistent – each data point must be viewed against related data points to determine its accuracy (e.g., interest rates across tenor buckets, volatilities across a volatility surface).

Anomaly types can be broken down into four major categories:

- Spikes
- Stale data
- Missing data
- Inconsistencies

Here are three examples of "bad" market data:

Credit Spreads

The following chart depicts day-over-day changes in credit spreads for the 10-year consumer cyclical time series, returned from an external vendor.
The changes indicate a significant spike on 12/3 that caused big swings, up and down, across multiple rating buckets. Without an adjustment to this data, key risk measures would show significant jumps, up or down, depending on the dollar value of positions on two consecutive days.

Swaption Volatilities

Market data also includes volatilities, which drive delta and hedging decisions. The following chart shows implied swaption volatilities for different maturities of swaptions and their underlying swaps. Note the spikes in the 7×10 and 10×10 swaptions. The chart also highlights inconsistencies between different tenors and maturities.

Equity Implied Volatilities

The 146 and 148 strikes in the table below reflect inconsistent vol data, as often occurs around expiration.

The detection of market data inconsistencies needs to be an automated process, with multiple approaches targeted at specific types of market data. The detection models need to evolve over time as added information is gathered, with the goal of reducing false negatives to a manageable level. Once the models detect anomalies, the next step is to automate the transformation of the market data (e.g., backfill, interpolate, or use the prior day's value). Together with the transformation, transparency must be maintained by recording which values were changed, or populated if not available. This record should be shared with clients, which could lead to alternative transformations or detection routines.

Detector types typically fall into the following categories:

- Extreme Studentized Deviate (ESD): finds outliers in a single data series (helpful for extreme cases).
- Level Shift: detects a change in level by comparing the means of two sliding time windows (useful for local outliers).
- Local Outliers: detects spikes relative to nearby values.
- Seasonal Detector: detects seasonal patterns and anomalies (used for contract expirations and other recurring events).
- Volatility Shift: detects shifts in volatility by tracking changes in standard deviation.

On Wednesday, May 19th, we will present a follow-up workshop focusing on:

- Coding examples
- Application of outlier detection and pipelines
- PCA
- Specific loan use cases
  - Loan performance
  - Entity correction
- Novelty Detection
  - Anomalies are not always "bad"
  - Market monitoring models

You can register for this complimentary workshop here.
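The automated transformation step described earlier (backfill, interpolate, or use the prior day's value, with every substitution recorded for transparency) can be sketched in a few lines of pandas. The function name, gap limits, and sample series below are illustrative assumptions, not a prescribed implementation:

```python
import numpy as np
import pandas as pd

def transform_market_data(series: pd.Series, max_gap: int = 3):
    """Fill missing market data points and record every change for transparency.

    Illustrative policy: use the prior day's value for one-day gaps,
    linearly interpolate longer gaps (up to max_gap points), and log each
    substituted value so clients can review the adjustments.
    """
    prior_day = series.ffill(limit=1)              # prior-day value for 1-day gaps
    filled = prior_day.interpolate(limit=max_gap)  # interpolate remaining short gaps
    changed = series.isna() & filled.notna()
    audit = pd.DataFrame({
        "replacement": filled[changed],
        "method": np.where(prior_day[changed].notna(), "prior_day", "interpolate"),
    })
    return filled, audit

# Hypothetical spread series (in %) with missing observations.
idx = pd.date_range("2020-12-01", periods=6, freq="B")
raw = pd.Series([1.50, np.nan, 1.54, np.nan, np.nan, 1.60], index=idx)
clean, audit = transform_market_data(raw)
```

The audit frame is the transparency record: one row per substituted value, tagged with the method used, so alternative transformations can be discussed with clients.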
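As a preview of the workshop's coding examples, here is a minimal sketch of two of the detector types listed above: an ESD-style spike test using a robust z-score on day-over-day changes, and a level-shift test comparing the means of two sliding windows. All function names, thresholds, and data are illustrative assumptions:

```python
import numpy as np
import pandas as pd

def detect_spikes(series: pd.Series, threshold: float = 3.5) -> pd.Series:
    """ESD-style outlier test: flag points whose day-over-day change has an
    extreme robust z-score (median/MAD rather than mean/std, so the spike
    itself does not inflate the scale estimate)."""
    changes = series.diff()
    med = changes.median()
    mad = (changes - med).abs().median()
    z = 0.6745 * (changes - med) / (mad + 1e-12)
    return (z.abs() > threshold).reindex(series.index, fill_value=False)

def detect_level_shift(series: pd.Series, window: int = 5,
                       threshold: float = 3.0) -> pd.Series:
    """Level-shift test: compare the means of a trailing and a leading
    sliding window; flag points where they diverge relative to the
    typical day-to-day noise."""
    trailing = series.rolling(window).mean()
    leading = series[::-1].rolling(window).mean()[::-1]  # window starting at t
    noise = series.diff().abs().median() + 1e-12         # robust noise scale
    return ((leading - trailing).abs() / noise) > threshold

# Hypothetical spread series (bp): flat around 120 with a one-day spike,
# the kind of jump described in the credit spread example above.
rng = np.random.default_rng(0)
idx = pd.date_range("2020-11-16", periods=20, freq="B")
spiky = pd.Series(120 + rng.normal(0, 0.5, 20), index=idx)
spiky.iloc[13] += 25
spike_flags = detect_spikes(spiky)       # flags the spike and the snap-back

# Hypothetical series whose level jumps from ~120 bp to ~128 bp.
shifted = pd.Series(np.r_[np.full(10, 120.0), np.full(10, 128.0)]
                    + rng.normal(0, 0.3, 20), index=idx)
shift_flags = detect_level_shift(shifted)  # flags points around the break
```

Note the two detectors are complementary: the spike test flags both the jump up and the snap-back down, while the level-shift test stays quiet for a one-day spike that reverses but fires when the new level persists.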