Feature selection in machine learning refers to the process of isolating only those variables (or “features”) in a dataset that are pertinent to the analysis. Failure to do this effectively has many drawbacks, including: 1) unnecessarily complex models with difficult-to-interpret outcomes, 2) longer computing time, and 3) collinearity and overfitting. Effective feature selection eliminates redundant variables and keeps only the best subset of predictors in the model, thus making it possible to represent the data in the simplest way.
This post begins by identifying steps that must be taken to prepare datasets for meaningful analysis, and how machine learning can help. We then introduce and discuss some commonly used machine learning techniques for variable selection.
Real-world data contains a wide range of holes, noise, and inconsistencies. Before doing any statistical analysis, it is crucial to ensure that the data can be meaningfully analyzed. In practice, data cleansing is often the most time-consuming part of data analysis. This upfront investment is necessary, however, because the quality of data has a direct bearing on the reliability of model outputs.
Various machine learning projects require different sorts of data cleansing steps, but in general, when people speak of data cleansing, they are referring to the following specific tasks.
Cleaning Missing Values
Many machine learning techniques do not support data with missing values. To address this, we first need to understand why data are missing. Missing values usually occur simply because no information is provided, but other circumstances can lead to data holes as well. For instance, setting incorrect data types for attributes when data is extracted and integrated from multiple sources can cause data loss.
One way to investigate missing values is to identify patterns for missing data. For example, missing answers for certain questions from female respondents in a survey may indicate that those questions are only asked of male respondents. Another example might involve two loan records that share the same ID. If the second record contains blank values for every attribute except ‘Market Price,’ then the second record is likely simply updating the market price of the first record.
Once the early-stage evaluation of missing data is complete, we can set about determining how to address the problem. The easiest way to handle missing values is simply to ignore the records that contain them. However, this solution is not always practical. If a relatively large portion of the dataset contains missing values, then removing all of them could result in remaining data that may not be a good representation of the initial population. In that case, rather than filtering out relevant rows or attributes, a more proper approach is to impute missing values with sensible values.
A typical imputing method for categorical variables involves replacing the missing values with the most frequent value or with a newly created “unknown” category. For numeric variables, missing values might be replaced with mean or median values. Other, more advanced methods for dealing with missing values, e.g., listwise deletion for deleting rows with missing data and multiple imputation for substituting missing values, exist as well.
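The simple imputation rules described above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the column names and values are hypothetical, missing entries are assumed to be encoded as None, and the median is used for the numeric column (the mean would work the same way).

```python
from collections import Counter
from statistics import median


def impute_numeric(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def impute_categorical(values, use_unknown=False):
    """Replace None with the most frequent label, or with "unknown"."""
    if use_unknown:
        fill = "unknown"
    else:
        observed = [v for v in values if v is not None]
        fill = Counter(observed).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


# Hypothetical columns with missing entries encoded as None.
ages = [34, None, 51, 42, None]
states = ["VA", "MD", None, "VA", "DC"]

print(impute_numeric(ages))
print(impute_categorical(states))
print(impute_categorical(states, use_unknown=True))
```

Which rule is appropriate depends on why the values are missing, so a sketch like this is best applied only after the pattern analysis discussed earlier.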
Reducing Noise in Data
“Noise” in data refers to erroneous values and outliers. Noise is an unavoidable problem that can be caused by human mistakes in data entry, technical problems, and many other factors. Noisy data adversely influences model performance, so its detection and removal play a key role in the data cleaning process.
There are two major noise types in data: class noise and attribute noise. Class noise often occurs in categorical variables and can include: 1) non-standardized class labels, 2) duplicate records mapping to different class labels, and 3) mislabeled records. Attribute noise refers to corruptive values and outliers, such as percentages inappropriately greater than 100% and placeholders (e.g., 999,000).1
There are many ways to deal with noisy data. Certain types of noise can be easily identified by sorting the data, which isolates text input where numeric input is expected, as well as other placeholders. Other noise can be addressed only using statistical methods. Clustering analysis groups the data by similarity and can help with detecting irrelevant objects and outliers. Data binning is used to reduce the impact of observation errors by combining ‘neighborhood’ data into a small number of bins. Advanced smoothing algorithms, including moving average and loess, fit the data into regression functions to eliminate the effect due to random variation and allow important patterns to stand out.
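As a rough illustration of smoothing, the sketch below implements a plain trailing moving average in Python. The window length and the sample series are assumptions chosen for the example. Loess and other regression-based smoothers are more involved and are not shown.

```python
def moving_average(series, window=3):
    """Average each run of `window` consecutive points.

    Returns len(series) - window + 1 smoothed values.
    """
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


# Hypothetical series with one obvious spike at the third position.
noisy = [10, 11, 50, 12, 13, 12, 11]
print(moving_average(noisy, window=3))
```

A wider window damps the spike more strongly but also blurs genuine short-lived patterns, so the window is a judgment call.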
Data normalization converts numerical values into specific ranges to meet the needs of a model. Performing data normalization makes it possible to aggregate data with different scales. Several algorithms require normalized data. For example, it is necessary to normalize data before feeding it into principal component analysis (PCA) so that all variables have zero mean and unit variance and therefore the same weight. This also applies when training support vector machines (SVMs), which generally work best when the input data is in the range [0,1] or [-1,1]. Unnormalized data slows down model convergence and can skew results.
The most common way of normalizing data involves Z-score. Also known as standard-score normalization, this approach normalizes the error by dividing the difference between the data and mean by standard deviation. Z-score normalization is often used when min and max are unknown. Another common method is feature scaling, which brings all values into range [0,1] by dividing the difference between the data and min by the difference between max and min. Other normalization methods include studentized residual, t-statistics, and coefficient of variation.
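The two normalization rules described above can be written out directly. The sketch below is a minimal Python version. It assumes the population standard deviation for the Z-score (a sample standard deviation is an equally common choice) and does not guard against constant columns, which would cause a division by zero.

```python
from statistics import mean, pstdev


def z_score(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def min_max(values):
    """Rescale values into [0, 1] using the observed min and max."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical loan balances (in thousands of dollars).
balances = [120, 250, 310, 480, 640]
print(z_score(balances))
print(min_max(balances))
```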
Feature Selection Methods2
A stepwise procedure adds or subtracts individual features from a model until the optimal mix is identified. Stepwise procedures take three forms: backward elimination, forward selection, and stepwise regression.
Backward elimination is the simplest method. It fits the model using all available features and then systematically removes features one at a time, beginning with the feature with the highest p-value (provided the p-value exceeds a given threshold, usually 5%). The model is refit after each elimination, and the process loops until a model is identified in which each feature’s p-value falls below the threshold.
Forward selection is the opposite of backward elimination. It includes no variables in the model at first and then systematically adds features one at a time, beginning with the lowest p-value (provided the p-value falls below a threshold). The model is refit after each addition, and the process loops until additional features no longer improve model performance.
Stepwise regression combines backward elimination and forward selection by allowing a feature to be added or dropped at each iteration. Using this method, a newly added variable in an early stage may be removed later, and vice versa.
A variable’s p-value is not the only statistic that can be used for feature selection. Penalized-likelihood criteria, such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), are also valuable. Lower AIC and BIC values indicate a better balance between goodness of fit and model complexity. Both are given by: n log(RSS/n) + kp, where RSS is the residual sum of squares (which decreases as model complexity increases), n is the sample size, p is the number of predictors, and k is 2 for AIC and log(n) for BIC. Both criteria penalize larger models as p goes up, and BIC penalizes model complexity more heavily (whenever log(n) exceeds 2), which explains why BIC tends to favor smaller models in comparison to AIC. Other criteria are 1) adjusted R², which increases only if a new feature improves model performance more than would be expected by chance, 2) PRESS, which sums the squares of the predicted residuals, and 3) Mallows’s Cp statistic, which estimates the average MSE of prediction.
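To make the penalized-likelihood formula concrete, the following Python sketch evaluates n log(RSS/n) + kp for two candidate models. The RSS values, sample size, and predictor counts are hypothetical numbers chosen only to show how the penalty term works.

```python
import math


def information_criterion(rss, n, p, kind="aic"):
    """Evaluate n * log(RSS / n) + k * p, with k = 2 (AIC) or log(n) (BIC)."""
    if kind == "aic":
        k = 2.0
    elif kind == "bic":
        k = math.log(n)
    else:
        raise ValueError("kind must be 'aic' or 'bic'")
    return n * math.log(rss / n) + k * p


# Hypothetical fits on n = 100 observations: the larger model lowers the
# RSS slightly but uses five more predictors.
n = 100
small = {"rss": 120.0, "p": 3}
large = {"rss": 115.0, "p": 8}

for kind in ("aic", "bic"):
    score_small = information_criterion(small["rss"], n, small["p"], kind)
    score_large = information_criterion(large["rss"], n, large["p"], kind)
    print(kind, round(score_small, 2), round(score_large, 2))
```

In this made-up case both criteria prefer the smaller model, and the gap is wider under BIC because its per-predictor penalty, log(100), is larger than 2.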
Lasso and Ridge Regression
Lasso and ridge regression are powerful techniques for dealing with large feature coefficients. Both approaches reduce overfitting by penalizing large coefficients while minimizing the difference between predicted and observed values, but they differ in the penalty term they add. Lasso adds a penalty proportional to the sum of the absolute values of the coefficients, which can shrink some coefficients exactly to zero and thereby eliminate the corresponding features from the model. Ridge adds a penalty proportional to the sum of the squared coefficients. Even though it does not shrink coefficients to zero, it regularizes and constrains them to control variance.
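The different behavior of the two penalties is easiest to see in a simplified special case. The Python sketch below assumes orthonormal features, a loss of one half the residual sum of squares, a lasso penalty of lam times the sum of absolute coefficients, and a ridge penalty of lam/2 times the sum of squared coefficients. Under those assumptions each penalized coefficient has a closed form in terms of the ordinary least squares coefficient. General-purpose solvers do not rely on this shortcut, and the coefficient values here are hypothetical.

```python
def lasso_shrink(beta_ols, lam):
    """Soft-thresholding: move toward zero by lam, and stop at zero."""
    if beta_ols > lam:
        return beta_ols - lam
    if beta_ols < -lam:
        return beta_ols + lam
    return 0.0


def ridge_shrink(beta_ols, lam):
    """Proportional shrinkage: never exactly zero unless beta_ols is zero."""
    return beta_ols / (1.0 + lam)


# Hypothetical least squares coefficients for four features.
ols = [3.0, -2.0, 0.3, -0.1]
lam = 0.5

print([lasso_shrink(b, lam) for b in ols])
print([ridge_shrink(b, lam) for b in ols])
```

With these inputs the lasso rule sets the two small coefficients to zero, while the ridge rule keeps all four features and only scales them down.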
Lasso and ridge regression models have been widely used in finance since their introduction. A recent example used both these methods in predicting corporate bankruptcy.3 In this study, the authors discovered that these regression methods are optimal as they handle multicollinearity and minimize the numerical instability that may occur due to overfitting.
“Dimensionality reduction” is a process of transforming an extraordinarily complex, “high-dimensional” dataset (i.e., one with thousands of variables or more) into a dataset that can tell the story using a significantly smaller number of variables.
The most popular linear technique for dimensionality reduction is principal component analysis (PCA). It converts complex dataset features into a new set of coordinates named principal components (PCs). PCs are created in such a way that each succeeding PC preserves the largest possible variance under the condition that it is uncorrelated with the preceding PCs. Keeping only the first several PCs in the model reduces data dimensionality and eliminates multi-collinearity among features.
PCA has a couple of potential pitfalls: 1) PCA is sensitive to the scale of the original variables (data normalization is required before performing PCA), and 2) applying PCA makes it harder to interpret the influence of individual features, since the PCs are no longer real variables. For these reasons, PCA is not a good choice for feature selection if interpretation of results is important.
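For intuition, the first principal component can be found by standardizing the data, forming the covariance matrix, and extracting its dominant eigenvector. The Python sketch below does this with power iteration, which is one simple way to obtain that eigenvector. Production libraries typically use a singular value decomposition instead. The data and helper names are illustrative assumptions.

```python
import math


def standardize(columns):
    """Rescale each column to zero mean and unit (population) variance."""
    result = []
    for col in columns:
        mu = sum(col) / len(col)
        sd = math.sqrt(sum((v - mu) ** 2 for v in col) / len(col))
        result.append([(v - mu) / sd for v in col])
    return result


def covariance_matrix(columns):
    """Population covariance matrix of a list of equally long columns."""
    n = len(columns[0])
    means = [sum(col) / n for col in columns]
    return [
        [sum((a - ma) * (b - mb) for a, b in zip(ca, cb)) / n
         for cb, mb in zip(columns, means)]
        for ca, ma in zip(columns, means)
    ]


def first_principal_component(cov, iterations=200):
    """Power iteration for the largest eigenvalue and its unit eigenvector.

    The uneven start vector is a heuristic; the method fails if the start
    happens to be orthogonal to the dominant eigenvector.
    """
    d = len(cov)
    v = [1.0 / (i + 1) for i in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return v, eigenvalue


# Hypothetical features: the second is nearly a multiple of the first.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 4.1, 5.9, 8.2, 9.9]
x3 = [5.0, 3.0, 4.0, 1.0, 2.0]

cov = covariance_matrix(standardize([x1, x2, x3]))
direction, variance = first_principal_component(cov)
share = variance / sum(cov[i][i] for i in range(len(cov)))
print(direction, round(share, 3))
```

The printed share is the fraction of total variance captured by the first PC. The direction vector mixes all three inputs, which illustrates the interpretability pitfall noted above.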
Dimensionality reduction and specifically PCA have practical applications to fixed income analysis, particularly in explaining term-structure variation in interest rates. Dimensionality reduction has also been applied to portfolio construction and analytics. It is well known that the first eigenvector identified by PCA maximally captures the systematic risk (variation of returns) of a portfolio.4 Quantifying and understanding this risk is essential when balancing a portfolio.
3. Pereira, J. M., Basto, M., & da Silva, A. F. (2016). The Logistic Lasso and Ridge Regression in Predicting Corporate Failure. Procedia Economics and Finance, v.39, pp.634-641.
4. Alexander, C. (2001). Market models: A guide to financial data analysis. John Wiley & Sons.
Mortgage analysts refer to graphs plotting prepayment rates against the interest rate incentive for refinancing as “S-curves” because the resulting curve typically (vaguely) resembles an “S.” The curve takes this shape because prepayment rates vary positively with refinance incentive, but not linearly. Very few borrowers refinance without an interest rate incentive for doing so. Consequently, on the left-hand side of the graph, where the refinance incentive is negative or out of the money, prepayment speeds are both low and fairly flat. This is because a borrower with a rate 1.0% lower than market rates is not very much more likely to refinance than a borrower with a rate 1.5% lower. They are both roughly equally unlikely to do so.
As the refinance incentive crosses over into the money (i.e., when prevailing interest rates fall below rates the borrowers are currently paying), the prepayment rate spikes upward, as a significant number of borrowers take advantage of the opportunity to refinance. But this spike is short-lived. Once the refinance incentive gets above 1.0% or so, prepayment rates begin to flatten out again. This reflects a segment of borrowers that do not refinance even when they have an interest rate incentive to do so. Some of these borrowers have credit or other issues preventing them from refinancing. Others are simply disinclined to go through the trouble. In either case, the growing refinance incentive has little impact and the prepayment rate flattens out.
These two bends—moving from non-incentivized borrowers to incentivized borrowers and then from incentivized borrowers to borrowers who can’t or choose not to refinance—are what gives the S-curve its distinctive shape.
Figure 1: S-Curve Example
An S-Curve Example – Servicer Effects
Interestingly, the shape of a deal’s S-curve tends to vary depending on who is servicing the deal. Many things contribute to this difference, including how actively servicers market refinance opportunities. How important is it to be able to evaluate and analyze the S-curves for the servicers specific to a given deal? It depends, but it could be imperative.
In this example, we’ll analyze a subset of the collateral (“Group 4”) supporting a recently issued Fannie Mae deal, FNR 2017-11. This collateral consists of four Fannie multi-issuer pools of recently originated jumbo-conforming loans with a current weighted average coupon (WAC) of 3.575% and a weighted average maturity (WAM) of 348 months. The table below shows the breakout of the top six servicers in these four pools based on the combined balance.
Figure 2: Breakout of Top Six Servicers
Over half (54%) of the Group 4 collateral is serviced by these six servicers. To begin the analysis, we pulled all jumbo-conforming, 30-year loans originated between 2015 and 2017 for the six servicers and bucketed them based on their refi incentive. A longer timeframe is used to ensure that there are sufficient observations at each point. The graph below shows the prepayment rate relative to the refi incentive for each of the servicers as well as the universe.
Figure 3: S-curve by Servicer
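The bucketing step described above can be sketched as follows. This is a simplified Python illustration, not the method used to produce the figure. It assumes one month of loan-level observations, measures the refi incentive in basis points, counts loans rather than weighting by balance, and annualizes the monthly prepayment rate (SMM) into CPR with the standard formula CPR = 1 - (1 - SMM)^12. The observations are made up.

```python
from collections import defaultdict


def s_curve(observations, bucket_bps=25):
    """Map each incentive bucket (lower bound, in bps) to an annualized CPR.

    observations: iterable of (incentive_bps, prepaid) pairs for one month,
    where incentive_bps = note rate minus market rate, in basis points.
    """
    counts = defaultdict(lambda: [0, 0])  # bucket -> [loans, prepaid loans]
    for incentive_bps, prepaid in observations:
        bucket = (incentive_bps // bucket_bps) * bucket_bps
        counts[bucket][0] += 1
        counts[bucket][1] += 1 if prepaid else 0

    curve = {}
    for bucket in sorted(counts):
        total, prepaid = counts[bucket]
        smm = prepaid / total  # single monthly mortality (count-based)
        curve[bucket] = 1.0 - (1.0 - smm) ** 12
    return curve


# Hypothetical loans: (incentive in bps, prepaid this month?)
loans = [(-60, False), (-55, False), (10, False), (15, True),
         (80, True), (85, False), (90, True), (95, False)]
print(s_curve(loans, bucket_bps=25))
```

Running the same function once per servicer and once for the full universe would give the set of curves that the comparison relies on.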
For loans that are at the money (i.e., the point at which the S-curve would be expected to begin spiking upward), only those serviced by IMPAC prepay materially faster than the entire cohort. However, as the refi incentive increases, IMPAC, Seneca Mortgage, and New American Funding all experience a sharp pick-up in speeds, while loans serviced by Pingora, Lakeview, and Wells behave comparably to the market.
The last step is to compute the weighted average S-curve for the top six servicers using the current UPB percentages as the weights, shown in Figure 4 below. On the basis of the individual servicer observations, prepays for out-of-the-money loans should mirror the universe, but as loans become more re-financeable, speeds should accelerate faster than the universe. The difference between the six-servicer average and the universe reaches a peak of approximately 4% CPR between 50 bps and 100 bps in the money. This is valuable information for framing expectations for future prepayment rates. Analysts can calibrate prepayment models (or their outputs) to account for observed differences in CPRs that may be attributable to the servicer, rather than loan characteristics.
Figure 4: Weighted Average vs. Universe
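The weighting step is a straightforward UPB-weighted average at each incentive bucket. The Python sketch below shows one way to compute it and to measure the gap against the universe. The servicer labels, CPR values, and UPB shares are hypothetical placeholders rather than the deal's actual figures, and the weights are renormalized because the selected servicers cover only part of the collateral.

```python
def weighted_average_curve(curves, weights):
    """Combine per-servicer S-curves into one UPB-weighted curve.

    curves:  {servicer: {bucket_bps: cpr}}
    weights: {servicer: share of current UPB}; renormalized to sum to 1.
    Only buckets present for every servicer are averaged.
    """
    total_weight = sum(weights[s] for s in curves)
    common = set.intersection(*(set(c) for c in curves.values()))
    return {
        bucket: sum(weights[s] * curves[s][bucket] for s in curves) / total_weight
        for bucket in sorted(common)
    }


# Hypothetical CPRs (in %) by refi incentive bucket (in bps).
curves = {
    "Servicer A": {-50: 5.0, 0: 9.0, 50: 22.0, 100: 30.0},
    "Servicer B": {-50: 5.5, 0: 8.0, 50: 15.0, 100: 21.0},
}
weights = {"Servicer A": 0.30, "Servicer B": 0.24}
universe = {-50: 5.2, 0: 8.1, 50: 14.5, 100: 20.0}

average = weighted_average_curve(curves, weights)
gap = {bucket: average[bucket] - universe[bucket] for bucket in average}
print(average)
print(gap)
```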
This analysis was generated using RiskSpan’s data and analytics platform, RS Edge.
Attribution analysis of portfolios typically aims to discover the impact that a portfolio manager’s investment choices and strategies had on overall profitability. It can help determine whether success was the result of an educated choice or simply good luck. Usually a benchmark is chosen and the portfolio’s performance is assessed relative to it.
This post, however, considers the question of whether a non-referential assessment is possible. That is, can we deconstruct and assess a portfolio’s performance without employing a benchmark? Such an analysis would require access to historical returns as well as the portfolio’s weights and perhaps the volatility of interest rates, if some of the components exhibit a dependence on them. This list of required variables is by no means exhaustive.
There are two prevalent approaches to attribution analysis—one based on factor models and the other on return decomposition. The factor model approach considers the equities in a portfolio at a single point in time and attributes performance to various macro- and micro-economic factors prevalent at that time. The effects of these factors are aggregated at the portfolio level and a qualitative assessment is done. Return decomposition, on the other hand, explores the manner in which positive portfolio returns are achieved across time. The principal drivers of performance are separated and further analyzed. In addition to a year’s worth of time series data for the variables listed in the previous paragraph, covariance, correlation, and cluster analyses and other mathematical methods would likely be required.
Is the normality assumption for stock returns fully justified? Are sample means and variances good proxies for population means and variances? This assumption is worth testing because Normality and the Central Limit Theorem are widely assumed when dealing with financial data. The Delta-Normal Value at Risk (VaR) method, which is widely used to compute portfolio VaR, assumes that stock returns and allied risk factors are normally distributed. Normality is also implicitly assumed in financial literature. Consider the distribution of S&P returns from May 1980 to May 2017 displayed in Figure 1.
Figure One: Distribution of S&P Returns
Panel (a) is a histogram of S&P daily returns from January 2001 to January 2017. The red curve is a Gaussian fit. Panel (b) shows the same data on a semi-log plot (logarithmic Y axis). The semi-log plot emphasizes the tail events.
The returns displayed in the left panel of Figure 1 have a higher central peak and the “shoulders” are somewhat wider than what is predicted by the Gaussian fit. This mismatch in the tails is more visible in the semi-log plot shown in panel (b). This demonstrates that a normal distribution is probably not a very accurate assumption. Sigma, the standard deviation, is typically used as a measure of the relative magnitude of market moves and as a rough proxy for the occurrence of such events. The normal distribution places the odds of a minus-5 sigma swing at only 2.86×10⁻⁵%. In other words, assuming 252 trading days per year, a drop of this magnitude should occur once in every 13,000 years! However, an examination of S&P returns over the 37-year period cited shows drops of 5 standard deviations or greater on 15 occasions. Assuming a normal distribution would consistently underestimate the occurrence of tail events.
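The tail arithmetic above can be checked with the standard normal cumulative distribution function. The short Python sketch below uses the identity Φ(z) = 0.5 · erfc(−z/√2) from the standard library and the 252-trading-day convention stated in the text. The function names are illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def years_between_events(sigmas, trading_days_per_year=252):
    """Expected years between daily drops of at least `sigmas` std devs."""
    daily_probability = normal_cdf(-abs(sigmas))
    return 1.0 / (daily_probability * trading_days_per_year)


probability = normal_cdf(-5.0)
print(f"P(move <= -5 sigma) = {probability:.3e}")
print(f"Expected wait: {years_between_events(5):,.0f} years")
```

The probability comes out near 2.9×10⁻⁷ (about 2.9×10⁻⁵ percent) and the expected wait is on the order of 13,000 to 14,000 years, which is consistent with the figures quoted above.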
We conducted a subsequent analysis focusing on the daily returns of SPY, a popular exchange-traded fund (ETF). This ETF tracks 503 component instruments. Using returns from July 1, 2016 through June 30, 2017, we tested each component instrument’s return vector for normality using the Chi-Square Test, the Kurtosis estimate, and a visual inspection of the Q-Q plot. Brief explanations of these methods are provided below.
The Chi-Square Test is a goodness-of-fit test that assumes a specific data distribution (the null hypothesis) and then tests that assumption. The test evaluates the deviations of the model predictions (a normal distribution, in this instance) from empirical values. If the resulting test statistic is large, then the observed and expected values are not close and the model is deemed a poor fit to the data. In that case, the null hypothesis of a specific distribution is rejected.
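A goodness-of-fit statistic of this kind can be computed by binning the returns and comparing the observed bin counts with the counts a fitted normal distribution would predict. The Python sketch below is one hedged way to do that. The bin edges and return values are hypothetical, the mean and standard deviation are estimated from the sample itself, and the final comparison against a chi-square critical value is left out because it needs a distribution table or a statistics library.

```python
import math
from statistics import mean, pstdev


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all bins."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * math.erfc(-(x - mu) / (sigma * math.sqrt(2.0)))


def chi_square_vs_normal(returns, edges):
    """Chi-square statistic of binned returns against a fitted normal.

    edges are interior bin boundaries; the outer bins extend to infinity.
    """
    mu, sigma = mean(returns), pstdev(returns)
    bounds = [-math.inf] + sorted(edges) + [math.inf]
    observed, expected = [], []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        observed.append(sum(1 for r in returns if lo < r <= hi))
        share = normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)
        expected.append(len(returns) * share)
    return chi_square_statistic(observed, expected)


# Hypothetical daily returns, in percent.
returns = [-2.1, -0.4, 0.3, 0.1, 1.2, -0.9, 0.6, 2.5, -0.2, 0.0,
           0.4, -0.3, 0.2, -0.1, 0.1, -3.0]
print(chi_square_vs_normal(returns, edges=[-1.0, 0.0, 1.0]))
```

Because two parameters are estimated from the data, the statistic would typically be compared with a chi-square distribution having (number of bins − 3) degrees of freedom.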
The kurtosis of any univariate normal distribution is 3. A sample kurtosis that deviates substantially from this value suggests that the data distribution is correspondingly non-normal. An example is illustrated in Figures 2, 3, and 4, below.
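Kurtosis here means the fourth central moment divided by the squared variance, which is the convention under which a normal distribution scores 3. A minimal Python sketch of that estimate follows. It uses the simple moment-based (biased) estimator, which may differ slightly from the bias-corrected versions some statistics packages report, and the sample values are hypothetical.

```python
def kurtosis(values):
    """Moment-based kurtosis: fourth central moment over variance squared."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n
    m4 = sum((v - mu) ** 4 for v in values) / n
    return m4 / (m2 ** 2)


# Hypothetical daily returns (in percent) with one large negative move.
returns = [0.1, -0.2, 0.3, 0.0, -0.1, 0.2, -4.0, 0.1, -0.3, 0.2]
print(kurtosis(returns))
```

A single extreme observation is enough to push the estimate well above 3, which is the pattern fat-tailed return series tend to show.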
Quantile-quantile (QQ) plots are graphs on which quantiles from two distributions are plotted relative to each other. If the distributions correspond, then the plot appears linear. This is a visual assessment rather than a quantitative estimation. A sample set of results is shown in Figures 2, 3, and 4, below.
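The points of a normal Q-Q plot can be generated without any plotting library. The Python sketch below pairs each sorted observation with the corresponding standard normal quantile, using NormalDist from the standard library (Python 3.8 or later). The plotting positions (i + 0.5)/n are one common convention among several, and the sample is hypothetical.

```python
from statistics import NormalDist


def qq_points(sample):
    """Pair each sorted observation with a standard normal quantile."""
    ordered = sorted(sample)
    n = len(ordered)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        (standard_normal.inv_cdf((i + 0.5) / n), value)
        for i, value in enumerate(ordered)
    ]


# Hypothetical daily returns, in percent.
returns = [0.4, -1.2, 0.1, 0.9, -0.3, 2.2, -0.6, 0.0]
for theoretical, observed in qq_points(returns):
    print(f"{theoretical:6.2f}  {observed:6.2f}")
```

If the sample were normally distributed, the printed pairs would fall close to a straight line when plotted, and curvature at the ends would point to heavier or lighter tails.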
Figure Two: Year’s Returns for Exxon
Figure 2. The left panel shows the histogram of a year’s returns for Exxon (XOM). The null hypothesis was rejected with the conclusion that the data is not normally distributed. The kurtosis was 6 which implies a deviation from normality. The Q-Q plot in the right panel reinforces these conclusions.
Figure Three: Year’s Returns for Boeing
Figure 3. The left panel shows the histogram of a year’s returns for Boeing (BA). The data is not normally distributed and shows a significant skewness also. The kurtosis was 12.83 and implies a significant deviation from normality. The Q-Q plot in the right panel confirms this.
For the sake of comparison, we also show returns that exhibit normality in the next figure.
Figure Four: Year’s Returns for Xerox
Figure 4. The left panel shows the histogram of a year’s returns for Xerox (XRX). The data is normally distributed, which is apparent from a visual inspection of both panels. The kurtosis was 3.23 which is very close to the value for a theoretical normal distribution.
Machine learning literature has several suggestions for addressing this problem, including Kernel Density Estimation and Mixture Density Networks. If the data exhibits multi-modal behavior, learning a multi-modal mixture model is a possible approach.
In addition to normality, we also make untested assumptions regarding stationarity. This critical assumption is implicit when computing covariances and correlations. We also tend to overlook insufficient sample sizes. As observed earlier, the SPY dataset we had at our disposal consisted of 503 instruments, with around 250 returns per instrument. The number of observations is much lower than the dimensionality of the data. This will produce a covariance matrix which is not full-rank and, consequently, its inverse will not exist. Singular covariance matrices are highly problematic when computing the risk-return efficiency loci in the analysis of portfolios. We tested the returns of all instruments for stationarity using the Augmented Dickey Fuller (ADF) test. Several return vectors were non-stationary. Non-stationarity and sample size issues can’t be wished away because the financial markets are fluid, with new firms coming into existence and existing firms disappearing due to bankruptcies or acquisitions. Consequently, limited financial histories will be encountered and must be dealt with.
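The rank problem is easy to reproduce on a toy scale. The Python sketch below builds a sample covariance matrix from three observations of four hypothetical instruments and checks its rank with a small Gaussian elimination routine. The data, tolerance, and helper names are assumptions for illustration. With n observations, the sample covariance matrix has rank at most n − 1 however many instruments there are.

```python
def covariance_matrix(rows):
    """Sample covariance matrix; rows are observations, columns instruments."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(d)]
        for i in range(d)
    ]


def matrix_rank(matrix, tol=1e-10):
    """Rank via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank


# Hypothetical returns in percent: 3 observations of 4 instruments.
returns = [
    [1.0, 2.0, -1.0, 0.0],
    [0.0, -1.0, 2.0, 1.0],
    [-2.0, 1.0, 0.0, 3.0],
]
cov = covariance_matrix(returns)
print(len(cov), "x", len(cov[0]), "matrix with rank", matrix_rank(cov))
```

The same effect applies at full scale: roughly 250 returns on 503 instruments cannot produce an invertible 503 × 503 covariance matrix.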
This is a problem where machine learning can be profitably employed. Shrinkage methods, Latent factor models, Empirical Bayes estimators and Random matrix theory based models are widely published techniques that are applicable here.
Portfolio Performance Analysis
Once issues surrounding untested assumptions have been addressed, we can focus on portfolio performance analysis – a subject with a vast collection of books and papers devoted to it. We limit our attention here to one aspect of portfolio performance analysis – an inquiry into the clustering behavior of stocks in a portfolio.
Books on portfolio theory devote substantial space to the discussion of asset diversification to achieve an optimum balance of risk and return. To properly diversify assets, we need to know if resources have been over-allocated to a specific sector and, consequently, under-allocated to others. Cluster analysis can help to answer this. A pertinent question is how to best measure the difference or similarity between stocks. One way would be to estimate correlations between stocks. This approach has its own weaknesses, some of which have been discussed in earlier sections. Even if we had a statistically significant set of observations, we are faced with the problem of changing correlations during the course of a year due to structural and regime shifts caused by intermittent periods of stress. Even in the absence of stress, correlations can break down or change due to factors that are endogenous to individual stocks.
We can estimate similarity and visualize clusters using histogram analysis. However, histograms eliminate temporal information. To overcome this constraint, we used Spectral Clustering, which is a machine learning technique that explores cluster formation without neglecting temporal information.
Figures 5 to 7 display preliminary results from our cluster analysis. Analyses like this will enable portfolio managers to realize clustering patterns and their strengths in their portfolios. They will also help guide decisions on reweighting portfolio components and diversification.
Figures 5-7: Cluster Analyses
Figure 5. Cluster analysis of a limited set of stocks is shown here. The labels indicate the names of the firms. Clusters are illustrated by various colored bullets, and increasing distances indicate decreasing similarities. Within clusters, stronger affinities are indicated by greater connecting line weights.
The following figures display magnified views of individual clusters.
Figure 6. We can see that Procter & Gamble, Kimberly Clark and Colgate Palmolive form a cluster (top left, dark green bullets). Likewise, Bank of America, Wells Fargo and Goldman Sachs form a cluster (top right, light green bullets). This is not surprising as these two clusters represent two sectors: consumer products and banking. Line weights are correlated to affinities within sectors.
Figure 7. The cluster on the left displays stocks in the technology sector, while the clusters on the right represent firms in the defense industry (top) and the energy sector (bottom).
In this post, we raised questions about standard assumptions that are made when analyzing portfolios. We also suggested possible solutions from machine learning literature. We subsequently analyzed one year’s worth of returns of SPY to identify clusters and their strengths and discussed the value of such an analysis to portfolio managers in evaluating risk and reweighting or diversifying their portfolios.
Since the financial crisis began in 2007, the “Non-Agency” MBS market, i.e., securities neither issued nor guaranteed by Fannie Mae, Freddie Mac, or Ginnie Mae, has been sporadic and has not rebounded from pre-crisis levels. In recent months, however, activity by large financial institutions, such as AIG and Wells Fargo, has indicated a return to the issuance of Non-Agency MBS. What is contributing to the current state of the securitization market for high-quality mortgage loans? Does the recent, limited-scale return to issuance by these institutions signal an increase in private securitization activity in this sector of the securitization market? If so, what is sparking this renewed interest?
The MBS Securitization Market
Three entities – Ginnie Mae, Fannie Mae, and Freddie Mac – have been the dominant engine behind mortgage-backed securities (MBS) issuance since 2007. These entities, two of which remain in federal government conservatorship and the third a federal government corporation, have maintained the flow of capital from investors into guaranteed MBS and ensured that mortgage originators have adequate funds to originate certain types of single-family mortgage loans.
Virtually all mortgage loans backed by federal government insurance or guaranty programs, such as those offered by the Federal Housing Administration and the Department of Veterans Affairs, are issued in Ginnie Mae pools. Mortgage loans that are not eligible for these programs are referred to as “Conventional” mortgage loans. In the current market environment, most Conventional mortgage loans are sold to Fannie Mae and Freddie Mac (i.e. “Conforming” loans) and are securitized in Agency-guaranteed pass-through securities.
The Non-Agency MBS Market
Not all Conventional mortgage loans are eligible for purchase by Fannie Mae or Freddie Mac, however, due to collateral restrictions (i.e., their loan balances are too high or they do not meet certain underwriting requirements). These are referred to as “Non-Conforming” loans and, for most of the past decade, have been held in portfolio at large financial institutions, rather than placed in private, Non-Agency MBS. The Non-Agency MBS market is further divided into sectors for “Qualified Mortgage” (QM) loans, non-QM loans, re-performing loans and nonperforming loans. This post deals with the securitization of QM loans through Non-Agency MBS programs.
Since the crisis, Non-Agency MBS issuance has been the exclusive province of JP Morgan and Redwood Trust, both of which continue to issue a relatively small number of deals each year. The recent entry of AIG into the Non-Agency MBS market, combined with Wells Fargo’s announcement that it intends to begin issuing as well, makes this a good time to discuss why these institutions, which have other funding sources available to them, are now moving back to this sector of the securitization market.
Considerations for Issuing QM Loans
Three potential considerations may lead financial institutions to investigate issuing QM Loans through Non-Agency MBS transactions:
- “All-In” Economics
- Portfolio Concentration or Limitations
- Regulatory Pressures
Investigate “All-In” Economics
Over the long-term, mortgage originators gravitate to funding sources that provide the lowest cost to borrowers and profitability for their firms. To improve the “all-in” economics of a Non-Agency MBS transaction, investment banks work closely with issuers to broaden the investor base for each level of the securitization capital structure. Partly due to the success of the Fannie Mae and Freddie Mac Credit Risk Transfer transactions, there appears to be significant interest in higher-yielding mortgage-related securities at the lower-rated (i.e. higher risk) end of the securitization capital structure. This need for higher yielding assets has also increased demand for lower-rated securities in the Non-Agency MBS sector.
However, demand from investors at the higher-rated end of the securitization capital structure (i.e. ‘AAA’ and ‘AA’ securities) has not resulted in “all-in” economics for a Non-Agency MBS transaction that surpass the economics of balance sheet financing provided by portfolios funded with low deposit rates or low debt costs. If deposit rates and debt costs remain at historically low levels, the portfolio funding alternative will remain attractive. Notwithstanding the low interest rate environment, some institutions may develop operational capabilities for Non-Agency MBS programs as a risk mitigation process for future periods where balance sheet financing alternatives may not be as beneficial.
Portfolio Concentration or Limitations
Due to the lack of robust investor demand and unfavorable economics in Non-Agency MBS, many banks have increased their portfolio exposure to both fixed-rate and intermediate-adjustable-rate QM loans. The ability to hold these mortgage loans in portfolio has provided attractive pricing to a key customer demographic and earned an attractive net interest rate margin during the historical low-rate environment. While bank portfolios have provided an attractive funding source for Non-Agency QM loans, some financial institutions may attempt to develop diversified funding sources in response to regulatory pressure or self-imposed portfolio concentration limits. Selling existing mortgage portfolio assets into the Non-Agency MBS securitization market is one way in which financial institutions might choose to reduce concentrated mortgage risk exposure.
Regulatory Pressures
Some financial institutions may be under pressure from their regulators to demonstrate their ability to sell assets out of their mortgage portfolio as a contingency plan. The Non-Agency MBS market is one way of complying with these sorts of regulatory requests. Developing a contingency ability to tap Non-Agency MBS markets develops operational capabilities under less critical circumstances, while assessing the time needed by the institution to liquidate such assets through securitization. This early establishment of securitization functionalities is a prudent activity for those institutions who foresee the possibility of securitization as a future funding option.
While the Non-Agency MBS market has been dormant for most of the past decade, some financial institutions that have relied upon portfolio funding now appear to be testing the current viability of the Non-Agency MBS market. With the continued issuance by JP Morgan and Redwood Trust and new entrants such as AIG and Wells Fargo, other mortgage originators would be wise to take notice of these events, monitor activity in these markets, and assess whether securitization has the potential to provide an alternative funding source for their Non-Conforming QM Loans and future lending activity.
In our next article on the Non-Agency MBS market, we will review the changes in due diligence practices, loan-level data disclosures, the representation and warranty framework, and the ratings process made by securitization market participants and the impact of these changes on the Non-Agency MBS market segment.