Articles Tagged with: Data Management

Webinar: Using Machine Learning in Whole Loan Data Prep


Tackle one of your biggest obstacles: Curating and normalizing multiple, disparate data sets.

Learn from RiskSpan experts:

  • How to leverage machine learning to help streamline whole loan data prep
  • Innovative ways to manage the differences in large data sets
  • How to automate ‘the boring stuff’

About The Hosts

LC Yarnelle

Director – RiskSpan

LC Yarnelle is a Director with experience in financial modeling, business operations, requirements gathering and process design. At RiskSpan, LC has worked on model validation and business process improvement/documentation projects. He also led the development of one of RiskSpan's software offerings and has led multiple development projects for clients, utilizing both Waterfall and Agile frameworks. Prior to RiskSpan, LC was an analyst at NVR Mortgage in the secondary marketing group in Reston, VA, where he was responsible for daily pricing as well as ongoing process improvement activities. Before a career move into finance, LC was the director of operations and a minority owner of a small business in Fort Wayne, IN. He holds a BA from Wittenberg University, as well as an MBA from Ohio State University.

Matt Steele

Senior Analyst – RiskSpan



Residential Mortgage REIT: End to End Loan Data Management and Analytics

An inflexible, locally installed risk management system with dated technology required a large IT staff to support it and was incurring high internal maintenance costs.

Absent a single solution, the use of multiple vendors for pricing and risk analytics, prepay/credit models and data storage created workflow inefficiencies and an ongoing administrative burden.

Inconsistent data and QC across the various sources were also creating a number of data integrity issues.

The Solution

The REIT implemented RiskSpan's Edge Platform, an end-to-end data and risk management solution that delivers value through cost and operational efficiencies.

  • Scalable, cloud-native technology
  • Increased flexibility to run analytics at loan level; additional interactive / ad-hoc analytics
  • Reliable, accurate data with more frequent updates

The Deliverables

Consolidating from five vendors down to a single platform enabled the REIT to streamline workflows and automate processes, resulting in a 32% annual cost savings and 46% fewer resources required for maintenance.


GSE: Earnings Forecasting Framework Development

A $100+ billion government-sponsored enterprise with more than $3 trillion in assets sought to develop an end-to-end earnings forecast framework to project and stress-test the future performance of its loan portfolio. The comprehensive framework needed to draw data from a combination of unintegrated systems to compute earnings and capital management requirements and to produce other ad hoc reporting under a variety of internal and regulatory (i.e., DFAST) stress scenarios.

Computing these metrics demanded cross-functional team coordination, proper data governance, and a reliable audit trail, all of which had been posing a challenge.

The Solution

RiskSpan addressed these needs via three interdependent workstreams: 

Data Preparation

RiskSpan consolidated multiple data sources required by the earnings forecast framework. These included: 

  • Macroeconomic drivers, including interest rates and unemployment rate 
  • Book profile, including up-to-date snapshots of the portfolio’s performance data 
  • Modeling assumptions, including portfolio performance history and other asset characteristics 
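Conceptually, this consolidation step joins the three inputs above into a single, model-ready dataset. The sketch below is illustrative only; the file names, column names, and join keys are assumptions, not the client's actual schema.

```python
import pandas as pd

# Hypothetical input files -- names and columns are illustrative only.
macro = pd.read_csv("macro_drivers.csv")             # as_of_date, rate_10y, unemployment
book = pd.read_csv("book_profile.csv")               # loan_id, as_of_date, upb, coupon, product_type, vintage
assumptions = pd.read_csv("model_assumptions.csv")   # product_type, vintage, base_cpr, base_cdr

# Align the portfolio snapshot with the macro path and modeling assumptions
prepared = (
    book.merge(macro, on="as_of_date", how="left")
        .merge(assumptions, on=["product_type", "vintage"], how="left")
)

# Basic completeness check before handing off to the model layer
missing = prepared[prepared[["rate_10y", "base_cpr"]].isna().any(axis=1)]
if not missing.empty:
    raise ValueError(f"{len(missing)} records lack macro or assumption data")
```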

Model Simulation

Because the portfolio in question consisted principally of mortgage assets, RiskSpan incorporated more than 20 models into the framework, including (among others): 

  • Prepayment Model 
  • Default Model 
  • Delinquency Model 
  • Acquisition Model: Future loans 
  • Severity Model  
  • Cash Flow Model 
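At a high level, the framework chains these models together month by month over the forecast horizon. The following sketch illustrates that orchestration pattern only; the model names, interfaces, and method signatures are assumptions and do not represent RiskSpan's actual framework.

```python
# Illustrative simulation loop -- model objects and method names are hypothetical.
def run_simulation(loans, macro_path, models, horizon_months=120):
    results = []
    for month in range(1, horizon_months + 1):
        scenario = macro_path[month]
        smm = models["prepayment"].project(loans, scenario)    # prepayment model
        mdr = models["default"].project(loans, scenario)       # default model
        sev = models["severity"].project(loans, scenario)      # severity model
        cash = models["cashflow"].project(loans, smm, mdr, sev)
        loans = models["acquisition"].add_new_loans(loans, scenario)  # future loans
        results.append(cash)
    return results
```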

Business Calculations and Reporting

Using the data and models above, RiskSpan incorporated the following outputs into the earnings forecast framework: 

  • Non-performing asset treatment 
  • Charge-off timing for delinquent loans 
  • Projected loan losses under FAS114/CECL  
  • Revenue Forecasts 
  • Capital Forecast 

Client Benefits

The earnings forecast framework RiskSpan developed represented a significant improvement over the client’s previous system of disconnected data, unintegrated models, and error-prone workarounds. Benefits of the new system included:  

  • User Interface – Improved process for managing loan lifecycles and GUI-based process execution  
  • Data Lineage – Implemented necessary constraints to ensure forecasting processes are executed in sequence and are repeatable. Created a predefined, dynamic output lineage tree (UI-accessible) to build a robust data flow sequence that facilitates what-if scenario analysis. 
  • Run Management – Assigned a unique run ID to every execution to ensure individual users across the institution can track and reuse execution results (a minimal sketch of this pattern follows this list) 
  • Audit Trail – Designed logging of forecasting run details to trace attributes such as version changes (Version control system – GIT, SVN), timestamp, run owner, and inputs used (MySQL/Oracle Databases for logging)  
  • Identity Access Management – User IDs and access are now managed administratively. Metadata is captured via user actions through the framework for audit purposes. Role-based restrictions now ensure data and forecasting features are limited to only those who require such permissions 
  • Golden Configuration – Implemented execution-specific parameters passed to models during runtime. These parameters are stored, enabling any past model result to be reproduced if needed 
  • Data Masking – Encrypted personally identifiable information at-rest and in transit 
  • Data Management – Execution logs and model/report outputs are stored to the database and file systems 
  • Comprehensive User and Technical Documentation – RiskSpan created audit-ready documentation tied to logic changes and execution. This included source-to-target mapping documentation and enterprise-grade catalogs and data dictionaries. Documentation also included: 
      • Vision Document 
      • User Guides 
      • Testing Evidence 
      • Feature Traceability Matrix 
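As a rough illustration of the Run Management and Audit Trail items above, the sketch below assigns a unique run ID and persists execution metadata. It is a minimal example that uses SQLite as a stand-in for the MySQL/Oracle logging databases; the table layout and field names are assumptions.

```python
import uuid, json, datetime, sqlite3  # sqlite3 stands in for the MySQL/Oracle logging store

def log_forecast_run(owner, model_version, inputs, db_path="run_log.db"):
    """Assign a unique run ID and write an audit record for this execution."""
    run_id = str(uuid.uuid4())
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS run_log "
        "(run_id TEXT, run_owner TEXT, model_version TEXT, inputs TEXT, ts TEXT)"
    )
    conn.execute(
        "INSERT INTO run_log VALUES (?, ?, ?, ?, ?)",
        (run_id, owner, model_version, json.dumps(inputs),
         datetime.datetime.utcnow().isoformat()),
    )
    conn.commit()
    conn.close()
    return run_id
```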

Automate Your Data Normalization and Validation Processes

Robotic Process Automation (RPA) automates mundane, rules-based processes so that an organization's high-value business users can be redeployed to more valuable work. 

McKinsey defines RPA as “software that performs redundant tasks on a timed basis and ensures that they are completed quickly, efficiently, and without error.” RPA has enormous savings potential. In RiskSpan’s experience, RPA reduces staff time spent on the target-state process by an average of 95 percent. On recent projects, RiskSpan RPA clients on average saved more than 500 staff hours per year through simple automation. That calculation does not include the potential additional savings gained from the improved accuracy of source data and downstream data-driven processes, which greatly reduces the need for rework. 

The tedious, error-ridden, and time-consuming process of data normalization is familiar to almost all organizations. Complex data systems and downstream analytics are ubiquitous in today’s workplace. Staff that are tasked with data onboarding must verify that source data is complete and mappable to the target system. For example, they might ensure that original balance is expressed as dollar currency figures or that interest rates are expressed as percentages with three decimal places. 

Effective data visualizations sometimes require additional steps, such as adding calculated columns or resorting data according to custom criteria. Staff must match the data formatting requirements with the requirements of the analytics engine and verify that the normalization allows the engine to interact with the dataset. When completed manually, all of these steps are susceptible to human error or oversight. This often results in a need for rework downstream and even more staff hours. 
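To make the kinds of checks described above concrete, the sketch below shows how such normalization steps might look in pandas. The source file, column names, and conversion rules are hypothetical and shown only for illustration.

```python
import pandas as pd

# Hypothetical source tape and column names, used only to illustrate the checks above.
df = pd.read_csv("source_tape.csv")

# Normalize original balance to a dollar figure (strip "$" and thousands separators)
df["orig_balance"] = (
    df["orig_balance"].astype(str).str.replace(r"[$,]", "", regex=True).astype(float)
)

# Express interest rates as percentages rounded to three decimal places
df["note_rate"] = pd.to_numeric(df["note_rate"], errors="coerce")
df.loc[df["note_rate"] < 1, "note_rate"] *= 100   # convert decimals such as 0.045 to 4.5
df["note_rate"] = df["note_rate"].round(3)

# Add a calculated column and re-sort to match the analytics engine's expectations
df["current_ltv"] = (df["current_balance"] / df["property_value"]).round(4)
df = df.sort_values(["orig_date", "loan_id"])
```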

Recently, a client with a proprietary datastore approached RiskSpan with the challenge of normalizing and integrating irregular datasets to comply with their data engine. The non-standard original format and the size of the data made normalization difficult and time consuming. 

After ensuring that the normalization process was optimized for automation, RiskSpan set to work automating data normalization and validation. Expert data consultants automated the process of restructuring data in the required format so that it could be easily ingested by the proprietary engine.  

Our consultants built an automated process that normalized and merged disparate datasets, compared internal and external datasets, and added calculated columns to the data. The processed dataset comprised more than 100 million loans and more than 4 billion records. To optimize for speed, our team programmed a highly resilient validation process that included automated validation checks, error logging (for client staff review) and data correction routines for post-processing and post-validation. 
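A resilient validation step of this kind might be sketched as follows. The rules, error threshold, and correction logic here are placeholders; the client's actual validations and thresholds were custom-defined during the design process.

```python
import logging

logging.basicConfig(filename="validation_errors.log", level=logging.WARNING)

# Illustrative rule set -- the real validations were defined with the client.
RULES = {
    "note_rate": lambda v: 0 < v < 25,
    "orig_balance": lambda v: v > 0,
}

def validate(df, error_threshold=0.01):
    error_rows = 0
    for col, rule in RULES.items():
        bad = ~df[col].apply(rule)
        error_rows += int(bad.sum())
        for idx in df.index[bad]:
            logging.warning("Row %s failed check on %s: %s", idx, col, df.at[idx, col])
        df.loc[bad, col] = None   # simple correction routine: null out for post-processing
    if error_rows / len(df) > error_threshold:
        raise RuntimeError("Error rate exceeds threshold; staff review required")
    return df
```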

This custom solution reduced the time spent onboarding data from one month of staff work to two days. The end result is a fully functional, normalized dataset that can be trusted for use with downstream applications. 

RiskSpan’s experience automating routine business processes reduced redundancies, eliminated errors, and saved staff time. This solution reduced resources wasted on rework and its associated operational risk and key-person dependencies. Routine tasks were automated with customized validations. This customization effectively eliminated the need for staff intervention until certain error thresholds were breached. The client determined and set these thresholds during the design process. 

RiskSpan data and analytics consultants are experienced in helping clients develop robotic process automation solutions for normalizing and aggregating data, creating routine, reliable data outputs, executing business rules, and automating quality control testing. Automating these processes addresses a wide range of business challenges and is particularly useful in routine reporting and analysis. 

Talk to RiskSpan today about how custom solutions in robotic process automation can save time and money in your organization. 


GSE: Datamart Design and Build

The Problem

A government-sponsored enterprise needed a centralized data solution for its forecasting process, which involved cross-functional teams from different business lines.

The firm also sought a cloud-based data warehouse to host forecasting outputs for reporting purposes, with faster querying and processing speeds.

The firm needed assistance migrating data from legacy data sources to new datamarts. The input and output files and datasets came from different sources and were often in different formats, so analysis and transformation were required prior to designing, developing and loading tables.

The Solution

RiskSpan built and now maintains a new centralized datamart (in both Oracle and Amazon Web Services) for the client’s revenue and loss forecasting processes. This includes data modeling, historical data upload, and the monthly recurring data process.
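A monthly recurring load of this type might look like the sketch below. The connection string, table name, and partition key are placeholders rather than the client's actual datamart objects.

```python
import pandas as pd
from sqlalchemy import create_engine

# Placeholder connection string for the Oracle datamart (AWS target would be analogous).
engine = create_engine("oracle+cx_oracle://user:password@host:1521/?service_name=DATAMART")

def monthly_load(as_of_month, path):
    """Append one month of forecasting output to the centralized datamart."""
    df = pd.read_csv(path)
    df["as_of_month"] = as_of_month   # hypothetical partition key used by the table partitioning logic
    df.to_sql("forecast_output", engine, if_exists="append", index=False)
```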

The Deliverables

  • Analyzed the end-to-end data flow and data elements​
  • Designed data models satisfying business requirements​
  • Processed and mapped forecasting input and output files​
  • Migrated data from legacy databases to the new sources ​
  • Built an Oracle datamart and a cloud-based data warehouse (Amazon Web Services) ​
  • Led the development team in building schemas, tables and views; process scripts to maintain data updates; and table partitioning logic
  • Resolved data issues with the source and assisted in reconciliation of results

GSE: ETL Solutions

The Problem

The client needed ETL solutions for handling data of any complexity or size in a variety of formats and/or from different upstream sources.​

The client's data management team extracted and processed data from different sources and different types of databases (e.g., Oracle, Netezza, Excel files, SAS datasets) and needed to load it into the client's Oracle and AWS datamarts for its revenue and loss forecasting processes.

The client’s forecasting process used very complex large-scale datasets in different formats which needed to be consumed and loaded in an automated and timely manner.

The Solution

RiskSpan was engaged to design, develop and implement ETL (Extract, Transform and Load) solutions for handling input and output data for the client's revenue and loss forecasting processes. This included dealing with large volumes of data and multiple source systems, and transforming and loading data to and from data marts and data warehouses.
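An ETL job of this shape can be sketched in Python as follows. The connection strings, query, file names, and the common column layout are assumptions used purely for illustration; the client's actual pipelines were implemented with tools such as Informatica.

```python
import pandas as pd
from sqlalchemy import create_engine

# Source and target connections are illustrative placeholders.
oracle_src = create_engine("oracle+cx_oracle://user:password@src-host:1521/?service_name=SRC")
target = create_engine("oracle+cx_oracle://user:password@dm-host:1521/?service_name=DATAMART")

def extract():
    return [
        pd.read_sql("SELECT * FROM loan_positions", oracle_src),  # relational source
        pd.read_excel("manual_adjustments.xlsx"),                 # Excel workbook
        pd.read_sas("historical_factors.sas7bdat"),               # SAS dataset
    ]

def transform(frames):
    # Map each source onto a common layout (columns here are hypothetical)
    common = [f.rename(columns=str.lower)[["loan_id", "as_of_date", "balance"]] for f in frames]
    return pd.concat(common, ignore_index=True)

def load(df):
    df.to_sql("forecast_inputs", target, if_exists="append", index=False)

load(transform(extract()))
```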

The Deliverables

  • Analyzed data sources and developed ETL strategies for different data types and sources​
  • Performed source target mapping in support of report and warehouse technical designs​
  • Implemented business-driven requirements using Informatica ​
  • Collaborated with cross-functional business and development teams to document ETL requirements and turn them into ETL jobs ​
  • Optimized, developed, and maintained integration solutions as necessary to connect legacy data stores and the data warehouses

Case Study: Web Based Data Application Build

The Client

Government Sponsored Enterprise (GSE)

The Problem

The Structured Transactions group of a GSE needed to offer broker-dealers a simpler way to create new restructured securities, one that provided the flexibility to do business at any hour and reduced dependence on the availability of Structured Transactions team members.

The Solution

RiskSpan led the development of a customer-facing web-based application for a GSE. Their structured transactions clients use the application to independently create pools of pools and re-combinable REMIC exchanges (RCRs) with existing pooling and pricing requirements.​

RiskSpan delivered the complete end-to-end technical implementation of the new portal.

The Deliverables

  • Developed a self-service web portal that provides RCR and pool-of-pool exchange capabilities and reporting features
  • Managed data flows from various internal sources to the portal, providing real-time calculations
  • Modern technology stack: Angular 2.0 for the front end, Java for web services
  • Development, testing, and configuration control methodology featured DevOps practices, a CI/CD pipeline, and 100% automated testing with Cucumber and Selenium
  • GIT, JIRA, Gherkin, Jenkins, Fisheye/Crucible, and SauceLabs for configuration control, testing, and deployment


SOFR, So Good? The Main Anxieties Around the LIBOR Transition

SOFR Replacing LIBOR

The London Interbank Offered Rate (LIBOR) is going away, and the international financial community is working hard to plan for and mitigate risks to make a smooth transition. In the United States, the Federal Reserve's Alternative Reference Rates Committee (ARRC) has recommended the Secured Overnight Financing Rate (SOFR) as the preferred replacement rate. The New York Fed began publishing SOFR regularly on April 3, 2018. In July 2018, Fannie Mae issued $6 billion in SOFR-denominated securities, leading the way for other institutions that have since followed suit. In November 2018, the Federal Home Loan (FHL) Banks issued $4 billion in debt tied to SOFR. CME Group, a derivatives and futures exchange company, launched 3-month and 1-month SOFR futures contracts in 2018. All of these steps to support liquidity and demonstrate SOFR demand are designed to create a rate more robust than LIBOR: the transaction volume underpinning SOFR rates is around $750 billion daily, compared to USD LIBOR's estimated $500 million in daily transaction volume. 

USD LIBOR is referenced in an estimated $200 trillion of financial contracts, of which 95 percent is derivatives. However, the remaining cash market is not small. USD LIBOR is referenced in an estimated: $3.4 trillion in business loans, $1.3 trillion in retail mortgages and other consumer loans, $1.8 trillion in floating rate debt, and $1.8 trillion in securitized products. 

The ARRC has held consultations on its recommended fallback language for floating rate notes and syndicated business loans; the responses are viewable on the ARRC website. On December 7, 2018, the ARRC published consultations on securitizations and bilateral business loans, both of which are open for comment through February 5, 2019. 

Amid the flurry of positive momentum in the transition towards SOFR, anxiety remains that the broader market is not moving quickly enough. ARRC consultations and working groups indicate that these anxieties derive primarily from a few specific points of debate: development of term rates, consistency of contracts, and implementation timing.

Term Rates

Because the SOFR futures market remains immature, term rates cannot be developed without significant market engagement with the newly created futures. The ARRC Paced Transition Plan includes a goal to create a forward-looking reference rate by end-of-year 2021 – just as LIBOR is scheduled to phase out. In the interim, financial institutions must figure out how to build into existing contracts fallback language or amendments that include a viable alternative to LIBOR term rates.  

The nascent SOFR futures market is growing quickly, with December 2018 daily trade volumes at nearly 16,000 contracts. These volumes nonetheless pale in comparison to Eurodollar futures, which averaged around 5 million contracts per day at CME Group alone. This puts SOFR on track according to the ARRC plan, but it means institutions remain in limbo until the futures market is more mature and term SOFR rates can be developed. 

In July 2018, the Financial Stability Board (FSB) stated its support for employing term rates primarily in cash markets, while arguing that spreads are tightest in derivative markets focused on overnight risk-free rates (RFRs), which are therefore preferred. An International Swaps and Derivatives Association (ISDA) FAQ document published in September 2018 explained the FSB's request that "ISDA should develop fallbacks that could be used in the absence of suitable term rates and, in doing so, should focus on calculations based on the overnight RFRs." This marks a major change, given that derivatives commonly reference 3-month LIBOR and cash products depend on forward-looking term rates. Despite the magnitude of the change, transitioning from LIBOR term rates to an alternative term rate based on limited underlying transactions would be undesirable.

The FSB explained:

Moving the bulk of current exposures referencing term IBOR benchmarks that are not sufficiently anchored in transactions to alternative term rates that also suffer from thin underlying markets would not be effective in reducing risks and vulnerabilities in the financial system. Therefore, the FSB does not expect such RFR-derived term rates to be as robust as the RFRs themselves, and they should be used only where necessary.

In a consultation report published December 20, 2018, ISDA stated that the overwhelming majority of respondents preferred fallback language with a compounded setting in arrears rate for the adjusted RFR, with a significant and diverse majority preferring the historical mean/median approach for the spread adjustment.

Though ISDA’s consultation report noted some drawbacks to the historical mean/median approach for the spread adjustment, the diversity of supporters – in all regions of the world, representing many types of financial institutions – was a strong indicator of market preference. By comparison, there was no ambiguity about preference for the RFR in fallback language: In almost 90 percent of ISDA respondent rankings, the compounded setting in arrears rate was selected as the top preference for the adjusted RFR. 

In the Structured Finance Industry Group (SFIG) LIBOR Task Force Green Paper, the group indicates a strong preference for viable term rates and leaves open the question of whether such calculations should be done in advance or in arrears, while expressing a preference for continuing to determine rates prospectively at the start of each term. The group lists its preferred waterfall of options as, first, an endorsed forward-looking term SOFR rate and, second, a compounded or average daily SOFR. SFIG is currently drafting its response to the ARRC Securitization Consultation, which will be made public on the ARRC website after submission. 

Despite stated preferences, working groups are making a concerted effort to follow the ARRC’s guidance to strive for consistency across cash and derivative products. Given the concerns about a viable term rate, some market participants in cash products are also exploring the realities of implementing ISDA’s recommended fallback language and intend to incorporate those considerations into their response to the ARRC consultations. 

In the absence of an endorsed term rate, pricing of other securities such as fixed-rate bonds is difficult, if not impossible. Additionally, the absence of an endorsed term rate creates issues of consistency within the rate itself (i.e., market standards will need to be developed around how and over what periods the rate is compounded). The currently predominant recommendation of compounding an overnight risk-free rate in arrears would also add complexity compared with any forward-looking rate, a challenge that is exacerbated in cash markets with consumer products, where changes must be fully disclosed and explained. Compounding in arrears would require a lock-out period at the end of a term to allow institutions time to calculate the compounded interest. Market standards and consumer agreement around the specific terms governing the lock-out period would be difficult to establish.
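For reference, compounding in arrears under the standard daily-compounding (ACT/360) convention is a simple product of daily growth factors, as in the sketch below. The rates and day counts shown are illustrative and not drawn from any actual fixing data.

```python
# Minimal sketch of a compounded-in-arrears period rate (ACT/360 convention).
def compounded_in_arrears(daily_rates, day_counts, day_basis=360):
    """daily_rates: decimal SOFR fixings; day_counts: calendar days each fixing applies."""
    growth = 1.0
    for rate, days in zip(daily_rates, day_counts):
        growth *= 1 + rate * days / day_basis
    total_days = sum(day_counts)
    return (growth - 1) * day_basis / total_days

# Example: three business-day fixings, the last spanning a weekend (3 calendar days)
print(compounded_in_arrears([0.0240, 0.0241, 0.0239], [1, 1, 3]))
```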

Consistency

While ISDA has not yet completed formal consultation specific to USD LIBOR and SOFR, and their analysis is only applicable to derivatives and swaps, there are several benefits to consistency across cash and derivatives markets. Consistency of contract terms across all asset classes during the transition away from USD LIBOR lowers operational, accounting, legal, and basis risk, according to the ARRC, and makes the change easier to communicate and negotiate with stakeholders.  

Though it is an easy case to make that consistency is advantageous, achieving it is not. For example, the Mortgage Bankers Association points out that the ISDA-selected compounding in arrears approach to interest accrual periods “would be a very material change from current practice as period interest expenses would not be determined until the end of the relevant period.” The nature of the historical mean/median spread adjustment does not come without drawbacks. ISDA’s consultation acknowledges that the approach is “likely to lead to value transfers and potential market disruption by not capturing contemporaneous market conditions at the trigger event, as well as creating potential issues with hedging.” Additionally, respondents acknowledge that relevant data may not yet be available for long lookback periods with the newly created overnight risk-free rates.  

The effort to achieve some level of consistency across the transition away from LIBOR poses several challenges related to timing. Because LIBOR will only be unsupported (rather than definitively discontinued) by the Financial Conduct Authority (FCA) at the end of 2021, some in the market retain a small hope that production of LIBOR rates could continue. The continuation of LIBOR is possible, but betting a portfolio of contracts on its continuation is an unnecessarily high-risk decision. That said, transition plans remain ambiguous about timing, and implementation of any contract changes is ultimately at the sole discretion of the contract holder. Earlier ARRC consultations acknowledged two possible implementation arrangements:   

  1. An “amendment approach,” which would provide a streamlined amendment mechanism for negotiating a replacement benchmark in the future and could serve as an initial step towards adopting a hardwired approach.  
  2. A “hardwired approach,” which would provide market participants with more clarity as to how a potential replacement rate will be identified and implemented. 

However, the currently open-for-comment securitizations consultation has dropped the “amendment” and “hardwired” terminology and now describes what amounts to the hardwired approach as defined above – a waterfall of options that is implemented upon occurrence of a predefined set of “trigger” events. Given that the securitizations consultation is still open for comment, it remains possible that market respondents will bring the amendment approach back into discussions.  

Importantly, in the U.S. there are currently no legally binding obligations for organizations to plan for the cessation of LIBOR, nor policy governing how that plan be made. In contrast, the European Union has begun to require that institutions submit written plans to governing bodies.

Timing

Because the terms of implementation remain open for discussion and organizational preference, there is some ambiguity about when organizations will begin transitioning contracts away from LIBOR to the preferred risk-free rates. In the structured finance market, this timing ambiguity compounds the challenge of consistency. For commercial real estate securities, for example, there is a possibility of a mismatch in the process and timing of the transition for rates in the index versus the underlying assets and resulting certificates or bonds. This potential challenge has not yet been addressed by the ARRC or other advisory bodies.

Mortgage Market

The mortgage market is still awaiting formal guidance. While the contributions by Fannie Mae and the FHLBanks to the SOFR market signal government sponsored entity (GSE) support for the newly selected reference rate, none of the GSEs has issued any commentary about recommended fallback language specific to mortgages or guidance on how to navigate the fact that SOFR does not yet have a viable term rate. An additional concern for consumer loan products, including mortgages, is the need to explain the contract changes to consumers. As a result, the ARRC Securitization consultation hypothesizes that consumer products are “likely to be simpler and involve less optionality and complexity, and any proposals would only be made after wide consultation with consumer advocacy groups, market participants, and the official sector.”  

For now, the Mortgage Bankers Association has recommended institutions develop a preliminary transition plan, beginning with a detailed assessment of exposures to LIBOR.

How Can RiskSpan Help?

At any phase in the transition away from LIBOR, RiskSpan can provide institutions with analysts experienced in contract review, experts in model risk management and sophisticated technical tools—including machine learning capabilities—to streamline the process to identify and remediate LIBOR exposure. Our diverse team of professionals is available to deliver resources to financial institutions that will mitigate risks and streamline this forthcoming transition.


Case Study: Loan-Level Capital Reporting Environment​

The Client

Government Sponsored Enterprise (GSE)

The Problem

A GSE and large mortgage securitizer maintained data from multiple work streams in several disparate systems, provided at different frequencies. Quarterly and ad-hoc data aggregation, consolidation, reporting and analytics required a significant amount of time and personnel hours. ​

The client desired configurable integration with source systems, automated acquisition of over 375 million records and performance improvements in report development.

The Solution

The client engaged RiskSpan Consulting Services to develop a reporting environment backed by an ETL Engine to automate data acquisition from multiple sources. 

The Deliverables

  • Reviewed system architecture, security protocol, user requirements and data dictionaries to determine feasibility and approach.​
  • Developed a user-configurable ETL Engine in Python to load data from different sources into a PostgreSQL data repository hosted on a Linux server. The engine provides real-time status updates and error tracking.
  • Developed the reporting module of the ETL Engine in Python to automatically generate client-defined Excel reports, reducing report development time from days to minutes (a minimal sketch of this pattern follows this list)
  • Made raw and aggregated data available for internal users to connect virtually any reporting tool, including Python, R, Tableau and Excel​
  • Developed a user interface, leveraging the API exposed by the ETL Engine, allowing users to create and schedule jobs as well as stand up user-controlled reporting environments​
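As a rough illustration of the reporting module described above, the sketch below pulls aggregated data from a PostgreSQL repository and writes a simple Excel report. The connection string, query, table, and report layout are placeholders, not the client's actual implementation.

```python
import pandas as pd
from sqlalchemy import create_engine

# Placeholder connection to the PostgreSQL data repository.
repo = create_engine("postgresql://user:password@linux-host:5432/capital_reporting")

def generate_report(quarter, output_path):
    """Pull aggregated capital data and write a client-defined Excel report."""
    df = pd.read_sql(
        "SELECT segment, SUM(upb) AS upb, SUM(capital) AS capital "
        "FROM loan_level_capital WHERE reporting_quarter = %(q)s GROUP BY segment",
        repo, params={"q": quarter},
    )
    with pd.ExcelWriter(output_path) as writer:
        df.to_excel(writer, sheet_name=f"Capital_{quarter}", index=False)

generate_report("2023Q4", "capital_report.xlsx")
```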

