Articles Tagged with: Data Management

How RiskSpan Helped a Credit-Focused Investment Management Firm Transition to Snowflake

A leading investment management firm and recognized leader in structured credit, including asset-backed securities (ABS), mortgage-backed securities (MBS), and other fixed-income sectors, sought RiskSpan’s help transitioning key data processing functions from the data management platform 1010data to Snowflake.

The ability to share data with partners using the same system in which the analytics are performed made the combination of RiskSpan and Snowflake especially attractive. The shift provided significant operational and financial benefits to the client, marking another successful milestone in RiskSpan’s history of helping clients optimize their data management.

Converting Key Functionalities from 1010data to Snowflake

The company had been relying on 1010data for several critical timeseries-based calculations. However, the limitations of the platform—both in terms of speed and cost—prompted them to seek a more modern solution. RiskSpan worked closely with them to replicate and enhance key functionalities using Snowflake. Converted functionalities included:

  1. Timeseries-Based Calculations: We re-engineered these to operate efficiently within Snowflake’s cloud-native architecture, maintaining accuracy while enhancing processing speeds.
  2. fill_nearest: This function retrieves the nearest non-N/A value within a group. It was implemented seamlessly using Snowflake’s window functions, preserving data integrity while boosting performance.
  3. rolling_sum: Snowflake’s SQL capabilities were leveraged to implement the moving sum of valid (non-N/A) values within a window. This provided the company with more responsive and scalable time-series analysis capabilities.
  4. cumulative_run_length: The cumulative run length within a group was translated into Snowflake’s environment using efficient SQL queries, making the entire process faster and more robust.
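To make the three converted functions concrete, here is a minimal pure-Python sketch of the behavior described above for a single group's ordered values. It is an illustration only, not the 1010data or Snowflake implementation. The tie-breaking rule in fill_nearest (prefer the earlier value), the trailing window in rolling_sum, and the "consecutive equal values" reading of cumulative_run_length are all assumptions.

```python
from typing import List, Optional, Sequence


def fill_nearest(values: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Replace each None with the closest non-None value by position.

    Assumption: when two known values are equally close, the earlier one wins.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    out: List[Optional[float]] = []
    for i, v in enumerate(values):
        if v is not None or not known:
            out.append(v)
            continue
        nearest = min(known, key=lambda k: (abs(k - i), k))
        out.append(values[nearest])
    return out


def rolling_sum(values: Sequence[Optional[float]], window: int) -> List[float]:
    """Sum the non-None values in a trailing window ending at each row."""
    out: List[float] = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(v for v in chunk if v is not None))
    return out


def cumulative_run_length(values: Sequence[object]) -> List[int]:
    """Count how many consecutive rows, up to and including each row,
    share the same value."""
    out: List[int] = []
    run = 0
    prev = object()  # sentinel that equals nothing in the data
    for v in values:
        run = run + 1 if v == prev else 1
        out.append(run)
        prev = v
    return out


# One group's values, already sorted by date.
print(fill_nearest([None, 1.0, None, None, 4.0, None]))   # [1.0, 1.0, 1.0, 4.0, 4.0, 4.0]
print(rolling_sum([1, None, 2, 3], window=2))              # [1, 1, 2, 5]
print(cumulative_run_length(["a", "a", "b", "a", "a", "a"]))  # [1, 2, 1, 1, 2, 3]
```

In a warehouse setting the same logic would typically be expressed with window functions partitioned by the group key and ordered by date; the Python version is only meant to pin down the expected results.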

Integration Capabilities

In addition to replicating 1010data’s core functionalities, the company sought to expand its data capabilities by integrating additional datasets such as Market Data and Home Price Indices (HPI). We showed them how to incorporate and analyze these datasets within Snowflake’s environment, further enhancing their decision-making capabilities.

This cross-functional integration was pivotal in showcasing Snowflake’s ability to streamline complex data workflows. By integrating third-party data directly into their ecosystem, our client can now generate more insightful reports and conduct deeper analysis across multiple datasets without leaving the Snowflake platform.

The Benefits of Transitioning to Snowflake

Our client experienced several immediate and impactful benefits by transitioning from 1010data to Snowflake. These included:

  • Complete Replacement of 1010data: With all critical functionalities successfully converted, the company can now fully discontinue its reliance on 1010data. This eliminates the need to maintain multiple platforms and simplifies its technology stack.
  • Significant Cost Savings: Discontinuing 1010data relieved our client of the high costs associated with the platform’s licensing and maintenance fees. Snowflake’s cost-efficient pricing model has already resulted in substantial savings for the company.
  • Improved Processing Speeds: One of the most noticeable changes has been the drastic improvement in the company’s processing times. Snowflake’s optimized cloud infrastructure provides faster data processing and querying capabilities, significantly reducing time-to-insight.
  • Access to Full Snowflake Feature Set: Moving to Snowflake has enabled the company to take advantage of features such as data sharing, enhanced security, and elasticity. Snowflake’s built-in scalability ensures our client’s data infrastructure will continue to grow effortlessly as its data needs expand.
  • Speed and Cost Efficiency: The company has expressed particular satisfaction with both the speed and cost-efficiency of the Snowflake platform. The reduction in data processing time and cost per query has positively impacted its business operations.

Partnering with RiskSpan not only enabled the company to replace 1010data with a more modern and efficient platform, but it has also empowered them to take advantage of Snowflake’s newest, advanced features, including AI.

Contact us to learn how RiskSpan can help you unlock the full potential of your data by guiding you through complex transitions and helping you implement scalable, secure, and cost-effective solutions.


Enhancing a HELOC Lender’s Operations with RiskSpan’s Data as a Service (DaaS)

A leading fintech company specializing in home equity lines of credit (HELOCs) was seeking to optimize the management of its data operations. To accomplish this, the company turned to RiskSpan, a leader in data analytics and financial technology solutions. Through a tailored Data as a Service (DaaS) offering, RiskSpan helped the company improve its HELOC business operations by providing advanced data management and modeling capabilities.

Challenges

The company sought to enhance its HELOC operations in two critical areas:

  1. Data Management and Integration: The company was dealing with complex data sets from multiple sources, including credit bureaus, property data, and customer behavior insights. Integrating and managing this data effectively was crucial for making informed lending decisions.
  2. Risk Assessment and Modeling: Accurate and reliable risk assessment models were necessary for evaluating customer behavior and predicting loan performance. The company required a solution that could model draw behavior and other variables specific to HELOCs.

RiskSpan’s DaaS Solution

RiskSpan’s DaaS offering provided the company with a comprehensive solution tailored to address these challenges. The key components of the solution included:

  1. Advanced Data Integration: RiskSpan’s DaaS platform seamlessly integrated the company’s various data sources, enabling a more streamlined and efficient data management process. This integration allowed the company to better understand their borrowers and make more informed lending decisions.
  2. Enhanced Loan-Level HELOC Pricing and Projections: The client successfully loaded its historical loan performance data onto RiskSpan’s DaaS platform and established a monthly process within the platform’s flexible data warehouse. Using the embedded historical performance tool, the client analyzed loan-level behavior across its portfolio. This enabled the client to generate detailed collateral performance reports for investors and rating agencies, as well as leverage these insights to enhance future projections and loan-level pricing for new loans.
  3. Cost-Effective Data Services: RiskSpan also identified an opportunity to replace the client’s existing data services provider at a significantly reduced cost. By offering a more competitive pricing structure while maintaining high-quality data services, RiskSpan positioned the client to achieve substantial cost savings, making them more competitive in the HELOC market.

Outcomes and Benefits

Implementing RiskSpan’s DaaS solution brought several key benefits:

  • Improved Decision-Making: With better-integrated data and more accurate modeling of HELOC draw behavior, the client could make more informed lending decisions, ultimately reducing risk and enhancing profitability.
  • Operational Efficiency: The streamlined data management process allowed the client to operate more efficiently, freeing up resources to focus on core business activities.
  • Cost Savings: RiskSpan’s competitive pricing enabled the client to cut costs significantly, improving their bottom line and allowing them to reinvest in other areas of the business.

RiskSpan’s Data as a Service solution provided the client with the tools it needed to optimize its HELOC business. By addressing its data integration challenges, improving risk assessment through advanced modeling, and offering a cost-effective alternative to existing data services, RiskSpan helped the client strengthen its market position and enhance overall business performance.


AI Prompt Structuring — Does it Even Matter?

At the intersection of human ingenuity and artificial intelligence, the importance of appropriately structured prompts is frequently underestimated. Within this dynamic (and, at times, delicate) ecosystem, the meticulous craftsmanship of prompts serves as the linchpin, orchestrating a seamless collaboration between human cognition and machine learning algorithms. Not unlike a conductor directing an ensemble, judicious prompt structuring lays the foundation for AI systems to synchronize with human intent, thereby facilitating the realization of innovative endeavors. Given the large number of interactions with Large Language Models (LLMs) that take place in 1:1 digital chats, it is important to prompt gen AI models carefully so that they generate accurate and tailored outputs.

Gartner predicts that more than 80% of enterprises will have used generative artificial intelligence (gen AI) APIs or deployed gen AI-enabled applications in production environments by 2026, up from less than 5% in 2023.[1] As gen AI adoption continues to accelerate, understanding proper prompt engineering structures and techniques is becoming more and more important.

With this in mind, we are going to discuss how critical prompt structure is to the accuracy of AI outputs. Specifically, we discuss how defining objectives, assigning roles, providing context, specifying the output format, and reviewing and refining each play a role in crafting effective prompts.

@Indian_Bronson. “salmon swimming in a river.” 15 Mar. 2023. X(Twitter), https://twitter.com/Indian_Bronson/status/1636213844140851203/photo/2. Accessed 3 Apr. 2024

Interacting with LLMs through a chat bot function may result in frustrations as users are faced with outputs that are not on par with their expectations. However, the more detail and clarity given to the model, the more resources it will have to understand and execute the task properly. In this context, “detail and clarity” means:

    1. Defining the objective
    2. Assigning roles and providing context
    3. Specifying the output format
    4. Reviewing and refining

1. Define the Objective
Some good questions to ask oneself before providing a prompt to the gen AI include: What needs to be done? What tone does it have to be in? What format do we need? A 2023 Stanford University study found that models are better at using relevant information that occurs at the very beginning or the end of the request.[2] Therefore, it is important to write prompts that are context-rich and concise, with the most important instructions placed at the start or the end.

2. Assign Roles and Provide Context
Arguably the most important part of prompting, providing context is critical because gen AI machines cannot infer meanings beyond the given prompts. Machines also lack the years of experience necessary to grasp the sense of what is needed and what is not without some explicit direction. The following principles are important to bear in mind:

Precision and Personalization: Providing detailed context and a clear role enables the AI system to deliver responses that are both accurate and tailored to individual user needs, preferences, and the specificity of the situation.

Delimiters such as XML tags and angle brackets (<>) are a great way to separate instructions, data, and examples from one another. Think of XML tags as hashtags on social media.

For example:

 

I want to learn about Mortgage Finance and its history

What are some key institutions in the industry?

 
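As an illustration, a prompt like this can be assembled programmatically so that each part is wrapped in its own tag. The sketch below uses only the Python standard library. The tag names `context` and `question` are assumptions chosen for this example, not names required by any model.

```python
from xml.sax.saxutils import escape


def tag(name: str, text: str) -> str:
    """Wrap text in an XML-style tag, escaping &, < and > inside the text
    so user content cannot be mistaken for a delimiter."""
    return f"<{name}>\n{escape(text.strip())}\n</{name}>"


def build_prompt(sections: dict) -> str:
    """Join tagged sections in insertion order, separated by blank lines."""
    return "\n\n".join(tag(name, text) for name, text in sections.items())


# Hypothetical tag names; the two sentences come from the example above.
prompt = build_prompt({
    "context": "I want to learn about Mortgage Finance and its history",
    "question": "What are some key institutions in the industry?",
})
print(prompt)
```

Keeping the context and the question in separate tags makes it easy to swap one without touching the other and lets the instructions refer to each part by name.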

Efficiency and Clarity in Communication: By understanding its expected role, whether as a consultant, educator, or support assistant, an AI application can adjust its communication style, level of detail, and prioritization accordingly. This alignment not only streamlines interactions but also ensures that the dialogue is efficiently directed towards achieving the user’s goals, minimizing misunderstandings and maximizing productivity.

Appropriateness and Ethical Engagement: Knowledge of the context in which it operates and the nuance of its role allows an AI to navigate sensitive situations with caution, ensuring that responses are both appropriate and considerate. Moreover, this awareness aids in upholding ethical standards in an AI’s responses, which is crucial for maintaining user trust and ensuring a responsible use of technology.

3. Specify the Output Format
In crafting a prompt for AI text generation, specifying the output format is crucial to ensuring that the generated output is not only relevant, but also suitable for the intended purpose and audience or stakeholders. To this end:

  • Provide clear instructions that include details of the text’s purpose, the audience it’s intended for, and any specific points or information that should be included. Clear instructions help prevent ambiguity and ensure that the AI produces relevant and coherent output.
  • Set the desired tone, language, and topics so that the output is properly tailored to a business need or setting, whether it is an informative email or a summary of a technical report. Outlining specific topics in combination with language and tone helps generate output that resonates with stakeholders at the appropriate level of formality and makes the purpose of the output clear to the end user.
  • Define constraints (length, count, tools, terminology) to help guide the AI’s text generation process within predetermined boundaries. These constraints ensure that the generated output meets the task’s requirements and is consistent with existing systems or workflows. They also minimize review time and reduce the need for follow-up prompts.
  • Supply output examples. This is a great way to combine all the above techniques for specifying the output format. Examples serve as reference points for style, structure, and content, helping the AI understand the desired outcome more effectively. By providing a tangible example to the gen AI, a user increases the likelihood of achieving a satisfactory result that aligns with expectations.
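The points above can be captured in a small reusable template. The following is a minimal sketch, assuming a hypothetical `PromptSpec` structure of our own invention. The field names and the rendered layout are illustrative choices, not a standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptSpec:
    """Hypothetical container for the elements of a structured prompt."""
    role: str
    objective: str
    audience: str
    tone: str
    constraints: List[str] = field(default_factory=list)
    example_output: str = ""

    def render(self) -> str:
        """Render the spec as plain text, one labeled element per line."""
        lines = [
            f"Role: {self.role}",
            f"Objective: {self.objective}",
            f"Audience: {self.audience}",
            f"Tone: {self.tone}",
        ]
        if self.constraints:
            lines.append("Constraints:")
            lines.extend(f"- {c}" for c in self.constraints)
        if self.example_output:
            lines.append("Example of the desired output:")
            lines.append(self.example_output)
        return "\n".join(lines)


spec = PromptSpec(
    role="You are a mortgage finance analyst.",
    objective="Summarize the attached technical report.",
    audience="Senior management",
    tone="Formal and concise",
    constraints=["No more than 150 words", "Use three bullet points"],
    example_output="- Prepayment speeds rose month over month ...",
)
print(spec.render())
```

Writing the elements down as named fields makes omissions visible before the prompt is sent, which also supports the review step discussed next.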

4. Review & Refine
Last but not least, review the prompt before submitting it to the gen AI. Check that terminology and technical terms are used consistently throughout the prompt and that formatting, such as tags and bullet points, is consistent, to avoid confusion in the responses. Make sure the prompt follows a logical flow and avoids repetition and unnecessary information, both to maintain the desired level of specificity and to avoid steering the response down an undesired path.

As users navigate the complexities of AI integration, keeping these prompting structures in mind helps maximize AI’s potential while mitigating the risks associated with misinformation.

Contact us to learn more about how we are helping our clients harness AI’s capabilities, informed by a strategic and mindful approach.


[1] “Gartner Says More than 80% of Enterprises Will Have Used Generative AI APIs or Deployed Generative AI-Enabled Applications by 2026.” Gartner, 11 Oct. 2023, www.gartner.com/en/newsroom/press-releases/2023-10-11-gartner-says-more-than-80-percent-of-enterprises-will-have-used-generative-ai-apis-or-deployed-generative-ai-enabled-applications-by-2026.

[2] Liu, Nelson F., et al. Lost in the Middle: How Language Models Use Long …, July 2023, cs.stanford.edu/~nfliu/papers/lost-in-the-middle.arxiv2023.pdf.


How RiskSpan and Snowflake Helped a Large Insurance Company Revolutionize Its Data Management

Background

Asset managers are increasingly turning to Snowflake’s cloud infrastructure to address the limitations of outdated databases. Migrating to Snowflake grants them access to a sustainable and secure platform that enables efficient data storage, processing, and analytics. This transition empowers asset managers to streamline operations, improve data accessibility, and reduce costs associated with maintaining on-premises infrastructure.

Client Challenge

A large insurance company’s asset management team was seeking to improve its approach to data management in response to its increasingly complex investment portfolio. The company recognized that transitioning to Snowflake would serve as a foundation for sustainable data analysis for years to come.

Desiring a partner to assist with the transition, the life insurer turned to RiskSpan – a preferred Snowflake partner with substantial experience in database architecture and management.

Specifically, the insurance company sought to achieve the following:

Systems Consolidation: Data stored across multiple transactional systems had contributed to data fragmentation and inefficiencies in data retrieval and analysis. The client sought to establish and maintain a consistent source of asset data for enterprise consumption and reporting.

Improved Reporting Capabilities: Quantifying full risk exposures in fast-moving situations proved challenging, leaving the institution vulnerable to unforeseen market fluctuations. Consequently, the client sought to improve its asset evaluation and risk assessment process by incorporating comprehensive look-through data and classification information. The need for various hierarchical classifications further complicated data access and reporting. In particular, the client needed to streamline the production of ad-hoc exposure reports, which often took several weeks and involved teams of people.

Reduction of Manual Processes: The client needed more automated data extraction processes in order to create exposure reports across different asset classes in a more time-efficient manner with less risk of human error. 

Reduction of Infrastructure Constraints: On-premises infrastructure had defined capacity limitations, hindering scalability and agility in data processing and analysis.

RiskSpan’s Approach and Solutions Implemented

Collaborative Partnership: RiskSpan worked closely with the client’s IT, risk management, and analytics teams throughout the project lifecycle, fostering collaboration and ensuring alignment with organizational goals and objectives.

Comprehensive Assessment: Together, we conducted a thorough assessment of the client’s existing data infrastructure, analytics capabilities, and business requirements to identify pain points and opportunities for improvement.

Strategic Planning: Based on the assessment findings, the collective team developed a strategic roadmap outlining the migration plan to the unified data platform, encompassing asset data consolidation, portfolio analytics enhancement, and reporting automation.

Unified Data Platform: Leveraging modern technologies, including cloud-based solutions and advanced analytics tools, RiskSpan orchestrated the integration of various data sources and analytics capabilities. Together, we consolidated asset data from various transactional systems into a unified data platform, providing a single source of truth for comprehensive asset evaluation and risk assessment.

Data Lineage Tracking: The team employed dbt Labs tools to build, validate, and deploy flexible reporting solutions from the Snowflake cloud infrastructure.  This enabled the tracking of data lineage, adjustments, and ownership.

Daily Exposure Reporting: Leveraging automated analytic pipelines, we enabled real-time generation of exposure reports across different asset classes, enhancing the client’s ability to make timely and informed decisions.

Automated Data Extraction: We automated the data extraction processes, reducing manual intervention and streamlining data retrieval, cleansing, and transformation workflows.

Hierarchical Classification Framework: We implemented a hierarchical classification framework, providing standardized and consistent data hierarchies for improved data access and reporting capabilities.
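To illustrate how a hierarchical classification supports exposure reporting, the sketch below rolls position values up every level of a classification path. It is a simplified illustration, not the client’s implementation. The hierarchy levels, asset labels, and values are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# A position is (classification path from broadest to narrowest, market value).
Position = Tuple[Tuple[str, ...], float]


def roll_up(positions: Iterable[Position]) -> Dict[Tuple[str, ...], float]:
    """Total exposure at every level of the hierarchy.

    Each position contributes its value to every prefix of its path, so
    parent totals always equal the sum of their children.
    """
    totals: Dict[Tuple[str, ...], float] = defaultdict(float)
    for path, value in positions:
        for depth in range(1, len(path) + 1):
            totals[path[:depth]] += value
    return dict(totals)


# Hypothetical positions (values in millions).
positions = [
    (("Fixed Income", "Structured", "Agency MBS"), 120.0),
    (("Fixed Income", "Structured", "ABS"), 30.0),
    (("Fixed Income", "Corporate", "Investment Grade"), 50.0),
]
exposure = roll_up(positions)
print(exposure[("Fixed Income",)])               # 200.0
print(exposure[("Fixed Income", "Structured")])  # 150.0
```

Because every report reads from the same standardized paths, an ad-hoc exposure question becomes a lookup at the right level rather than a manual reconciliation across systems.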

Transformative Outcomes

Enhanced Decision-making: Implementing advanced analytics capabilities and exposure reporting empowered our client to make informed decisions more quickly, mitigating risks and capitalizing on market opportunities.

Operational Efficiency: Automation of data extraction, analytics modeling, and reporting processes resulted in significant operational efficiencies, reducing time-to-insight and enabling resource reallocation to strategic initiatives.

Scalability and Agility: The migration to a cloud-based infrastructure provides scalability and agility, allowing our client to adapt quickly to changing business needs and accommodate future growth without infrastructure constraints.

Data Governance and Compliance: The implementation of standardized hierarchical classifications strengthened data governance and compliance, ensuring data consistency, integrity, and regulatory adherence.

By leveraging Snowflake’s scalable architecture and advanced features, this large asset manager is now positioned to maneuver both its current and future data landscapes. The implementation of Snowflake not only streamlined data management processes but also empowered the organization to extract valuable insights with unprecedented efficiency. As a result, the asset manager can make data-driven decisions confidently, enhance operational agility, and drive sustainable growth in a rapidly evolving market landscape.


RiskSpan Launches MBS Loan Level Historical Data on Snowflake Marketplace

ARLINGTON, Va., June 18, 2024 – RiskSpan, a leading provider of data analytics and risk management solutions for the mortgage industry, announced today that it has launched MBS Loan Level Historical Data on Snowflake Marketplace. RiskSpan’s MBS Loan Level Historical Data on Snowflake Marketplace enables joint customers to access RiskSpan’s normalized and enriched loan-level data for Fannie Mae, Freddie Mac, and Ginnie Mae mortgage-backed securities.

“We are thrilled to join the Snowflake Marketplace and offer our loan-level MBS data to a wider audience of Snowflake users,” said Janet Jozwik, Senior Managing Director at RiskSpan. “This is a first step in what we believe will ultimately become a cloud-based analytical hub for MBS investors everywhere.”

RiskSpan and Snowflake, the AI Data Cloud company, are working together to help joint customers inform business decisions and drive innovations by enabling them to query the data using SQL, join it with other data sources, and scale up or down as needed. RiskSpan also provides sample code and calculations to help users get started with common metrics such as CPR, aging curves, and S-curves.

“RiskSpan’s launch of a unique blend of enriched data onto Snowflake Marketplace represents a major opportunity for Snowflake customers to unlock new value through data on their business journey,” said Kieran Kennedy, Head of Marketplace at Snowflake. “We welcome RiskSpan to the ecosystem and look forward to exploring how we can support our customers as they look to leverage the breadth of the Snowflake platform more effectively.”

Joint customers can now leverage Loan-Level MBS Data on Snowflake Marketplace, allowing them to access RiskSpan data enhancements, including servicer normalization, refinements, mark-to-market LTV calculations, and current coupon. These and other enhancements make it easier and faster for users to perform analysis and modeling.

Snowflake Marketplace is powered by Snowflake’s ground-breaking cross-cloud technology, Snowgrid, allowing companies direct access to raw data products and the ability to leverage data, data services, and applications quickly, securely, and cost-effectively. Snowflake Marketplace simplifies discovery, access, and the commercialization of data products, enabling companies to unlock entirely new revenue streams and extended insights across the AI Data Cloud. To learn more about Snowflake Marketplace and how to find, try and buy the data, data services, and applications needed for innovative business solutions, click here.

About RiskSpan, Inc. 

RiskSpan delivers a single analytics solution for structured finance and private credit investors of any size to confidently make faster, more precise trading and portfolio risk decisions and meet reporting requirements with fewer resources, and less time spent managing multiple vendors and internal solutions. Learn more at www.riskspan.com.


The newest, fastest and easiest way to access and analyze Agency MBS data

TL;DR Summary of Benefits

  • Data normalization and enhancement: RiskSpan’s MBS data on Snowflake normalizes Fannie, Freddie, and Ginnie loan-level data, consolidating everything into one set of field names. It also offers enhanced loan-level data fields, including current coupon, spec pool category, and mark-to-market LTV, which are not available in the raw data from the agencies. The data also includes pool-level factors like pool prefix and pool age, as well as full loan histories not available from the GSEs directly.
  • Data access and querying: Users access the data in Snowflake using SQL or Python connectors. Snowflake functions essentially as a cloud SQL server that allows for instantaneous data sharing across entities. In just a few clicks, users can start analyzing MBS data using their preferred coding language, with no data, ETL, or IT teams required.
  • Data merging and analytics: Users can merge the data in Snowflake with other available loan level or macroeconomic data, including interest rates, home prices, and unemployment, for advanced analytics. Users can also project performance, monitor portfolios, and create spec pools, among other features.

The Problem

Even though Fannie, Freddie and Ginnie have been making MBS performance data publicly available for years, working with the raw data can be challenging for traders and back-office analysts.

Traders and analysts already have many of the tools they need to write powerful queries that can reveal hidden patterns and insights across different markets – patterns that can reveal lucrative trading opportunities based on prepayment analysis. But one big obstacle often stands in the way of getting the most out of these tools: the data from the agencies is large and unwieldy and is not formatted in a consistent way, making it hard to compare and combine.

What’s more, the Agencies do not maintain a full history of published data on their websites for download. Only recent history is available.

The Solution: RiskSpan’s new MBS loan-level historical offering on Snowflake Marketplace

Using RiskSpan’s new MBS Loan-Level Historical Data Offering, MBS traders and analysts can now leverage the power of Snowflake, the leading cloud data platform, to perform complex queries and merge data from multiple sources like never before.

This comprehensive data offering provides a fully normalized view of the entire history of loan-level performance data across Agencies – allowing users to interact with the full $9T Agency MBS market in unprecedented ways.

A list of normalized Fannie and Freddie fields can be found at the end of this post.

In addition to being able to easily compare different segments of the market using a single set of standardized data fields, MBS traders and analysts also benefit from derived and enhanced data, such as current coupon, refinance incentive, current loan-to-value ratio, original specified pool designation, and normalized seller and servicer names.

The use cases are practically limitless.

MBS traders and analysts can track historical prepayment speeds, find trading opportunities that offer relative value, and build, improve, or calibrate prepayment models. They can see how prepayment rates vary by loan size, credit score, geographic location, or other factors. They can also identify pools that have faster or slower prepayments than expected and exploit the differences in price.

Loan originators can see how their loans perform compared to similar loans issued by other originators, servicers, or agencies, allowing them to showcase their ability to originate high-quality loans that command premium pricing.

Enhanced fields provide users with more comprehensive insights and analysis capabilities. They include a range of derived and enhanced data attributes beyond the standard dataset: derived fields useful for calculations, additional macroeconomic data, and normalized field names and enumerations. These fields give users the flexibility to customize their analyses by incorporating additional data elements tailored to their specific needs or research objectives.

Enhanced loan-level fields include:

  • Refi Incentive: The extent to which a borrower’s interest rate exceeds current prevailing market rates
  • Spread at Origination (SATO): The difference between a loan’s note rate and the prevailing market mortgage rate at the time the loan was originated
  • Servicer Normalization: A standardization of servicer names to ensure consistency and accuracy in reporting and analysis
  • Scheduled Balance: A helper field necessary to easily calculate CPR and other performance metrics
  • Spec Pool Type: A designation of the type of spec story on the loan’s pool at origination
  • Current LTV: A walked-forward LTV based on FHFA’s HPI and the current balance of the loan

Not available in the raw data from the agencies, these fields allow MBS traders and analysts to seamlessly project loan and pool performance, monitor portfolios, create and evaluate spec pools, and more.
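To show how a few of these fields feed common metrics, here is a minimal sketch using the standard definitions of single monthly mortality (SMM) and conditional prepayment rate (CPR). It assumes that the scheduled balance is the balance a loan would have at period end if only scheduled payments were made. RiskSpan’s exact field definitions and its HPI roll-forward method may differ, and the loan values below are made up.

```python
from typing import Iterable, Tuple


def pool_smm(loans: Iterable[Tuple[float, float]]) -> float:
    """SMM for a set of loans given (scheduled_balance, current_balance) pairs.

    SMM = unscheduled principal paid / scheduled balance.
    """
    loans = list(loans)
    scheduled = sum(s for s, _ in loans)
    current = sum(c for _, c in loans)
    if scheduled <= 0:
        return 0.0
    return max(scheduled - current, 0.0) / scheduled


def cpr(smm: float) -> float:
    """Annualize a monthly prepayment rate: CPR = 1 - (1 - SMM)^12."""
    return 1.0 - (1.0 - smm) ** 12


def refi_incentive(note_rate: float, market_rate: float) -> float:
    """Borrower's rate minus the prevailing market rate, in percentage points."""
    return note_rate - market_rate


def current_ltv(current_balance: float, original_value: float,
                hpi_at_origination: float, hpi_now: float) -> float:
    """LTV in % using a home value walked forward by the change in an HPI
    (assumed method: scale the original value by the index ratio)."""
    walked_forward_value = original_value * hpi_now / hpi_at_origination
    return current_balance / walked_forward_value * 100.0


# Two hypothetical loans: (scheduled balance, actual current balance).
smm = pool_smm([(100_000.0, 100_000.0), (200_000.0, 197_000.0)])
print(round(smm, 4))       # 0.01
print(round(cpr(smm), 4))  # 0.1136
print(refi_incentive(6.5, 5.75))  # 0.75
print(round(current_ltv(180_000.0, 250_000.0, 200.0, 240.0), 2))  # 60.0
```

Having the scheduled balance available as a field means the prepayment calculation reduces to simple sums and one division, which is why it is described above as a helper field.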

Access the Data on Your Terms

Traders and analysts can access the data in Snowflake using SQL or Python connectors. Alternatively, they can also access the data through the Edge UI, our well-established product for ad hoc querying and visualization. RiskSpan’s Snowflake listing provides sample queries and a data dictionary for reference. Data can be merged with macroeconomic data from other sources – rates, HPI data, unemployment – for deeper insights and analytics.
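For readers who prefer Python, the sketch below shows one way such a query might be issued through the Snowflake Connector for Python (the `snowflake-connector-python` package). The column names come from the normalized field list in this post. The table name, connection parameters, and start date are placeholders and assumptions, and the sample queries in the listing itself may look different.

```python
import datetime
from typing import Any, Dict, List, Tuple


def build_balance_query(table: str) -> str:
    """Total current balance by factor date and agency.

    `table` must be a trusted, fully qualified name, since it is
    interpolated directly; the date filter is passed as a bound parameter.
    """
    return (
        "SELECT FACTORDATE, AGENCY, SUM(CURRBALANCE) AS TOTAL_BALANCE\n"
        f"FROM {table}\n"
        "WHERE FACTORDATE >= %(start)s\n"
        "GROUP BY FACTORDATE, AGENCY\n"
        "ORDER BY FACTORDATE, AGENCY"
    )


def run_query(sql: str, params: Dict[str, Any], **connect_kwargs: Any) -> List[Tuple]:
    """Execute a query and return all rows.

    The import is local so the module loads even where the connector
    is not installed.
    """
    import snowflake.connector

    conn = snowflake.connector.connect(**connect_kwargs)
    try:
        cur = conn.cursor()
        try:
            cur.execute(sql, params)
            return cur.fetchall()
        finally:
            cur.close()
    finally:
        conn.close()


# Hypothetical table name; replace with the name shown in your account.
sql = build_balance_query("MY_DATABASE.MY_SCHEMA.LOAN_HISTORY")
params = {"start": datetime.date(2024, 1, 1)}
# rows = run_query(sql, params, account="...", user="...", password="...",
#                  warehouse="...")
```

The call to `run_query` is left commented out because it needs real credentials. The same SQL can be pasted into a Snowflake worksheet with the parameter replaced by a date literal.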

The listing is available for a 15-day free trial and can be purchased on a monthly or annual basis. Users don’t need to have a Snowflake account to try it out. Learn more and get started at the Snowflake Marketplace or contact us to schedule a demo or discussion.

Fannie/Freddie Normalized Fields

NAME | TYPE | DESCRIPTION
AGE | Number | Loan Age in Months
AGENCY | Varchar | FN [Fannie Mae], FH [Freddie Mac]
ALTDQRESOLUTION | Varchar | Payment deferral type: CovidPaymentDeferral, DisasterPaymentDeferral, PaymentDeferral, Other/NA
BORROWERASSISTPLAN | Varchar | Type of Assistance: Forbearance, Repayment, TrialPeriod, OtherWorkOut, NoWorkOut, NotApplicable, NotAvailable
BUSINESSDAYS | Number | Business Days in Factor Period
COMBINEDLTV | Float | Original Combined LTV
CONTRIBUTION | Float | Contribution of Loan to the Pool, used to correctly attribute Freddie Mirror Pools
COUPON | Float | Net Coupon or NWAC in %
CURRBALANCE | Float | Current Balance Amount
CURRENTCOUPON | Float | Primary rate in the market (PMMS)
CURRENTLTV | Float | Current Loan to Value Ratio based on rolled-forward home value calculated by RiskSpan based on FHFA All-Transaction data
CURTAILAMOUNT | Float | Dollar amount curtailed in the period
DEFERRALAMOUNT | Float | Dollar amount deferred
DQSTRING | Varchar | Delinquency History String; the leftmost field is the current period
DTI | Float | Debt to Income Ratio %
FACTORDATE | Date | Performance Period
FICO | Number | Borrower FICO Score [300,850]
FIRSTTIMEBUYER | Varchar | First time home buyer flag: Y, N, NA
ISSUEDATE | Date | Loan Origination Date
LOANPURPOSE | Varchar | Loan Purpose: REFI, PURCHASE, NA
LTV | Float | Original Loan to Value Ratio in %
MATURITYDATE | Date | Loan Maturity Date
MICOVERAGE | Float | Mortgage Insurance Coverage %
MOSDELINQ | Varchar | Delinquency Status: Current, DQ_30_Day, DQ_60_Day, DQ_90_Day, DQ_120_Day, DQ_150_Day, DQ_180_Day, DQ_210_Day, DQ_240_Day, DQ_270_Day, DQ_300_Day, DQ_330_Day, DQ_360_Day, DQ_390_Day, DQ_420_Day, DQ_450_Day, DQ_480_Day, DQ_510_Day, DQ_540_Day, DQ_570_Day, DQ_600_Day, DQ_630_Day, DQ_660_Day, DQ_690_Day, DQ_720pls_Day
MSA | Varchar | Metropolitan Statistical Area
NUMBEROFBORROWERS | Number | Number of Borrowers
NUMBEROFUNITS | Varchar | Number of Units
OCCUPANCYTYPE | Varchar | Occupancy Type: NA, INVESTOR, OWNER, SECOND
ORIGBALANCE | Float | Original Loan Balance
ORIGSPECPOOLTYPE | Varchar | Spec Story of the pool that the loan is a part of. Please see Spec Pool Logic in our linked documentation
PERCENTDEFERRAL | Float | Percentage of the loan balance that is deferred
PIW | Varchar | Property Inspection Waiver Type: Appraisal, Waiver, OnsiteDataCollection, GSETargetedRefi, Other, NotAvailable
POOLAGE | Number | Age of the Pool
POOLID | Varchar | Pool ID
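
Because MOSDELINQ is delivered as a Varchar label rather than a number, analyses that bucket loans by delinquency often need to decode it first. The sketch below is one possible way to do that in Python, assuming the values follow exactly the patterns listed for MOSDELINQ above. The helper names and the 90-day default threshold are hypothetical choices for illustration, not part of the RiskSpan listing.

```python
import re

# Matches labels such as "DQ_30_Day" and the open-ended "DQ_720pls_Day".
_DQ_PATTERN = re.compile(r"^DQ_(\d+)(pls)?_Day$")

def days_delinquent(status):
    """Return the day count encoded in a MOSDELINQ value (0 for 'Current')."""
    if status == "Current":
        return 0
    match = _DQ_PATTERN.match(status)
    if match is None:
        raise ValueError(f"Unrecognized MOSDELINQ value: {status!r}")
    # For "DQ_720pls_Day" this returns 720, i.e. the lower bound of the bucket.
    return int(match.group(1))

def is_seriously_delinquent(status, threshold_days=90):
    """Flag loans at or beyond a chosen threshold (90 days is an assumed default)."""
    return days_delinquent(status) >= threshold_days

for label in ["Current", "DQ_60_Day", "DQ_90_Day", "DQ_720pls_Day"]:
    print(label, days_delinquent(label), is_seriously_delinquent(label))
```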


Transforming Loan Data Management Using Snowflake Secure Data Sharing

Presenters

Paul Gross

Senior Quantitative Analyst, Rithm Capital

Michael Cowley

Principal, Data Cloud Products, Snowflake

Bernadette Kogler

CEO, RiskSpan

Suhrud Dagli

CTO, RiskSpan

Wednesday, May 29th, 2024

1:00 ET

Hear from a distinguished panel including RiskSpan and Snowflake customers as they describe how Data Share has transformed their approach to mortgage investment. Specific topics to include:

  • High-speed data processing using Snowflake for easy delivery of risk analytics and diligence data
  • How Snowflake’s Data Sharing facilitates data access across and between organizations while maximizing computational performance and flexibility 
  • How Snowflake protects client data
  • The unique value of a central hub for all mortgage industry data and never having to FTP a file again



Snowflake and the Future of Data Sharing Across Financial Institutions

The digitization of the financial services industry has opened countless doors to streamlining operations, building customer bases, and more accurately modeling risk. Capitalizing on these opportunities, however, requires financial institutions to address the immense data storage and sharing demands that digitization creates.

Recognizing this need, Snowflake has emerged as an industry-leading provider of cloud-computing services for the financial industry. According to estimates, some 57 percent of financial services companies in the Fortune 500 have partnered with Snowflake to address their data needs.[1] In this article, we highlight some of Snowflake’s revolutionary data sharing capabilities that have contributed to this trend and to RiskSpan’s decision to become a Snowflake partner.

Financial institutions contemplating migration to the cloud are beset by some common concerns. Chief among these are data sharing capabilities and storage costs. Fortunately, Snowflake is well equipped to address both. 

Data Sharing Between Snowflake Customers

Ordinarily, sharing information across institutions inflates storage costs and raises security and data integrity concerns.

Snowflake’s Secure Data Sharing eliminates these concerns because no physical data transfer occurs between accounts. When one Snowflake customer desires to share data with another Snowflake customer, a services layer and metadata store facilitate all sharing activities. As a result, shared data does not occupy any storage in the institution consuming the data, nor does it impact that institution’s monthly data storage expenses. Data consumers are only charged for the compute resources, such as virtual warehouses, they use to query the shared data.  

The setup for Secure Data Sharing is streamlined and straightforward for data providers, while consuming institutions can access shared data almost instantaneously.   

Organizations can easily: 

  • Establish a share from a database within their account, granting access to specified objects within that database.  
  • Share data across multiple databases, provided all databases are under the same account.  
  • Add, remove, and edit access for all users. 
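
The provider-side setup outlined above can be sketched as a short sequence of Snowflake SQL statements. The Python helper below only assembles those statements as strings so that the order of operations is visible; it does not connect to Snowflake. The function name and every object name in the usage example (share, database, schema, table, consumer account) are hypothetical placeholders.

```python
def build_share_statements(share, database, schema, tables, consumer_accounts):
    """Assemble the SQL a provider might run to share selected tables.

    Identifiers are interpolated as-is (no quoting or validation), so this
    is a sketch for trusted inputs only.
    """
    statements = [
        f"CREATE SHARE {share};",
        # The share needs usage on the containing database and schema
        # before individual objects can be granted.
        f"GRANT USAGE ON DATABASE {database} TO SHARE {share};",
        f"GRANT USAGE ON SCHEMA {database}.{schema} TO SHARE {share};",
    ]
    for table in tables:
        statements.append(
            f"GRANT SELECT ON TABLE {database}.{schema}.{table} TO SHARE {share};"
        )
    # Finally, make the share visible to the consuming accounts.
    accounts = ", ".join(consumer_accounts)
    statements.append(f"ALTER SHARE {share} ADD ACCOUNTS = {accounts};")
    return statements

share_sql = build_share_statements(
    share="loan_share",
    database="mortgage_db",
    schema="agency",
    tables=["loan_performance"],
    consumer_accounts=["consumer_org.consumer_acct"],
)
print("\n".join(share_sql))
```

Removing a consumer later follows the same pattern with `ALTER SHARE ... REMOVE ACCOUNTS`.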

Data Sharing with Non-Snowflake Customers

For institutions desiring to share data with non-Snowflake customers, Snowflake offers an alternative secure data sharing method, known as a “reader account.” Reader accounts allow data sharing without requiring consumers to register for Snowflake. Each reader account is associated exclusively with the provider account that established it and can access data only from that provider. Individuals using a reader account can query shared data but cannot carry out DML operations such as data loading, insertions, updates, and other data manipulations. This makes reader accounts a cost-effective option for organizations seeking to limit the number of more expensive user profiles.
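
As a minimal sketch of how a provider might set this up, the helpers below assemble the two Snowflake SQL statements involved: creating the reader account and then adding it to an existing share. They only build strings and do not connect to Snowflake. The function names and all values in the usage example are hypothetical, and in practice the password would come from a secrets store rather than source code.

```python
def create_reader_account_sql(account_name, admin_name, admin_password):
    """Build a CREATE MANAGED ACCOUNT statement for a reader account."""
    # Double any single quotes so the password stays a valid SQL string literal.
    escaped = admin_password.replace("'", "''")
    return (
        f"CREATE MANAGED ACCOUNT {account_name} "
        f"ADMIN_NAME = {admin_name}, ADMIN_PASSWORD = '{escaped}', TYPE = READER;"
    )

def add_reader_to_share_sql(share, reader_locator):
    """Build the statement that exposes an existing share to the reader account.

    `reader_locator` is the account locator Snowflake reports when the
    reader account is created.
    """
    return f"ALTER SHARE {share} ADD ACCOUNTS = {reader_locator};"

print(create_reader_account_sql("reader_acct", "reader_admin", "placeholder'pw"))
print(add_reader_to_share_sql("loan_share", "AB12345"))
```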

Secure Sharing with Data Clean Rooms

Clean room managed accounts are another way for Snowflake customers to share data with non-Snowflake customers. Data providers create data clean rooms to avoid privacy concerns when sharing their data. Data consumers can compile aggregated results and analyses but cannot query the original raw data. Data providers can granularly control how their data is accessed and the types of analysis that can be run on it. The data is encrypted and further protected by differential privacy techniques.
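
To make the idea of "aggregates only, with differential privacy" more concrete, the sketch below releases per-group counts with Laplace noise and suppresses small groups. It is a conceptual illustration in plain Python, not Snowflake's clean room implementation and not a rigorous privacy guarantee. The function names, the minimum group size, and the `epsilon` default are all assumptions.

```python
import random
from collections import defaultdict

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_group_counts(records, group_key, min_group_size=3, epsilon=1.0, rng=None):
    """Return noisy counts per group; raw records never leave this function."""
    rng = rng or random.Random()
    counts = defaultdict(int)
    for record in records:
        counts[record[group_key]] += 1
    released = {}
    for group, count in counts.items():
        if count < min_group_size:
            continue  # suppress groups too small to report safely
        # A count changes by at most 1 per record, so the noise scale is 1/epsilon.
        released[group] = count + laplace_noise(1.0 / epsilon, rng)
    return released

records = [{"AGENCY": "FN"}] * 4 + [{"AGENCY": "FH"}]
result = noisy_group_counts(records, "AGENCY", rng=random.Random(0))
print(result)
```

In this example the single FH record is withheld, and the FN count is returned only in perturbed form.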

How Can RiskSpan Help?

Knowing that you want to be on Snowflake isn’t always enough. Getting there can be the hardest part: many organizations face challenges migrating from legacy systems and lack the expertise to fully utilize new technology after implementation. RiskSpan has partnered with numerous companies to guide them toward a sustainable framework that holistically addresses all their data needs. No matter where an organization is in its data journey, RiskSpan has the expertise to help overcome the challenges associated with the new technology.

RiskSpan is equipped to help institutions with the following as they embark on their Snowflake migration journey: 

  • End-to-end migration services, including architecture design, setting up the Snowflake environment, and properly validating the new platform.   
  • Adaptive project management. 
  • Data governance including the creation of a data catalog, tracing data lineage, and compliance and security requirements. 
  • Establishing data warehouses and data pipelines to facilitate collaboration and analysis. 
  • Creating security protocols including role-based access controls, disaster recovery solutions, and ensuring the utmost protection of personally identifiable information.   
  • Optimizing extract, transform, and load (ETL) solutions.

Snowflake’s data sharing capabilities offer an innovative solution for businesses looking to leverage real-time data without the hassle of traditional data transfer methods. These features not only enhance operational efficiency but also provide the scalability and security necessary for handling extensive datasets in a cloud environment.

Contact us with any questions or to discuss how Snowflake can be tailored to your specific needs.


Connect with us at SFVegas 2024

Click Here to book a time to connect

RiskSpan is delighted to be sponsoring SFVegas 2024!

Connect with our team there to learn how we can help you move off your legacy systems, streamline workflows and transform your data.


Don’t miss these RiskSpan presenters at SFVegas 2024

Bernadette Kogler

Housing Policy:
What’s Ahead
Mon, Feb 26th, 1:00 PM

Tom Pappalardo

Future of Fintech
Wed, Feb 28th, 9:15 AM

Divas Sanwal

Big Data & Machine Learning: Impacts on Origination
Wed, Feb 28th, 11:05 AM

Can’t make the panels?

Click here to make an appointment to connect. Or just stop by Booth 13 in the exhibit hall!


Snowflake Tutorial Series: Episode 3

Using External Tables Inside Snowflake to work with Freddie Mac public data (13 million loans across 116 fields)

Using Freddie Mac public loan data as an example, this five-minute tutorial succinctly demonstrates how to:

  1. Create a storage integration
  2. Create an external stage
  3. Grant access to stage to other roles in Snowflake
  4. List objects in a stage
  5. Create a file format
  6. Read/Query data from external stage without having to create a table
  7. Create and use an external table in Snowflake
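
For readers who prefer text to video, the seven steps above can be sketched as the Snowflake SQL statements they roughly correspond to. The Python helper below only assembles the statements as strings in tutorial order and does not connect to Snowflake. It is not taken from the tutorial itself: the S3 storage provider, the pipe delimiter, and every object name, URL, and role ARN in the usage example are assumptions for illustration.

```python
def build_external_table_statements(integration, stage, file_format, ext_table,
                                    bucket_url, role_arn, reader_role):
    """Assemble one SQL statement per tutorial step (identifiers are not validated)."""
    return [
        # 1. Storage integration (assumes the files live in Amazon S3)
        f"CREATE STORAGE INTEGRATION {integration} "
        f"TYPE = EXTERNAL_STAGE STORAGE_PROVIDER = 'S3' ENABLED = TRUE "
        f"STORAGE_AWS_ROLE_ARN = '{role_arn}' "
        f"STORAGE_ALLOWED_LOCATIONS = ('{bucket_url}');",
        # 2. External stage pointing at the bucket through the integration
        f"CREATE STAGE {stage} STORAGE_INTEGRATION = {integration} URL = '{bucket_url}';",
        # 3. Let another role use the stage
        f"GRANT USAGE ON STAGE {stage} TO ROLE {reader_role};",
        # 4. List the files visible in the stage
        f"LIST @{stage};",
        # 5. File format (pipe-delimited text is an assumption)
        f"CREATE FILE FORMAT {file_format} TYPE = CSV FIELD_DELIMITER = '|';",
        # 6. Query staged files directly, without creating a table
        f"SELECT t.$1, t.$2 FROM @{stage} (FILE_FORMAT => '{file_format}') t LIMIT 10;",
        # 7. External table over the same location
        f"CREATE EXTERNAL TABLE {ext_table} WITH LOCATION = @{stage} "
        f"FILE_FORMAT = (FORMAT_NAME = '{file_format}') AUTO_REFRESH = FALSE;",
    ]

steps = build_external_table_statements(
    integration="freddie_int",
    stage="freddie_stage",
    file_format="freddie_pipe_format",
    ext_table="freddie_loans_ext",
    bucket_url="s3://example-bucket/freddie/",
    role_arn="arn:aws:iam::123456789012:role/example-snowflake-role",
    reader_role="analyst",
)
for number, statement in enumerate(steps, start=1):
    print(number, statement)
```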

This is the third in a 10-part tutorial series demonstrating how RiskSpan’s Snowflake integration makes mortgage and structured finance analytics easier than ever before.

Episode 1, Setting Up a Database and Uploading 28 Million Mortgage Loans, is available here.

Episode 2, Using Python User-Defined Functions in Snowflake SQL, is available here.

Future topics will include:

  • OLAP vs OLTP and hybrid tables in Snowflake
  • Time Travel functionality, clone and data replication
  • Normalizing data and creating a single materialized view
  • Dynamic tables data concepts in Snowflake
  • Data share
  • Data masking
  • Snowpark: Data analysis (pandas) functionality in Snowflake
