Those involved with catastrophe modeling within the property and casualty insurance industry have no shortage of data. Most cat teams have access to detailed exposure data, model outputs from multiple vendors, decades of historical loss experience, third-party hazard layers, and increasingly, AI-driven property intelligence. The greater challenge is turning that data into timely, decision-ready insight across day-to-day operations, renewals, regulatory submissions, capital reviews, and event response.
Moody’s launched Risk Data Lake (RDL) in 2025 to help teams convert data into insights through scalable infrastructure, cataloged data, and integrated analytical tooling. Today, we are announcing the general availability of a feature that enables querying all RDL data using Spark SQL as a fully integrated experience within Risk Data Lake.
The impact of data transfer taxes on analytical efficiency
Across the insurance market, a familiar pattern persists. A modeling team processes exposure data and produces model results. That data then leaves the modeling environment: it is exported, reformatted, loaded into local tools, joined with in-house data, charted, reviewed, and revised before a decision can be made.
When a new run is required with different inputs, settings, or assumptions, the whole cycle repeats. During peak periods, such as renewal cycles, regulatory submissions, or catastrophe events, the cumulative cost of this data movement can add weeks of elapsed time that have nothing to do with the analytical question itself.
The industry has spent the last decade attempting to address this challenge with general-purpose data lakes. The results have been mixed. As we have heard from our customers, sheer scale and modern tools alone do not produce business value when a data lake lacks an insurance-domain context, integrated workflows, and embedded analytics. Without those elements, a data lake can become an underutilized storage environment rather than a source of analytical advantage.
What makes Risk Data Lake different
Risk Data Lake is designed differently from traditional data lakes available in the market. It is an analytical environment built specifically for catastrophe risk analytics and embedded directly in Moody’s Intelligent Risk Platform™ (IRP).
Data assets stored in the IRP—portfolios, policies, locations, events, model output, and accumulation results—are presented in a structured, queryable form through Risk Data Lake’s catalog. Customers can also bring their own data, including loss experience, third-party data, and custom event sets, and analyze it alongside data from the IRP tenant.
Risk Data Lake includes a comprehensive set of analytical tools that support a broad spectrum of analytical needs, from drag-and-drop workflows for dashboard development to SQL-based data exploration and advanced analytics through programmable notebooks.
Launching the next generation of Risk Data Lake
The first version of Risk Data Lake launched in October 2025 with a reporting engine that enables users to design and publish dynamic dashboards.
With this June 2026 release of Risk Data Lake, the ability to query all data using SQL becomes an integrated experience within the product. In addition, users can now access and navigate Risk Data Lake’s catalog while writing SQL queries. Risk Data Lake also enables the import of external data in CSV format and allows users to create datasets within their own user space or in a shared tenant space.
Risk Data Lake includes four core capabilities in this release:
- Risk Data Catalog: A governed inventory of exposures, losses, intermediate outputs, and metadata stored in the IRP.
- Tenant Workspace: Managed schemas for customer-owned datasets, derivative views, and uploaded reference data, accessible alongside the catalog.
- SQL Execution: The analytical surface within RDL for writing and running Spark SQL directly against the catalog and Tenant Workspace, with the same identity and access control defined at the IRP level.
- Reporting Engine: Pre-built reporting datasets and a dashboard designer for creating and publishing dynamic dashboards and reports.
Risk Data Catalog and Tenant Workspace
The Risk Data Catalog and Tenant Workspace are the primary data management areas of Risk Data Lake. For the June 2026 release, users have full access to these assets through the left navigation of Risk Data Lake.
The Risk Data Catalog makes data from the Moody's Intelligent Risk Platform (IRP) tenant discoverable, accessible, and queryable. The catalog is read-only. As users perform operations within the IRP, such as executing a model run, the resulting data assets are reflected in the Risk Data Catalog and can be queried within minutes.
The catalog presents exposure data in a schema aligned with the Exposure Data Module (EDM), while model results are presented in a schema aligned with the Results Data Module (RDM). This approach allows Risk Data Lake users to access and analyze data without having to learn an entirely new schema.
In addition to data already present in the IRP tenant, users may create derivative datasets or import external data into Risk Data Lake to enhance their analytics. For example, users might want to import historical loss information to be included in the analytics they are developing in Risk Data Lake. With this release, Risk Data Lake natively supports the import of CSV-formatted external data. We will continue to expand import capabilities to include additional data formats such as Excel, shapefiles, and Parquet.
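As a rough illustration of the historical-loss example, the sketch below parses a small CSV and joins it to modeled results. It uses Python's built-in csv and sqlite3 modules as a local stand-in, not Risk Data Lake or Spark SQL. The table names (modeled_loss, historical_loss), the column names, and all figures are hypothetical and made up for illustration; they are not the Risk Data Lake catalog schema.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content; in practice this would be a file prepared for import.
HISTORICAL_LOSS_CSV = """portfolio_id,event_year,incurred_loss
P1,2019,1200000
P1,2022,800000
P2,2021,450000
"""


def load_csv_rows(text):
    """Parse CSV text into (portfolio_id, event_year, incurred_loss) tuples."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        (row["portfolio_id"], int(row["event_year"]), float(row["incurred_loss"]))
        for row in reader
    ]


# In-memory database standing in for the analytical environment.
conn = sqlite3.connect(":memory:")

# Hypothetical modeled results (average annual loss per portfolio).
conn.execute("CREATE TABLE modeled_loss (portfolio_id TEXT, aal REAL)")
conn.executemany(
    "INSERT INTO modeled_loss VALUES (?, ?)",
    [("P1", 500000.0), ("P2", 300000.0)],
)

# The imported CSV becomes a second table that can be joined to the first.
conn.execute(
    "CREATE TABLE historical_loss "
    "(portfolio_id TEXT, event_year INTEGER, incurred_loss REAL)"
)
conn.executemany(
    "INSERT INTO historical_loss VALUES (?, ?, ?)",
    load_csv_rows(HISTORICAL_LOSS_CSV),
)

# Compare modeled AAL with the total incurred loss from the imported history.
COMPARISON_SQL = """
SELECT m.portfolio_id,
       m.aal,
       SUM(h.incurred_loss) AS total_incurred,
       COUNT(*)             AS loss_years
FROM modeled_loss AS m
JOIN historical_loss AS h ON h.portfolio_id = m.portfolio_id
GROUP BY m.portfolio_id, m.aal
ORDER BY m.portfolio_id
"""
comparison = conn.execute(COMPARISON_SQL).fetchall()
for row in comparison:
    print(row)
```

The point of the sketch is only the shape of the workflow: once external data sits next to model output, the comparison is a single join rather than an export-and-merge exercise.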
Storing imported and/or derived datasets
Creating derivative datasets and importing external data requires an area within the catalog that supports write operations. Risk Data Lake delivers this capability through Tenant Workspace. Each user receives a dedicated area under the user_data schema, identified by user ID. This area is accessible only to the user to whom it belongs.
For sharing data across users or teams, the tenant_data schema should be used. This area is accessible to all Risk Data Lake users, but only members of the Risk Data Lake Admin group can create or delete datasets.
The combination of user space and tenant space allows clients to manage and govern datasets with different access patterns.
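The access pattern described above can be summarized in a few lines of Python. This is a simplified model of the rules as stated in this article, not Risk Data Lake's implementation: the User class, the can_read and can_write helpers, and the user IDs are hypothetical, and the group name string is an assumption about how the Admin group might be labeled.

```python
from dataclasses import dataclass

# Assumed label for the admin group; the real group identifier may differ.
ADMIN_GROUP = "Risk Data Lake Admin"


@dataclass(frozen=True)
class User:
    user_id: str
    groups: frozenset = frozenset()


def can_read(user, schema, owner_id=None):
    """Return True if the user may read datasets in the given area."""
    if schema == "tenant_data":
        return True  # shared space: readable by all Risk Data Lake users
    if schema == "user_data":
        return owner_id == user.user_id  # private space: owner only
    return False


def can_write(user, schema, owner_id=None):
    """Return True if the user may create or delete datasets in the area."""
    if schema == "tenant_data":
        return ADMIN_GROUP in user.groups  # only admins manage shared datasets
    if schema == "user_data":
        return owner_id == user.user_id  # users manage their own area
    return False


analyst = User("u123")
admin = User("u456", frozenset({ADMIN_GROUP}))

print(can_read(analyst, "tenant_data"))                # shared data is visible
print(can_write(analyst, "tenant_data"))               # but not writable
print(can_write(admin, "tenant_data"))                 # admins can publish
print(can_read(admin, "user_data", owner_id="u123"))   # private areas stay private
```

Under these assumptions, a typical flow is for an analyst to prototype a dataset in user_data and for an admin to publish the reviewed version to tenant_data.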
SQL execution
One of the most significant enhancements to Risk Data Lake is the ability to query all of its data using Spark SQL. We have developed a new SQL editor that enables users to query data from both Risk Data Catalog and Tenant Workspace.
Within the SQL editor, users can browse the Risk Data Lake catalog, construct and execute queries, review results, and download outputs. Users can also create and save query files, review query execution history, and access analytical toolkits that Moody’s will publish in Risk Data Lake.
Risk Data Lake supports Spark SQL 4.0 as its query dialect. This dialect differs somewhat from the Microsoft SQL Server syntax familiar to many industry practitioners. While users may need to update existing SQL Server-based queries for use in Risk Data Lake, feedback from current RDL clients indicates that this transition is manageable.
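To give a sense of the kind of changes involved, the sketch below rewrites three common SQL Server constructs into forms that Spark SQL accepts: SELECT TOP n becomes a trailing LIMIT n, square-bracket identifiers become backtick-quoted identifiers, and two-argument ISNULL becomes COALESCE. The function tsql_to_spark_sql is a hypothetical helper written for this article, not a Moody's tool, and the table and column names in the sample query are illustrative only. It is regex-based and handles only simple, single-statement queries; it would mishandle brackets inside string literals, TOP inside subqueries, and many other dialect differences.

```python
import re


def tsql_to_spark_sql(query: str) -> str:
    """Rewrite a few simple SQL Server idioms into Spark SQL equivalents."""
    # Drop surrounding whitespace and a trailing semicolon.
    sql = query.strip().rstrip(";").strip()

    # [Identifier] -> `Identifier`
    sql = re.sub(r"\[([^\]]+)\]", r"`\1`", sql)

    # ISNULL(a, b) -> COALESCE(a, b)
    sql = re.sub(r"\bISNULL\s*\(", "COALESCE(", sql, flags=re.IGNORECASE)

    # SELECT TOP n ... -> SELECT ... LIMIT n (leading TOP only)
    match = re.match(r"SELECT\s+TOP\s+(\d+)\s+", sql, flags=re.IGNORECASE)
    if match:
        sql = "SELECT " + sql[match.end():] + " LIMIT " + match.group(1)

    return sql


# Illustrative query; the table and column names are assumptions.
legacy = (
    "SELECT TOP 5 [LocationID], ISNULL([County], 'Unknown') AS County "
    "FROM [Location] ORDER BY [LocationID];"
)
converted = tsql_to_spark_sql(legacy)
print(converted)
```

A mechanical rewrite like this is only a starting point. Converted queries still need review, since functions, data types, and NULL handling can differ between the two dialects in ways a pattern match will not catch.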
Data access controls
One of the most important challenges in working with general-purpose data lakes is access control. Clients on the IRP invest significant effort in implementing access controls for data stored in the IRP tenant, using groups to determine which users can view specific data assets. A data lake environment that bypasses those controls and makes all data accessible to all users is not acceptable.
Risk Data Lake enforces the same access controls defined in the IRP tenant. As a result, the same SQL query executed by two different users may return different results based on their respective access levels. Data lakes have not traditionally offered this capability; Risk Data Lake delivers it without requiring users to take additional steps.
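The effect can be pictured with a toy example in Python. This is a conceptual sketch only and says nothing about how the IRP or Risk Data Lake enforces permissions internally. The Portfolio class, the group names, the user names, and the values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Portfolio:
    name: str
    allowed_group: str  # hypothetical: the group permitted to view this asset
    total_insured_value: float


# Made-up data assets, each visible to one group.
PORTFOLIOS = [
    Portfolio("US Wind 2026", "us_property", 4.0e9),
    Portfolio("EU Flood 2026", "eu_property", 2.5e9),
    Portfolio("EU Quake 2026", "eu_property", 1.5e9),
]

# Made-up group memberships.
USER_GROUPS = {
    "alice": {"us_property", "eu_property"},
    "bob": {"eu_property"},
}


def run_same_query(user_id):
    """Return (portfolio names, summed TIV) for assets the user can see.

    Every caller runs the same logic; only the visible rows differ.
    """
    groups = USER_GROUPS.get(user_id, set())
    visible = [p for p in PORTFOLIOS if p.allowed_group in groups]
    names = [p.name for p in visible]
    total = sum(p.total_insured_value for p in visible)
    return names, total


alice_result = run_same_query("alice")
bob_result = run_same_query("bob")
print(alice_result)
print(bob_result)
```

In this sketch the filtering happens before the aggregation, so a user with narrower access sees a smaller total rather than an error. That matters when comparing results across colleagues, because a mismatch may reflect access scope rather than a data problem.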
Why this matters for business
Moody's Risk Data Lake provides a strong analytical foundation, built on scalable cloud-based compute, for clients to develop advanced risk analytics and derive timely insights.
Clients using Risk Data Lake for analytical workflows, including model validation and comparative analytics, have reported double-digit efficiency gains in their most data-intensive workflows. These clients can now process large volumes of data without exporting it outside the IRP tenant.
Organizations that have evaluated or implemented traditional data lakes have often found them lacking the domain specialization required for risk analytics. These organizations are extracting analytical value from Risk Data Lake and developing workflows that combine Risk Data Lake with their own enterprise data lake environments.
To explore how Risk Data Lake fits your analytical priorities, contact your Moody’s account team.
Companion reading:
- Risk Data Lake: applied risk analytics on the Moody's Intelligent Risk Platform by Kirtan Dave—comparative analytics on the IRP, with a worked validation dashboard example.
- From model output to executive insight in minutes: How Moody's Risk Data Lake is changing catastrophe model validation by Iuliia Shustikova—the nine-question validation framework on Moody's RMS™ Australia Bushfire HD Model, and the Model Validation Starter Kit.