Date: 2025-11-25

Figure 1: Storm Prediction Center Storm Reports from May 8, 2024 (Source: NOAA SPC)

It is tempting to take SPC storm reports at face value and infer trends in the behavior of sub-perils or intense storms. But in catastrophe modeling, where only decades of data are available to model tens of thousands of years of activity, developers must be cautious not to overfit to uncertain ‘trends.’

These SPC reports are just that: observations. Someone has seen a tornado or hailstorm and reported it. Because the record depends on who was there to report, apparent trends in SPC data can arise from reporting bias rather than from changes in the storms themselves. It calls to mind the age-old question: "If a tree falls in a forest and no one hears it, does it make a sound?"

There may be systematic underestimation. Intense storms in urban areas may receive many reports from ‘storm watchers’, but storms in less populated rural areas, especially small-to-medium-sized events, may be underreported, skewing the recorded frequency and intensity of storms.

Thanks to our SCS modelers, Moody’s takes great pains to correct biases in the historical record, but a (re)insurer may have limited resources to take these corrections into account for its validation projects. So, how should a user begin to validate an SCS hazard model?

Model validation toolkit: How to overcome observational uncertainty

Here are three opening thoughts to guide model validation efforts, with an emphasis on where we know observations to be most reliable:

1. Focus on data from populated regions

We can be most confident in reports from suburban and urban areas, where population density and infrastructure ensure that severe weather events are well observed and their impacts accurately recorded.

Modern technology and exposure growth also give us the highest confidence in more recent parts of the record, with more detection systems and people available to observe storms.

To overcome reporting bias, Moody’s trains machine learning techniques to correlate reports from densely populated regions with the convective conditions at the time, as defined by weather reanalysis data.

We can combine these techniques with expected levels of undercounting from across the entire country to fill in the observational gaps.
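As a rough illustration of the idea, the sketch below learns how often a report occurs under a given convective environment using only densely populated cells, then applies those rates to rural cells to estimate how many reports would be expected there. It swaps in a simple binned frequency estimate for the machine learning techniques described above and is not Moody’s method. The record layout, the field names (dense, env_bin, reported) and all values are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy records, one per grid cell and day. "env_bin" stands in for
# a binned convective environment taken from reanalysis; all values are made up.
records = (
    [{"dense": True, "env_bin": "high", "reported": r} for r in (1, 1, 1, 0)]
    + [{"dense": True, "env_bin": "low", "reported": r} for r in (1, 0, 0, 0)]
    + [{"dense": False, "env_bin": "high", "reported": r} for r in (1, 0, 0, 0)]
    + [{"dense": False, "env_bin": "low", "reported": r} for r in (0, 0, 0, 0)]
)


def fit_report_rates(rows):
    """Estimate P(report | environment) from densely populated cells only."""
    hits, totals = defaultdict(int), defaultdict(int)
    for row in rows:
        if row["dense"]:
            totals[row["env_bin"]] += 1
            hits[row["env_bin"]] += row["reported"]
    return {env: hits[env] / totals[env] for env in totals}


def expected_reports(rows, rates):
    """Reports expected if every cell were observed as well as dense cells.

    Raises KeyError for an environment never seen in the dense training cells.
    """
    return sum(rates[row["env_bin"]] for row in rows)


rates = fit_report_rates(records)
rural = [row for row in records if not row["dense"]]
observed_rural = sum(row["reported"] for row in rural)
expected_rural = expected_reports(rural, rates)
reporting_ratio = observed_rural / expected_rural

print(rates)  # report rate per environment, learned from dense cells
print(observed_rural, expected_rural, reporting_ratio)
```

In this toy data the rural cells log 1 report against 4 expected, a reporting ratio of 0.25. A real application would need far more data, a richer description of the convective environment, and some treatment of uncertainty in the fitted rates.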

As a result, a model should generally produce more severe hazard than the observed record, especially in rural areas and in earlier years, while remaining consistent with observed severity in urban and suburban regions.
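A validation team could encode this expectation as a simple screening rule. The sketch below is a minimal example, assuming a single hazard metric (for instance, an annual event count) is available for both the model and the observed record in each region. The function name, the region classes and the 25% tolerance are illustrative assumptions, not values from Moody’s.

```python
def validation_flag(modeled, observed, region_class, tolerance=0.25):
    """Screen one modeled-versus-observed comparison.

    modeled, observed: the same hazard metric from the model and the record.
    region_class: "urban", "suburban" or "rural" (hypothetical labels).
    tolerance: allowed relative gap in well-observed regions (assumed value).
    """
    if observed <= 0:
        return "no observations to compare"
    ratio = modeled / observed
    if region_class in ("urban", "suburban"):
        # Well-observed regions: the model should sit close to the record.
        if abs(ratio - 1.0) <= tolerance:
            return "ok"
        return "differs from well-observed record"
    # Rural regions: the record is likely undercounted, so the model
    # should not fall below it.
    if ratio >= 1.0:
        return "ok"
    return "below an undercounted record"


checks = [
    validation_flag(1.1, 1.0, "urban"),
    validation_flag(1.6, 1.0, "urban"),
    validation_flag(2.0, 1.0, "rural"),
    validation_flag(0.6, 1.0, "rural"),
]
for result in checks:
    print(result)
```

The rule is asymmetric on purpose: a model that exceeds the record in a rural region is consistent with undercounting, while the same gap in an urban region deserves a closer look. The same pattern could be applied to earlier versus more recent years of the record.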