How Harvard's Hurricane Maria Study Actually Worked
The much-publicized study suggests that the death toll due to Hurricane Maria was 4,645 people. Here's how it was done.
The study on everyone’s minds this week is this one out of Harvard, appearing in the New England Journal of Medicine: “Mortality in Puerto Rico after Hurricane Maria”.
The study has appropriately gotten a lot of attention, but it’s also generated a whole lot of incorrect interpretations. Part of this is due to the fact that President Trump, in the aftermath of the storm, compared the deaths due to Maria with those due to Hurricane Katrina.
That statement has turned “death count” into a surrogate for “adequate government response”, hence media coverage like this from the Daily Beast.
The official death count is currently 64. The Harvard paper puts the death count at 4,645. So what is going on here? How hard is it to figure out how many people died in a natural disaster?
It’s actually really hard. So let me take you through the methodology of the Maria study to see how they got to their final death count of 4,645 deaths.
First off, we need to distinguish between direct and indirect deaths due to a storm. Direct deaths are those, like drowning, that can be tied to the storm itself. In Puerto Rico, direct deaths need to be confirmed by the Institute of Forensic Sciences and a medical examiner. That official death count of 64 reflects direct deaths but is probably an underestimate given the high bar for certification. But there are also indirect causes of death after a storm – what if an individual died because they couldn’t get to a hospital to be treated for a heart attack, or because they couldn’t get medicine for their chronic condition? What if they are killed in a traffic accident because the traffic lights stop working? What if they are killed during a robbery because an alarm system is cut off?
There are two ways to approach the indirect death problem. First, you can take each death occurring after the storm and simply ask, “Do I think this death would have occurred if there hadn’t been a Hurricane Maria?” This approach is subjective and can lead to wildly different estimates.
The more robust approach, and the one used in this study, takes a broader view: look at the number of deaths that typically occur in a year in Puerto Rico (the background death rate), then look at the deaths that occurred in the period after Maria struck on September 20th. The difference is the “excess mortality”.
Here’s the monthly death rate in Puerto Rico in 2016.
And now I’ll add 2017.
You do indeed see a spike around the time of the Hurricane in September.
The difference here is the excess mortality that occurred after the Hurricane.
Here’s the first caveat: excess mortality is a population phenomenon. We can’t say that any individual death is due specifically to Maria – only that more deaths than expected happened. The destruction caused by Maria is the most obvious contributing factor, but not necessarily the only one.
There’s a problem with that graph, though, which relied on death certificate data. The death certificate data in Puerto Rico is incomplete. This is why the count for December is so low, for example – the data hasn’t all been released yet.
OK so we can’t use death certificates.
What the Harvard team did was a household survey. But you can’t knock on every door in Puerto Rico, so you take a random sample of households. They sampled just over 3000.
To be accurate, a sample needs to be random, but also needs to have adequate coverage. If you just pick a random 3000 houses in Puerto Rico, you’re apt to get a sample of people living in cities. The researchers accounted for this by stratifying the sample by “remoteness” from city centers, ensuring they had representative data from even sparsely populated areas.
Basically, they conducted interviews in a few houses in each district, asking those who lived there whether a family member died in 2017 and, if so, when. The mortality rates for those households are averaged to represent the rate in the district as a whole. The rates in districts of similar remoteness are then averaged to create a stratified estimate, and those estimates are weighted by population to get the overall number. In this way, a survey of about 3000 households can represent a territory of more than 3 million people.
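To make that roll-up concrete, here is a minimal sketch of a stratified, population-weighted estimate. Everything in it is an assumption for illustration: the stratum names, populations, and survey counts are invented, the function names are mine, and pooling each district into a single deaths-per-person rate is a simplification. It is not the study's data or the authors' actual code.

```python
from statistics import mean

# Hypothetical numbers, invented purely for illustration (NOT the study's data).
# Each stratum is a remoteness category; each district is (deaths, people surveyed).
STRATA = {
    "least remote": {"population": 2_000_000, "districts": [(1, 250), (0, 240), (1, 260)]},
    "intermediate": {"population": 1_000_000, "districts": [(1, 230), (1, 250)]},
    "most remote": {"population": 300_000, "districts": [(2, 240), (1, 220)]},
}


def stratum_rate(districts):
    """Average the per-district death rates within one remoteness category."""
    return mean(deaths / people for deaths, people in districts)


def estimated_deaths(strata):
    """Scale each stratum's rate up to its population, then sum across strata."""
    return sum(s["population"] * stratum_rate(s["districts"]) for s in strata.values())


def overall_rate(strata):
    """Population-weighted death rate for the whole territory."""
    total_population = sum(s["population"] for s in strata.values())
    return estimated_deaths(strata) / total_population


print(f"Estimated deaths territory-wide: {estimated_deaths(STRATA):,.0f}")
print(f"Overall rate per 1,000 people: {1000 * overall_rate(STRATA):.2f}")
```

The point of the weighting is that a sparsely populated stratum with a high death rate adds fewer total deaths than a dense stratum with the same rate, even though both get surveyed.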
The surveys turned up 38 deaths in the three months after Maria. Using this methodology, those 38 deaths get scaled up to 12,178 individuals territory-wide. The expected number of deaths over those three months was 7,533. The difference is 4,645 extra deaths, the number in all the headlines.
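The arithmetic is simple enough to check by hand, but here it is spelled out as a quick sketch. The figures come from the text above; the variable names are my own.

```python
# Figures quoted in this article; the variable names are illustrative.
surveyed_deaths = 38        # deaths reported in the survey after Maria
estimated_deaths = 12_178   # those deaths scaled up to the whole territory
expected_deaths = 7_533     # deaths expected over the same period
official_count = 64         # official (direct) death count

# Excess mortality is simply observed minus expected.
excess_deaths = estimated_deaths - expected_deaths

# How many territory-wide deaths each surveyed death ends up standing for.
deaths_per_surveyed_death = estimated_deaths / surveyed_deaths

# How far apart the excess estimate and the official count are.
ratio_to_official = excess_deaths / official_count

print(excess_deaths)                     # 4645
print(round(deaths_per_surveyed_death))  # about 320
print(round(ratio_to_official))          # about 73
```

Each death found in the survey stands in for roughly 320 deaths across the territory, which is why the uncertainty around the final number matters so much.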
Some might be quick to argue that extrapolating from just 38 deaths seems to be a stretch. To be fair, though, assuming the randomness of the survey was preserved, you can use statistics to quantify how uncertain that estimate is. In this case, the authors report a 95% confidence interval that’s pretty wide, ranging from 793 to as many as 8,498 deaths. But any way you cut it, that is way more than 64.
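To get a feel for why the interval is so wide, here is a back-of-the-envelope sketch. It is not the authors' method. I am assuming the 38 surveyed deaths behave like a Poisson count, using a normal approximation for the 95% range, and scaling up by the same factor as the point estimate. It ignores the survey design and any uncertainty in the expected deaths.

```python
import math


def rough_excess_interval(surveyed_deaths, estimated_deaths, expected_deaths, z=1.96):
    """Crude 95% range for excess deaths, treating the surveyed count as Poisson.

    This is an illustrative approximation, not the study's actual calculation.
    """
    # Territory-wide deaths represented by each surveyed death.
    scale = estimated_deaths / surveyed_deaths
    # For a Poisson count, the standard deviation is the square root of the count.
    half_width = z * math.sqrt(surveyed_deaths)
    low = (surveyed_deaths - half_width) * scale - expected_deaths
    high = (surveyed_deaths + half_width) * scale - expected_deaths
    return low, high


low, high = rough_excess_interval(38, 12_178, 7_533)
print(round(low), round(high))  # roughly 773 and 8517
```

Even this crude version lands in the same neighborhood as the published interval, which suggests most of the width comes simply from how few deaths the survey had to work with.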
There’s another compelling tidbit that suggests a really significant increase in the death rate after the hurricane. Remember that the surveyors asked about deaths that occurred in 2017. The entire year. They found 18 deaths that occurred in the nine months prior to the hurricane, and 38 deaths in the three months after. That’s a startling increase, assuming dates are being recalled correctly, using the same metric.
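Put as monthly rates, the jump looks like this. This is a rough sketch using the round figures above (nine months before, three months after); the hurricane actually hit on September 20th, so the true periods are slightly different, and the variable names are mine.

```python
# Survey counts quoted above, with the article's round month counts.
deaths_before, months_before = 18, 9
deaths_after, months_after = 38, 3

# Deaths per month among the surveyed households in each period.
rate_before = deaths_before / months_before
rate_after = deaths_after / months_after

print(f"Before Maria: {rate_before:.1f} deaths per month")  # 2.0
print(f"After Maria:  {rate_after:.1f} deaths per month")   # 12.7
print(f"Ratio: {rate_after / rate_before:.1f}x")            # 6.3x
```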
This may also be an underestimate. If the surveyors went to a house that was empty or abandoned, it wasn’t counted. Someone who lived alone and died leaves no one behind to answer the survey, so by definition the death rate among people living alone is zero in this study, which is clearly too low. Assuming that people living alone had the same death rate after Maria as they would have any other time of year, the excess mortality number goes up to 5,740.
Is the official number off by a factor of 70? Not really. The official number, as I pointed out, is based on extremely strict criteria. These two numbers measure different things. In my opinion, to assess the true impact of a disaster, you really should include indirect deaths. But it’s disingenuous to imply that these were two efforts designed to measure the same thing.
So… what about Hurricane Katrina? While I find it a bit distasteful to “rank” disasters using body counts, the comparison is being bandied about because, well, Trump made the comparison.
The best data came out in 2013 in a journal called Disaster Medicine and Public Health Preparedness. It used death certificates to determine that there were 971 Katrina-related deaths in Louisiana.
But these were, by and large, direct deaths – coroners had listed ICD-10 Code X37, “victim of cataclysmic storm” on the death certificate. Fully 40% of these deaths were due to drowning. In contrast, in the Harvard study just 10% of deaths were attributed by family members to the Hurricane itself. About a third were caused by delayed or prevented access to medical care.
So did anyone look at the aftermath of Katrina with a similar approach? This study came close.
The authors looked at a New Orleans newspaper’s death notices in the aftermath of Katrina, compared them to similar time periods pre-Katrina, and found about a 50% increase, which translates to around 1500 excess deaths in the New Orleans area – quite a bit greater than the official number. This is also an underestimate, considering many deaths may not have been reported to the newspaper.
So was President Trump correct in implying that things went way better in Puerto Rico than in Louisiana? Not really. Is the Daily Beast correct in saying that Maria was worse than Katrina and 9/11 combined? Not really.
Different methodologies: different results. When full death-certificate data becomes available from Puerto Rico, we may have more of an apples-to-apples comparison, but for now, let’s hold off on ranking tragedies and remember that we remain woefully unprepared for many natural disasters and that there are still people – living people – who continue to suffer in the aftermath of this terrible storm.