claiming that the Gaza health ministry's figures on deaths in Gaza are an undercount: as of the date of the analysis, when the ministry said there were 40,000 deaths, the paper estimates over 64,000. Here is the introduction:
Accurate mortality estimates help quantify and memorialise the impact of war. We used multiple data sources to estimate deaths due to traumatic injury in the Gaza Strip between Oct 7, 2023, and June 30, 2024.
We used a three-list capture–recapture analysis using data from Palestinian Ministry of Health (MoH) hospital lists, an MoH online survey, and social media obituaries. After imputing missing values, we fitted alternative generalised linear models to the three lists' overlap structure, with each model representing different possible dependencies among lists and including covariates predictive of the probability of being listed; we averaged the models to estimate the true number of deaths in the analysis period (Oct 7, 2023, to June 30, 2024). Resulting annualised age-specific and sex-specific mortality rates were compared with mortality in 2022.
We estimated 64 260 deaths (95% CI 55 298–78 525) due to traumatic injury during the study period, suggesting the Palestinian MoH under-reported mortality by 41%.
The "capture-recapture" analysis that is the heart of the statistics-heavy paper works like this. The authors took three different lists of victims - the hospital records, the online survey, and those found on social media obituaries. Between the three they found very little overlap:
Only about 15% of people in hospital records appeared on other lists
About 33% of people in the survey appeared on other lists
About 54% of people in social media appeared on other lists
They used these overlap patterns to estimate how many deaths likely occurred but weren't captured on any list. Low overlap, they say, indicates that each list is a gross undercount of the total. Based on their analysis of the low overlap, they suggest that the three lists combined captured only about 45% of total deaths. This led to their estimate of about 64,260 total deaths during the study period, compared to the official count at the time of about 37,900 (which included unnamed people).
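To see why low overlap drives the estimate up, here is a minimal sketch of the simplest two-list version of capture-recapture (the Lincoln-Petersen estimator). It is not the paper's method, which fits three-list models with covariates; the list sizes below are hypothetical and chosen only for illustration. The last lines check the arithmetic that appears to sit behind the 41% figure, using the two totals quoted above.

```python
def lincoln_petersen(n1: int, n2: int, overlap: int) -> float:
    """Two-list capture-recapture estimate of the total population.

    Assumes both lists are independent samples of the same population.
    """
    if overlap <= 0:
        raise ValueError("estimator is undefined when the lists share no names")
    return n1 * n2 / overlap


# Hypothetical list sizes, for illustration only (not taken from the paper).
list_a, list_b = 1000, 500

# Same two lists, shrinking overlap: the estimated total grows.
for shared in (400, 200, 50):
    estimate = lincoln_petersen(list_a, list_b, shared)
    print(f"overlap={shared:>3}  estimated total={estimate:,.0f}")

# Under-reporting implied by the two totals quoted in the text.
paper_estimate = 64_260
official_count = 37_900
under_reporting = 1 - official_count / paper_estimate
print(f"under-reporting: {under_reporting:.0%}")
```

With the list sizes held fixed, the estimate depends entirely on the overlap, so anything that suppresses overlap for reasons other than chance inflates the result.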
The assumptions behind this methodology are wrong.
The reason there is very little overlap between the hospital lists and survey lists is that the entire purpose of the survey was to supplement the hospital list. The survey list is designed to capture names not on the hospital lists. Capture-recapture analysis is based on the assumption that the lists are independent, when in fact they are supposed to be exclusive - the Ministry of Health issues statistics based on both lists combined and would attempt to remove duplicates.
In this case, the capture-recapture analysis is not an appropriate methodology because the two dominant data sets are largely meant to be exclusive of each other, not representative samples of the total deaths.
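The effect of a deliberately supplementary list can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reconstruction of the paper's analysis: the population size, the listing probabilities, and the `dedup_rate` parameter are all hypothetical, and it uses a plain two-list estimator. The paper's three-list models include terms for dependence between lists, which this sketch does not reproduce.

```python
import random


def lincoln_petersen(n1: int, n2: int, overlap: int) -> float:
    """Two-list estimate of the total, valid only for independent lists."""
    return n1 * n2 / overlap


def simulate(true_total: int = 10_000, p_first: float = 0.5,
             p_second: float = 0.4, dedup_rate: float = 0.0,
             seed: int = 0) -> float:
    """Estimate the total when the second list skips a share of names
    that are already on the first list (all parameters hypothetical)."""
    rng = random.Random(seed)
    first = {i for i in range(true_total) if rng.random() < p_first}
    second = set()
    for i in range(true_total):
        if rng.random() >= p_second:
            continue  # this person was never reported to the second list
        # A supplementary list tends to leave out people already recorded.
        if i in first and rng.random() < dedup_rate:
            continue
        second.add(i)
    overlap = len(first & second)
    return lincoln_petersen(len(first), len(second), overlap)


independent = simulate(dedup_rate=0.0)    # lists built without reference to each other
supplementary = simulate(dedup_rate=0.9)  # second list mostly avoids known names

print(f"true total:                 10,000")
print(f"estimate, independent:      {independent:,.0f}")
print(f"estimate, supplementary:    {supplementary:,.0f}")
```

In this toy setup the true total never changes. Only the way the second list was compiled changes, and that alone is enough to push the two-list estimate to several times the real number.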
The third list, from multiple social media sources like Instagram, is not a random sample of the deceased at all. It could be updated by anyone anywhere in the world. It is self-defining and impossible to verify. Using it as an input to the analysis is questionable at best. To give it the same weight as the other two sources for the purposes of statistical analysis and estimations based on low overlap is almost certainly a poor assumption.
There are other potential problems. The survey form does not distinguish between "martyrs" and "missing persons," and many of those assumed to be dead may in fact be missing - the ICRC has managed to reunite thousands of people thought to be missing.
Altogether, this is a case where the authors try to misdirect the reader with lots of statistical formulas, but the basic assumptions underlying those statistics are worthless to begin with.