Coronavirus Clinical Trials: Too Much of a Good Thing?

Randomized trials are the gold standard of medical literature, but that doesn’t mean they are never wrong.

There has been an explosion of research into COVID-19, from its underlying biology to its potential treatments. I have not only lauded the scientific community for this rapid-fire pace of research but have contributed to the growing body of literature myself. All good, right?

Well, maybe not. It’s possible there is too much of a good thing here, as this article appearing in JAMA Network Open shows.

Researchers from MD Anderson queried clinicaltrials.gov, the central US registry for (in theory) all clinical trials. Since 2007, if you were going to run a clinical trial that you wanted to publish eventually, you had to register it in clinicaltrials.gov before it got underway. The idea is to prevent the “burying” of negative clinical trial results. It’s not perfect, but nowadays almost all legitimate trials have an entry on this site.

OK, so the researchers looked for trials involving COVID-19, and the numbers here are really staggering.

After removing suspended and halted trials, they found 674 randomized trials of COVID-19 interventions, most of them treatment rather than prevention trials. Of those treatment trials, 132 (nearly a quarter) were randomized trials of chloroquines.

132 randomized trials testing chloroquines.

This could be a problem.

Remember that in a randomized trial, one group always wins – even if the drug doesn’t do what you think it will do. One group always, by chance alone, has more deaths or longer length of stay, or whatever you are measuring. We account for that, though: we can use math to tell us how weird the results of our study are, assuming the drug doesn’t work.

Let’s say I enroll 200 people in a trial of a magic bean to cure COVID-19.

Think of the p-value as how “weird” the data is, assuming magic beans have no real effect.

100 swallow the bean, 100 get a placebo bean. If you saw, say, 10 deaths in the placebo group and 9 in the bean group, would you be terribly excited? Your intuition should tell you no – a 1-person difference in deaths seems like it might just be due to random chance. In fact, you’d get a result like that – or even more extreme – 81% of the time if the bean did nothing. Nothing to write home about. That 81% is the “p-value.” In clinical trials, we have (rather arbitrarily) defined a p-value of 0.05, or 5%, as our threshold for statistical significance.

Using our bean example again, 4 deaths in the bean group compared to 10 in the placebo group is a bit weird – results like that would happen about 10% of the time even if magic beans don’t work. But 3 deaths in the bean group? A result like that only happens 4.5% of the time – we’ve passed that p-value threshold.

And that should feel about right to you.  If you did this trial, and had just 3 deaths in the magic bean group, compared to 10 in the placebo group, you might really start to think… huh… I guess that old peddler knew what he was talking about.
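The percentages quoted for the magic bean trial can be reproduced with a few lines of Python. This is a minimal sketch, not necessarily the calculation used for the article: it assumes a pooled two-proportion z-test without continuity correction, which happens to match the quoted figures, and the function name `two_proportion_p_value` is made up for illustration. An exact test would generally give somewhat larger p-values.

```python
import math


def two_proportion_p_value(deaths_a, n_a, deaths_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test.

    Assumption: no continuity correction. Not defined when there are
    zero events (or all events) across both groups combined.
    """
    # Death rate if both groups are pooled, i.e. if the bean does nothing.
    pooled = (deaths_a + deaths_b) / (n_a + n_b)
    # Standard error of the difference in death rates under that assumption.
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (deaths_a / n_a - deaths_b / n_b) / se
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Bean group deaths of 9, 4, and 3, each compared with 10 placebo deaths.
for bean_deaths in (9, 4, 3):
    p = two_proportion_p_value(bean_deaths, 100, 10, 100)
    print(f"{bean_deaths} vs 10 deaths: p = {p:.3f}")
```

Under these assumptions the three p-values come out at roughly 0.81, 0.10, and 0.045, so only the 3-versus-10 result slips under the 0.05 threshold.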

But that’s for one trial.

What if I did 132 trials of magic beans?

If you do 132 magic bean trials, you’re nearly guaranteed to find one that suggests that the legendary legumes significantly reduce mortality. In fact, you’ll likely find a few.

Assuming the magic beans are bunk, on average just over 3 of those 132 trials would be positive at that 5% p-value threshold (half of the 5% falls on the side favoring the bean, and 2.5% of 132 is 3.3). But of course, it depends on how the chips land: sometimes you get as many as 8 positive trials out of 132, even when all you are testing is a magic bean.

Now I should note that a similar number of trials would show just the opposite – the magic bean INCREASES the death rate significantly.
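The arithmetic behind the 132-trial scenario can be sketched the same way. The assumptions here are mine rather than the article's: every trial is independent, the bean truly does nothing, and each trial has exactly a 2.5% chance of a significant result in the bean's favor (half of the two-sided 5% threshold). The helper `prob_at_least` is a hypothetical name.

```python
import math

N_TRIALS = 132
# Assumption: half of the two-sided 5% threshold favors the bean.
ALPHA_ONE_DIRECTION = 0.025


def prob_at_least(k, n, p):
    """Probability of k or more successes in n independent tries (binomial)."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i)
        for i in range(k, n + 1)
    )


# Average number of falsely "positive" trials when the bean is bunk.
expected_positive = N_TRIALS * ALPHA_ONE_DIRECTION
# Chance that at least one trial favors the bean by luck alone.
p_any = prob_at_least(1, N_TRIALS, ALPHA_ONE_DIRECTION)
# Chance of a lucky streak of 8 or more positive trials.
p_eight_plus = prob_at_least(8, N_TRIALS, ALPHA_ONE_DIRECTION)

print(f"Expected positive trials: {expected_positive:.1f}")
print(f"P(at least one positive): {p_any:.3f}")
print(f"P(eight or more positive): {p_eight_plus:.3f}")
```

With these inputs the expected count is 3.3 and the chance of at least one falsely positive trial is about 96%, which is what “nearly guaranteed” means here. Eight or more positives is unusual but far from impossible.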

But here is the problem. What do you think will get talked about on Facebook, Twitter, and the nightly news?  That’s right – positive trials get WAY more airtime than negative or neutral ones and that shapes the public perception of drug efficacy. And, honestly, it shapes physician perceptions too.

There’s another problem with all these trials of the same thing. There aren’t enough patients. Of the 201 trials recruiting solely in the US, the total expected enrollment was 146,688 individuals, and 87,000 patients would need to be enrolled in chloroquine-specific trials alone. It is not easy to recruit patients into clinical trials, and recruiting this many, even with a raging pandemic, is not feasible. Many of these studies will never finish.

So what are we to do? Well, the number one thing is to realize that there will be individual randomized trials of interventions that don’t actually work and yet nevertheless are positive. This WILL happen. You will see on the news that a new randomized trial shows that chloroquine or some other drug reduces mortality in COVID-19 and you will not be told of all the other trials that show that it doesn’t. First, we need to be aware of this.

Second, we need to encourage researchers to work together on these projects. We need to foster collaboration across medical centers and research institutions – get these teams working together – do 20 really amazing trials instead of 120 mediocre ones. And that means fixing some of the regulatory hurdles around data sharing and IRB oversight but ALSO making it worthwhile for researchers, desperate for high-impact publications, to join a study as a (gulp) middle author.

In the meantime, I’ll be out conducting 132 trials of my magic bean. You’ll hear about, say, 3 of them right here on Impact Factor.

This commentary first appeared on medscape.com.