Folks: There are serious statistical, design, language, and ethics concerns with that vitamin-D RCT.

AT BEST, it's completely meaningless due to negligent statistical treatment and design, but there are more "questions"

Help us out: avoid sharing until the critics have had time.
Seriously, we (people who are frankly pretty good at this) are having a very hard time figuring out what the hell is happening in that trial.

Please give us time to do our work.
This thread is a good place to start if you want a taste.

But "super sus" is right; there is just so much here that doesn't make any sense at all, and this thread only scratches the surface.

It's gonna be a while before we figure this out. https://twitter.com/fperrywilson/status/1360944814271979523
<grumbly soapbox rant>

this kind of crap is what happens when RCTs are automatically treated as the "gold standard" of medical evidence.

</grumbly soapbox rant>
Even if we make the (extremely generous) assumption that failure to account for clustering is the only "major" error, I can't emphasize enough just how damning that error is.

It's both incredibly basic and incredibly important to deal with correctly. How did this happen?
Things are developing in a bad direction, so time to talk a little more about it.

Let's start with the most generous version of this, and say they just made a couple of honest mistakes. It happens! Stats and study design are hard!
Abstract: "Participants (n=551) were randomly assigned to calcifediol treatment"

That is an unambiguous statement that individuals were randomized to arms.

But that's false; taking things at face value, the 8 WARDS were randomized, NOT individuals.
That's super important, since when you are assigning treatment at a group level, the grouping is the important bit, and needs to be dealt with from day 1.

Think along the lines of this being more like n=8 than n=930 (not quite right, but you get the idea).
That's a HUGE deal, as it's effectively impossible to get usable results from an 8-cluster cRCT due to the complete loss of statistical power.

In effect, we can't really tell if the results had to do with other stuff inherent with the different wards, or the vit-d "treatment"
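To put rough numbers on that, here's a minimal sketch of the design-effect arithmetic. The only figures taken from the trial are the 930 patients and 8 wards; the ICC is a made-up, purely illustrative value.

```python
# Minimal sketch: how cluster randomization shrinks the effective sample size.
# Standard design effect: DEFF = 1 + (m - 1) * ICC, with m = average cluster size.
# The ICC below is an assumed, illustrative value, not anything from the paper.
n_total = 930                 # total patients (from the trial)
n_clusters = 8                # wards, i.e., the units actually randomized
m = n_total / n_clusters      # average ward size

icc = 0.05                    # hypothetical intraclass correlation
deff = 1 + (m - 1) * icc      # design effect
effective_n = n_total / deff

print(f"Design effect: {deff:.1f}")
print(f"Effective sample size: ~{effective_n:.0f} (nominal n = {n_total})")
```

Play with the ICC and the point survives: the effective sample size is nowhere near 930, which is why "more like n=8 than n=930" is the right intuition.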
But then there's another weird thing: if you have only 8 clusters, you would want to split them in half (4v4), not what they did: 5v3.

The reason is, again, statistical power. It's typically (not always) MUCH more efficient to split your groups evenly.
Sounds like a small thing, but it's....weird. Even if you knew nothing about clustered RCTs, you would probably know that you want an even split. So why the 5v3?
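To see roughly what the uneven split costs, here's a tiny back-of-the-envelope sketch (my arithmetic, treating the analysis as a comparison of cluster-level means):

```python
# Minimal sketch: in a cluster-level analysis, the variance of the difference in
# arm means scales with (1/k1 + 1/k2), where k1 and k2 are clusters per arm.
def relative_variance(k1: int, k2: int) -> float:
    return 1 / k1 + 1 / k2

even = relative_variance(4, 4)     # the 4v4 split you'd normally want
uneven = relative_variance(5, 3)   # the 5v3 split the trial used

print(f"4v4 relative variance: {even:.3f}")
print(f"5v3 relative variance: {uneven:.3f}")
print(f"Variance inflation from going 5v3: {100 * (uneven / even - 1):.1f}%")
```

The hit is modest (~7%), so the bigger issue isn't the power loss itself; it's that 5v3 is a choice you'd normally only make for a reason, which brings us back to: why the 5v3?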

If that were all that was wrong, dayenu. This isn't subtle; this is REALLY basic stuff for trial design/stats.
From here, things get .... weirder.

If this were a properly run trial (assuming the design/stats were legit, which they aren't), then, according to the paper, the trial ended in May 2020.

What was happening from then to now? It often takes a long time to get and clean the data, sure.
But given a ~1,000-person RCT (which generally requires a HUGE amount of planning and infrastructure) in a pandemic, you'd want to accelerate that to light speed and get those results out ASAP. Especially if they showed these miraculous results (narrator: they don't)!
Then there is some weirdness about the study population: i.e., how the trial was embedded in the cohort study.

That's not a problem by itself; it can be a huge time and resource saver. I've developed 2 RCTs to date, both of which were embedded in cohort studies for this very reason.
But the way it's described is ... weird.

There is some weird language here around "hospitalized randomly," implying that they assigned patients to wards randomly. It may just be a language or oversimplification issue, so maybe don't read too much into that.
If everything was done properly as described, we have a new issue: in the consent process, it seems that patients were given options. If they're given options, they might choose (or be encouraged) to go to different wards for various reasons.

That breaks a LOT of things.
If knowing which ward is which changes how patients get assigned to wards (e.g., a patient might want to be sent to the vit-D ward, or a doctor might send them there), then randomization with respect to patient assignment is completely broken due to selection issues.

That's...not great.
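If it helps, here's a toy simulation of that selection mechanism. Everything in it is made up; it only shows how a gap between arms can appear with zero treatment effect:

```python
# Toy sketch of selection bias: no treatment effect at all, but frailer patients
# are steered away from the "treatment" wards, so the arms differ anyway.
import random
import statistics

random.seed(1)
treatment_arm, control_arm = [], []
for _ in range(100_000):
    frailty = random.random()                         # hypothetical baseline risk in [0, 1]
    steered_to_treatment = random.random() > frailty  # frailer -> less likely to go
    (treatment_arm if steered_to_treatment else control_arm).append(frailty)

print(f"Mean baseline risk, treatment wards: {statistics.mean(treatment_arm):.2f}")
print(f"Mean baseline risk, control wards:   {statistics.mean(control_arm):.2f}")
# A clear gap shows up even though the "treatment" does literally nothing.
```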
Then there's ethics and protocol. The manuscript states that it received ethical approval for the study. Great! In theory, that means there is a protocol for this (these are usually not public) and a trial registration.

We should be able to verify that this was planned this way.
I am personally not familiar with the standard required processes for ethics approval, registration, and protocols in Spain.

If it was approved with a protocol that roughly matches what the manuscript says was done, then this is merely* study design and reporting negligence
* "merely" just means nothing more going on; there would still remain a jaw-dropping series of design and reporting errors.

Also worth noting that there are HUGE differences in the baseline levels of vit-d in the trial arms.
Don't fall into the trap of believing that randomization means that the arms are "balanced." That's not true. Differences are both expected and totally ok when things are done right.

But that's a HUGE difference, suggesting that there are fundamental differences between arms.
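For intuition on how big chance imbalances get when only 8 wards are randomized, here's a quick simulation. The ward-level mean and SD are assumptions for illustration, not values from the paper:

```python
# Minimal sketch: randomize 8 wards 5v3 over and over and see how far apart the
# arms' baseline means land by chance alone. Ward-level parameters are assumed.
import random
import statistics

random.seed(0)
diffs = []
for _ in range(10_000):
    ward_means = [random.gauss(15, 4) for _ in range(8)]  # hypothetical ward mean vit-D (ng/mL)
    random.shuffle(ward_means)                            # random 5v3 allocation of wards
    treated, control = ward_means[:5], ward_means[5:]
    diffs.append(statistics.mean(treated) - statistics.mean(control))

print(f"SD of the chance baseline gap between arms: {statistics.stdev(diffs):.1f} ng/mL")
```

With that few randomized units, sizable chance gaps are routine, which is exactly why the arms can't just be assumed comparable.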
At best, this is some combination of ward-specific protocols and procedures that happened to line up with the arms (small n's do that), etc., plus the patient selection issue.

Again that would be enough to be super sus.
Altogether though, that is a LOT of issues that kinda happen to fall into place for these results.

So, at best, this is study design and reporting negligence on a lot of dimensions plus a little push from random chance.

Best bet is assuming this is the case.
HOWEVER, given the weirdness about these errors (and a clear willingness to play fast and loose with the word "randomized") we should do some due diligence here and verify that's the case, which is what is happening now.
This should all be pretty cut and dried if the ethics approval and protocol turn up and describe this RCT. If they don't...

In the meantime, we are playing catchup, since this study is blowing up all over the place with unscrupulous sharers and media reports.
This has the makings of yet another HCQ-type debacle (albeit probably not as big) with a hint of DANMASK.

Let's do our best to not let that happen.
If you want a more "live" look at this, @sTeamTraen has been at work here (and has a sterling reputation for discovering all kinds of research mistakes and misconduct), as has well-known research trouble-maker @GidMK.

I'll probably update when I'm more sure of things.
Well, we have an answer from the authors...of sorts. Copied here, because this sure is something.

https://pubpeer.com/publications/DAF3DFA9C4DE6D1B7047E91B1766F0#11
What on earth does this mean, and how can you possibly square it with what's in the abstract and manuscript itself?

"We never say in the article that it is a randomized control trial (RCT) but we consider an open randomized trial, and an observational study."
The study describes randomization (implied at the individual level, but actually at the ward level). It describes a control group (one not receiving the "treatment"), and it directly describes itself as a trial.

There is ZERO question that, if we take them at their word, this is an RCT.
"Formal ethical approval was obtained shortly after the study started although verbal approval from the ethics committee was given at the time it was started while we completed all the bureaucratic process."

Oh. Oh no. That is very, very not ok.
In case there is any doubt at all, this is a direct quote from the manuscript, page 5:

"The effect of calcifediol administration was studied in a prospective open randomized controlled trial."
This. https://twitter.com/maureviv/status/1361380910956953601
This was clearly suspicious from the start, but I am quite honestly pretty shocked, and did not expect this result.

There are still some open questions, but at minimum this is a major violation of public trust and ethics, not to mention scientific and statistical rigor.
I hope the authors do the right thing and pull it. There is no version of this situation which can save it.

Best we can do is be honest with our errors and move forward.
Also folks; please don't take it upon yourselves to try to "fix this" by leaving inappropriate comments or feedback.

The authors already have all the information they need, from well-qualified folks who do this kind of thing.

Let this run its course, no need for more attention.
You can follow @NoahHaber.