Response To Alexandros Contra Me On Ivermectin
I.

In November 2021, I posted Ivermectin: Much More Than You Wanted To Know, where I tried to wade through the controversy on potential-COVID-drug ivermectin. Most studies of ivermectin to that point had found significant positive effects, sometimes very strong effects, but a few very big and well-regarded studies were negative, and the consensus of top academics and doctors was that it didn’t work. I wanted to figure out what was going on.

After looking at twenty-nine studies on a pro-ivermectin website’s list, I concluded that a few were fraudulent, many others seemed badly done, but there were still many strong studies that seemed to find that ivermectin worked. There were also many other strong studies that seemed to find that it didn’t. My usual heuristic is that when studies contradict, I trust bigger studies, more professionally done studies, and (as a tiebreaker) negative studies - so I leaned towards the studies finding no effect. Still, it was strange that so many got such impressive results.

I thought the most plausible explanation for the discrepancy was Dr. Avi Bitterman’s hypothesis (now written up here) that ivermectin worked for its official indication of treating parasitic worms. COVID is frequently treated with steroids, steroids prevent the immune system from fighting a common parasitic worm called Strongyloides, and sometimes people getting treated for COVID died of Strongyloides hyperinfection. Ivermectin could prevent these deaths, which would mean fewer deaths in the treatment group than the control group, which would look like ivermectin preventing deaths from COVID in high-parasite-load areas (like the tropics) but not low-parasite-load areas (like temperate zones). This explained some of the mortality results, with the other endpoints likely being because of publication bias.

Alexandros Marinos is an entrepreneur, long-time ACX reader, and tireless participant in online ivermectin arguments. He put a very impressive amount of work into rebutting my post in a 21-part argument at his Substack, which he finished last October (if you don’t want to read all 21 parts, you can find a summary here). I promised to respond to him within a few months of him finishing, so that’s what I’m doing now.

I’ll be honest - I also didn’t want to read a 21-part argument. I would say I have read about half of his posts, and am mostly responding to the summary, going into individual posts only when I find we have a strong and real disagreement that requires further clarification. I also have had a bad time trying to discuss this with Alexandros (not necessarily his fault, I can be sensitive about these kinds of things) and am writing this out of obligation to honor and respond to someone who has put in a lot of work responding to me. It is not going to be as comprehensive and well-thought-out as Alexandros probably deserves.

I’ll go through each subpart of his argument, as laid out in the summary post.

II. Individual Studies

Alexandros is completely right about one of these studies, partly right about a few others, and I still disagree with him on several more. On the one where I was wrong, I was egregiously wrong, and I apologize to the study authors and to you.

In the original post, I went through a list of 29 studies, trying to decide whether or not I trusted them. I dismissed 13 studies as untrustworthy (which didn’t necessarily mean fraudulent, just that I wasn’t sure they had good methodology).
Then I dismissed 5 more studies that epidemiologist Gideon Meyerowitz-Katz didn’t like (even though I didn’t have strong objections to them myself), just to get a list of studies everyone agreed seemed pretty good.

This part is about my study-keeping decisions. It won’t have a very big impact on the final result, since both Alexandros and I agreed that regardless of which study-dismissing criteria you use, the final list supports ivermectin efficacy. But I still tried to get this right and mostly didn’t.

Alexandros critiques many of my study interpretations, but includes four in his summary. I’ll go over those four in detail, and make less detailed comments on the rest.

Biber et al (Alexandros 100% right)

The study I am most embarrassed about here is Biber et al, an Israeli study which found that COVID patients who received ivermectin had lower viral load. In the original post, I wrote:
You can find Alexandros’ full critique here. His main concerns are:
After looking into it, I think Alexandros is completely right and I was completely wrong. Although I sometimes get details wrong, this one was especially disappointing because I incorrectly tarnished the reputation of Biber et al and implicitly accused them of bad scientific practices, which they were not doing. I believed I was relaying an accusation by Gideon (who I trust), but I was wrong and he was not accusing them of that. I apologize to Biber et al, my readers, and everyone else involved in this.

My only reservation is that I don’t want to say too strongly that Gideon’s critique is wrong: I haven’t looked through the study documents enough to say with certainty that Alexandros’ reanalysis of the protocol issues is correct (though the superficial check I’ve done looks that way). But my mistakes are completely separate from anything Gideon did and definitely real and egregious.

Cadegiani et al (Alexandros 50% right)

Flavio Cadegiani did several studies on ivermectin in Brazil; I edited this section in response to criticism by Marinos and others, but the earliest version I can find on archive.is (I can’t guarantee it was the first I wrote) said:
You can find Alexandros’ full critique here, but again I’ll try to summarize it as best I can.
My responses:

Alexandros’ Point 1 is fair-ish. Since this person appears to be committing pretty substantial fraud and doing some strange things, I thought it was useful to highlight the ways in which he is weird and suspicious, rather than the ways he is prestigious and impressive. But probably I went too far in this.

His Point 2/3 is completely fair, and I’m sorry for getting this wrong. I may have unthinkingly copied it from forbetterscience.com, which made this mistake before me, or I might have just failed at reading comprehension on this translated Portuguese-language article I linked. In either case, I apologize to Cadegiani. This is already on my Mistakes page as of June 2022 when Alexandros wrote his original article.

His Point 4 is correct, although based on information that came out after I wrote my article. All that was available in English when I wrote was that the Brazilian government was considering accusing Cadegiani of crimes against humanity. I think I did an okay job noting that I was guessing at their reasoning (rather than reporting a known fact), and as written I did make clear that I thought he was innocent of the specific charge. Still, I appreciate the clarification.

His Point 5 is - I do feel like Alexandros is having a sort of missing mood on the fact that one of Cadegiani’s big pro-ivermectin studies contains impossible data. While this is not proof of fraud or incompetence, it is some Bayesian evidence for both. And while fraud or incompetence in one of your studies supporting ivermectin is not proof that your other studies supporting ivermectin are also fraudulent/incompetent, it is, again, Bayesian evidence. Alexandros makes a big deal of there being four corrections in the BMJ article attacking Cadegiani, as if now the BMJ has admitted they were wrong all along, whereas these were mostly on unrelated details and the BMJ definitely did not correct the quotes about how his study was “an ethical cesspool of violations” or how “in the entire history of the National Health Council, there has never been such disrespect for ethical standards and research participants in the country”¹. I feel like if his Science Olympiad medals are an important part of the story, these kinds of things are an important part too.

Still, several of Alexandros’ points were entirely correct, and I appreciate the corrections.

Babalola et al (still disagree with Alexandros)

OE Babalola (I incorrectly wrote this name as “Babaloba” in the original) did a Nigerian study which found that ivermectin decreased the amount of time it took before people tested negative for COVID. I described this study as:
Alexandros calls this The Sullying Of Babalola Et Al, and says I “followed Gideon Meyerowitz-Katz off a cliff” by unfairly “lambasting” the innocent Babalola. I “[made] a mountain out of a molehill”. Alexandros quotes a commenter who found that the most likely explanation for the “impossible numbers” in Babalola was missing data, and notes that usually-anti-ivermectin researcher Kyle Sheldrick had evaluated the raw data and found no fraud. Alexandros concludes:
I don’t think I did anything especially wrong here. There was a chart that didn’t make sense. It turned out not to make sense because some data was missing. I said “[this] seems like a relatively minor mistake, and Meyerowitz-Katz stops short of calling fraud, but it’s not a good look. I’m going to be slightly uncomfortable with this study without rejecting it entirely, and move on.”

I was right that it was a minor mistake, I was right that it wasn’t fraud, and I was right not to reject the study. I didn’t have the exact explanation (missing data), so I did not mention it, but I think I made the correct guess about the sort of explanation it was. I don’t understand why Alexandros acts like I said the study wasn’t worth keeping, or that there was no innocent explanation, or that I was accusing the researchers of fraud, when in fact I said the opposite of all those things, pretty explicitly.²

Carvallo et al (Alexandros 25% right)

This was an Argentine study. I described it as:
Alexandros responds here. Attempting to summarize his points:
This is a good place to note that I have a very poor memory of what I was thinking two years ago, and am having to reconstruct my arguments as I go. Still, reading the BuzzFeed article, I notice things like:
I think this is a more nuanced story than Alexandros’ version where BuzzFeed just doesn’t know that sometimes studies happen at more than one hospital.

Is fraud the best explanation? I think Alexandros thinks of Carvallo as just not keeping very good records, so he doesn’t have raw data, and probably mixed up his numbers a few times or gave false numbers, and didn’t have anything to send his collaborators when they asked. I think this is maybe possible, although it seems suspicious that he falsely said Dr. Lombardo was involved, falsely claimed the hospital involved was doing a different trial, and got very implausible results. I can imagine weird chains of events that would cause all of these things through honest misunderstandings. But they don’t seem like the best explanation.

After discussing this with Alexandros, he objects to my use of the term “known fraudster”. Perhaps I should have said “highly credibly suspected fraudster” instead, although in a Bayesian sense nothing can ever be 100% and at some point plausibility shades imperceptibly into knowledge. Still, I feel like my description here was more accurate than Alexandros’, which just mentions the hospital approval issue and says nothing about any of the rest of this in a thousand-word subsection about this study in particular.

I did err in saying the Carvallo paper was retracted. According to the article:
I apologize for the error.

Elalfy et al (still disagree with Alexandros)

I described this as:
In the summary post, Alexandros’ entire criticism of my coverage of this trial, one of the seven trials he focuses on as most unfairly covered and uses as the lynchpin of his argument that I am morally culpable for disastrously bad reporting, is:
In his full post on this, he goes line by line to point out all the places they say they are non-randomized, pausing to snark about how dumb I am for not noticing each time⁴. But he never addresses the actual source of my confusion, which is the part of the paper where it says that:
If this was done as described, it should be an (almost) random trial; patients who come in on Wednesdays shouldn’t systematically differ from patients who come in on Thursdays⁵. But in fact, it looks (assuming I am understanding a very ambiguous table correctly) like there are very large pre-existing differences between the groups, sufficient to explain the entire result.

If they in fact followed their days-of-the-week protocol, and it was random as expected, then I’m misunderstanding the table seeming to show very large differences, and they have indeed found evidence for ivermectin’s efficacy. If they didn’t follow their day-of-the-week protocol and it’s non-random, then maybe I’m understanding the table correctly and their groups had large differences to begin with, and the fact that they had large differences at the end of the trial doesn’t demonstrate anything about ivermectin.

This is all I was trying to say in the post, and instead of having any opinion on it Alexandros just makes fun of me for saying it. I think our actual crux is that Alexandros thinks a table of big differences between the groups has to be post-treatment (based on how big the differences are), whereas I’m not sure (because it’s unclear in the study, and also because the authors describe what could be a randomization method but also go on and on about how nonrandom they are). This is why I thought it mattered how random it was! Maybe instead of mocking me for this, you can admit it’s an important and relevant question!

Ghauri et al (still disagree with Alexandros)

I describe this as:
Alexandros notes that these are three differences between experimental/control groups, out of 33 listed characteristics that could have been different. There is approximately a 23% chance (he calculates) that you could get these differences by chance. He accuses me of failing to do a formal Carlisle test - the usual test you would use to determine whether weird differences between randomized groups are because of fraud - instead eyeballing it and getting it wrong.

Here I do want to defend myself: I am not accusing Ghauri et al of fraud. In fact, this would be nonsensical: they admit they are assigning patients nonrandomly. Carlisle tests are usually done to show that something about group assignment is impossible (and therefore fraudulent) in a fair random assignment. But these people aren’t claiming to have done a fair random assignment, so I’m not sure what a Carlisle test would prove. My argument is more like: this is nonrandom, therefore we should expect it to be unfair. It is unnecessary, but helpful, to note an actual apparent unfairness - there’s some evidence they gave the ivermectin to less severe patients (as measured by corticosteroid use). Therefore, we can’t necessarily trust this to be a fair trial (which it was never really claiming to be).

In the end I kept Ghauri as an okay study, although GMK didn’t, so it ended up trashed in the final analysis anyway. I think my thinking was that I never claimed to be only looking at RCTs, so this non-RCT, whose between-group differences confirmed that it was indeed a non-RCT with all the risk of bias that entails, didn’t necessarily need to be ruled out. Still, I don’t think I was wrong to mention this possibility, and I think Alexandros was wrong to suggest that I needed to do extra tests for this to be fair.

Borody et al (still disagree with Alexandros)

I described this as:
Alexandros lists his full concerns here. My summary:
Borody et al indeed have had amazing careers with many things they can be proud of. But I continue to believe that this paper is not among them.

Synthetic control groups are more common in social sciences, but have occasionally been used in pharmacology when it would be unethical or extremely difficult to use a real control group. The most common use case is rare cancers, where it takes years to get enough patients to test a drug and it also seems kind of unethical to delay. Another good thing about rare cancers is that they're pretty discrete; you don't have to worry about things like "well, 90% of leukemias never make it to a doctor anyway, so maybe we're only seeing the serious leukemias" or "these guys counted the leukemias that get dealt with by the local doctors' office, but those other guys counted the leukemias that have to go to the hospital".

More important, studies with synthetic control groups usually go above and beyond to justify why their synthetic control group should be a fair comparison to the treatment group. Here's an example, from a paper about a rare leukemia. They start by getting a synthetic control group from a previous randomized controlled trial of leukemia drugs (not the general population!). Then they throw out more than half their patients for not being a good match for the selection criteria of the current study. Then they investigate whether there are significant differences on five important demographic factors, and find a few. Then they re-weight the patients in the historical comparator study to adjust out the differences between the previous population and the current population. Then they do some analyses to check if they re-weighted everything correctly. Then they apologize profusely for having to use this vastly inferior methodology at all:
Compare this to how the Borody study discusses its synthetic control group:
I hesitate to say “they didn’t even say which tracking data”, because in the past I’ve said things like that and just missed it. But I can’t find them saying which tracking data.

In Borody et al’s synthetic control group, 70/600 (11.7%) patients required hospitalization. But the US hospitalization rate appears to be about 1% for unvaccinated individuals. So Borody’s synthetic control group got 10x the expected hospitalization rate. This seems very relevant to this study finding that ivermectin decreases hospitalization by 90%!

I’m not claiming this is fraudulent, or impossible, or means the study couldn’t have been good. And Borody et al claim to have used an “equivalent” control group, so maybe there was some adjustment done for this. But this is why we usually use more than one word to describe our control groups! Or use real control groups that don’t ruin your study if you do a finicky adjustment slightly wrong!

I feel like these are the kinds of questions Alexandros needs to be asking, instead of just giving a link to a Stat News article about how sometimes synthetic control groups are okay. Also other questions, like “how come this found a 90% decrease in hospitalization and mortality, but lots of other studies found smaller decreases, and the biggest and best studies found none at all?” I know Alexandros’ answers are to find lots of flaws with the biggest and best studies, but these flaws wouldn’t be enough to cover up a 90% cure rate. And if you’re in the business of calling out flaws in studies, I genuinely think having your control group be “we used some group of people somewhere in Australia, they had 10x the normal hospitalization rate, we won’t tell you anything else” would be the sort of flaw you would call out!

Thomas Borody is a genuinely brilliant gastroenterologist and I am very grateful for his life-saving discoveries. But Elon Musk is a genuinely brilliant engineer and I am very grateful for his low-cost reusable rockets - and this doesn’t mean he never does crazy inexplicable things. Maybe Borody and his collaborators have a point from this study, but I don’t feel like it makes sense as written. If they ever explain what they were doing in more detail and it’s some sort of amazing 4D-chess move that makes total sense, I will apologize to them. Otherwise, stick to inventing amazing life-saving digestive therapies.

In response to this section, Alexandros stresses that he is not necessarily saying Borody et al is incorrect or challenging my decision to leave it out. He writes:
III. Hokey Meta-Analysis

Alexandros points out that I used the wrong statistical test when analyzing the overall picture gleaned from these studies. He’s right. The right statistical test would make ivermectin look stronger, without changing the sign of the conclusion.

After getting a core group of potentially trustworthy studies, I tried to see whether ivermectin still had a statistically significant positive effect in them. I tried to be honest that I didn’t really know how to do formal meta-analyses:
I in fact could not do simple summary statistics on this. Alexandros describes the test I should have used, a DerSimonian-Laird test, and applies it to the same data. Now the numbers are p = 0.03 and p < 0.0001. I accept that I was wrong, he is right, and this is more accurate.

My original conclusion to this section was that although you couldn’t be absolutely sure from the numbers, eyeballing things it definitely looked like ivermectin had an effect. I then went on to try to explain that effect. With Marinos’ corrections, you can be sure from the numbers, but the rest of the post - an attempt to explain the effect - still stands.

IV. Worms

Alexandros brings up issues with the Strongyloides hypothesis; Dr. Bitterman graciously responds. I find the issues real enough to lower my credence in the idea, but not to completely rule it out. Even if it is true, I probably overestimated how important it was.

My original explanation for the effect was Dr. Avi Bitterman’s theory of Strongyloides hyperinfection. Many people in certain tropical regions are infected with the parasitic worm Strongyloides. Usually a person’s immune system keeps this worm under control, and the parasites cause only limited problems. But under certain situations - especially when people take immune-suppressing corticosteroids - the immune system fails, the worms multiply, and the patient can potentially die of sudden worm overgrowth (“hyperinfection”). Corticosteroids are a common COVID treatment. So plausibly some people in tropical areas fighting COVID are at risk of dying from worm hyperinfection. Ivermectin was originally an anti-parasitic-worm medication before being repurposed to fight COVID, and everyone agrees it is very good at this. So if many people in COVID trials are dying of worm infections, then ivermectin could help them. This would look like ivermectin reducing mortality in COVID trials, and make people wrongly conclude that ivermectin treats COVID.

Alexandros responds to this theory here; again, I’ll try to summarize:
You can find arguments for all these points at the link.

(One additional thing Alexandros does that I really like: he compares the Strongyloides hypothesis - as an attempt to explain why these studies keep getting such different results - to other hypotheses. For example, studies in Latin America get negative results more often than others. This really feels like confronting the real question. He finds that Latin American studies do find lower efficacy for ivermectin than the other, mostly Asian, studies, and hypothesizes that this is because ivermectin is very popular in Latin America, the “control” group illicitly takes it without telling the researchers, and so these studies are inadvertently comparing two ivermectin groups. This is another clever and elegant theory. Unfortunately, the recent spate of negative American studies sinks it⁶. Still, I agree there is a strong geographic element here; worms are one possible explanation, but there are others - including the scientific culture in different countries. I appreciate Alexandros highlighting how much this is true.)

I asked Dr. Bitterman for his thoughts. He reiterates that although steroids are one major cause of Strongyloides hyperinfection, another is eosinopenia, a decrease in the immune cells that fight parasites. COVID can cause eosinopenia directly, so just because a COVID patient didn’t get steroids, or was only on steroids for a short period, doesn’t prove that the patient couldn’t have had hyperinfection.

On the mixing of different sources to get Strongyloides prevalence data, he said:
On the long delay before hyperinfection kills:
On the more general argument:
I find Alexandros’ adjustment for Brazil somewhat convincing - not necessarily as a good adjustment, just in the sense that some adjustment needed to be done. I think the broader point is that results on the border of “statistical significance” often appear or go away depending on ambiguous decisions about coding single cases. Alexandros realizes this and includes a more gestalt-style chart directly showing the correlation, which he says goes below the significance threshold when you recode the Brazilian studies. This chart seems to be missing some studies which might change its conclusions; it was made by a third party and Alexandros is going to get back to me with more information. Dr. Bitterman adds that more recent American studies strengthen his hypothesis.

More discussion with Dr. Bitterman has also helped me better understand the context of this theory. Ivermectin does worst in studies of intermediate clinical endpoints: hospitalization, ICU admission, recovery time. It does best in studies of viral clearance rate and mortality. Viral clearance rate is a weak preclinical endpoint: not only is it especially susceptible to biases and file drawer effects, but it’s not that interesting unless it affects later clinical outcomes; many drugs change secondary endpoints but fail to change the things we care about. Mortality is (usually) a strong and important endpoint; apparent positive results of ivermectin here require an explanation. The Strongyloides hypothesis tries to provide it. But I erred in my earlier post by holding it up as “the” explanation for a large and heterogeneous group of studies which were mostly looking at endpoints other than mortality, or as a counter to ivmmeta’s analysis which found positive results everywhere for everything through statistical incompetence.

I think I implicitly believed a stronger version of the worm hypothesis - that even in places without literal Strongyloides literally killing you, some people had some parasitic worms that were holding them back, ivermectin killed those worms, and that made them healthier overall and better able to deal with COVID. But nobody has asserted or defended that hypothesis and there’s no evidence for it. When I asked Dr. Bitterman, he pointed out that the opposite was at least as credible: parasitic worms depress the immune system, but immune overreaction is a major cause of death in COVID, so getting rid of them could make things worse rather than better.

The original post should have explained this hypothesis better, devoted less emphasis to it, and focused more on publication bias and other issues that could explain the overall result. In some cases, these issues would have shed more light on the mortality statistics too. In my original post, I wrote:
In retrospect this is framed too weakly - “significant” in “some” studies is compatible with irrelevant overall. Still, sticking to the spirit of what I meant, I think I would lower this guess to more like 35% now, and lower my overall estimate of how much of the mystery it explains even further. I’m not an expert on this, you shouldn’t care about my exact probability, and I’m only mentioning it to communicate clearly and try to hold myself accountable.

V. Publication Bias

Alexandros has various arguments against funnel plots in general, and Dr. Bitterman’s funnel plot in particular. Some of these arguments are reasonable, but taken together they would discredit 95 - 100% of all funnel plots everywhere. Trying to destroy the whole institution of funnel plots just because one of them disagrees with your hypothesis is . . . honestly a move I have to respect. I agree that these provide Bayesian evidence, rather than 100% irrefutable evidence, of publication bias, and need to be considered in the context of everything else going on. After doing that, I still think they’re publication bias. That makes publication bias more important.

In the original post, I included this funnel plot from Dr. Bitterman:

In case you haven’t seen one of these before: this plots how big an effect the study found (horizontal axis) against study size (vertical axis). Studies that find ivermectin had no effect are at the center (RR = 1), studies that find a strong curative effect are to the left, studies that find a strong harmful effect are to the right.

When all studies are good, we have no reason to expect a correlation between study size and ivermectin efficacy - any deviations from the true effect should be random. This would look like a triangle centered around the true effect of the drug, with an equal number of studies on both sides. When there is a lot of publication bias, we should expect that small studies get published only if they find exciting results, and big studies get published regardless (because a lot of work went into them, someone will want to publish them, and journals will accept them regardless of how exciting they are). So here you would expect to see big studies around zero, and an asymmetric tail of smaller studies heading in the more-exciting direction. This is what we see on Dr. Bitterman’s plot, suggesting strong publication bias for ivermectin results.

Alexandros’ full counterargument is here. Trying to sum it up:
My response to Point 1: Funnel plots are a widely-, almost universally-used tool in meta-analyses. Alexandros has found some finicky statisticians saying that maybe they are bad, but for any statistical test there are finicky statisticians saying that maybe they are bad. Alexandros says that Dr. Bitterman’s funnel plot fails standards laid out by a 2006 Ioannidis paper, but that paper says that 95% of funnel plots in the Cochrane database (considered an especially high-quality database of meta-analyses) would fail its standards.

When John Ioannidis attacks funnel plots, I am fine with this because Dr. Ioannidis is known to be unusually rigorous and this is part of his pro-rigor crusade. But when Alexandros gets angry at me for rejecting Borody et al, whose control group was “we got a control group from somewhere, it had 10x the normal hospitalization rate, don’t ask questions” - or thinks it’s offensive to suspect Carvallo, whose statistical analysis was “the person I claim was my statistician denies ever having been associated with me and explicitly accuses me of lying, but whatever, here are some numbers proving that zero people who took ivermectin died” - then I don’t think he can fairly demand Ioannidean levels of rigor when it serves him.

(Am I incorrectly switching from truth-seeking to social-fairness-games? Probably partly, but I also think a part of truth-seeking is to apply the same level of rigor to all evidence. If we demand infinite rigor on both sides, nothing will ever pass our bar. If we demand zero rigor on both sides, we’ll be left with lots of confusing garbage. There are many defensible points on the how-much-rigor-to-demand continuum, but whichever one you choose it’s important to stick with it on both ends. This is especially true when we’re not expert statisticians and some of what we’re debating is which sources to trust and which tests are acceptable.)

Points 2 - 3 are worth taking seriously. These studies definitely have lots of heterogeneity. It’s why we’re having this discussion at all - if all studies agreed, then ivermectin would obviously work/fail and it wouldn’t be a controversy. An asymmetric funnel plot shows that something unusual is going on. Publication bias is one common kind of unusual thing that causes this pattern. Therefore, asymmetric funnel plots provide evidence for publication bias. But they don’t prove it. Other things like heterogeneity can produce the same pattern. If Alexandros’ point is that we need to think clearly about whether an asymmetric funnel plot shows publication bias or something else, his point is well taken.⁷

But in this case, it’s probably publication bias. An exciting new medication for a deadly pandemic, where it would be revolutionary if it worked, but also boring and obvious if it didn’t, and which is originally tested mostly in small trials - is a situation perfectly designed to elicit publication bias.

Alexandros presents two arguments for why publication bias is unlikely. First, our selection of studies comes from IVMMeta, a site that included not just published articles but preprints and vaguely-written-up summaries of experimental results. This removes one source of publication bias: bias in what journals choose to publish. But it doesn’t remove another: bias in what studies get written up, even at the vague summary level. I asked Dr. Bitterman for his assessment of how often this happens; he says it’s very common, even with medium-sized studies.
When I pressed him on how medium-sized, he says he knows institutions that might not publish a hundred-person RCT if its results were too boring. Most of the studies on IVMMeta were smaller than that, so publication bias is still likely.

Second, Alexandros thinks he has found a specific example of heterogeneity: Red studies (late testing) seem disproportionately on the left compared to blue studies (early testing).

Here’s my contribution to the effort: I think this is at least as convincing. I don’t think Alexandros would disagree - part of his point is that with so few studies, you can support any kind of heterogeneity you want. He hasn’t so much found real heterogeneity, as shown that with a funnel plot of so few studies, you never know.

This funnel plot shows the viral clearance results. This is important because it’s one of the areas where ivermectin studies have most commonly found a signal of efficacy. But Dr. Bitterman has also looked for publication bias in ivermectin studies overall. For example, here’s a plot of all studies on IVMMeta.com:

Here’s just the mortality studies:

These aren’t confounded by day of PCR test (they’re mostly not testing PCR)⁸, but they’re at least as skewed as the PCR plot. Either by coincidence every funnel plot about ivermectin results that you can generate is confounded by a different kind of heterogeneity, or this area with high incentives for publication bias, investigated by types of studies where publication bias is extremely likely, has publication bias.

VI. What’s Happened Since 2021?

Since I wrote my post, several new ivermectin studies have come out. I briefly looked to see if there were more small positive trials of the type analyzed above, but didn’t find too many of them. Studying ivermectin unsurprisingly seems to have gone out of vogue among ordinary doctors (or I’m getting worse at finding the studies). But there were three new big RCTs - I-TECH from Malaysia, and ACTIV-6 and COVID-OUT from the United States. All three found no effect. With these studies (notably from low-parasite areas) meta-analyses of mortality no longer show any effect.

You can read Alexandros’ criticisms of the ACTIV-6 trial here (1, 2, 3, 4), and his criticisms of I-TECH here. I don’t think he’s criticized COVID-OUT yet, but I’m sure it’s only a matter of time. However skeptical you were of ivermectin efficacy in 2021, you should be more skeptical now.

It’s not directly related to ivermectin, but around the same time some studies seemed to show that another medication, fluvoxamine, did treat COVID effectively. I welcomed those studies and said that, although they had not yet firmly proven that fluvoxamine worked, the risk-benefit ratio was high enough that doctors should prescribe it and patients should ask for it. Later trials, including some of the same ones that found no effect for ivermectin, also found no effect for fluvoxamine. I no longer believe that fluvoxamine is likely to help COVID. The whole “repurpose existing drugs against COVID” idea seems to have been a big wash.

VII. Conclusions

a. I ended my original post by saying I had 85 - 90% confidence that ivermectin didn’t have clinically significant effects. After a year, I’m upgrading that to 95% confidence, for a few reasons. First, the many new well-done negative trials that have come out, mentioned above. Second, the better visualizations of publication bias, that have convinced me that this is more of a problem than I previously thought. Third, feeling like all of this analysis has actually gotten somewhere.
I started looking into this because I wanted to know why studies so often fail, or return contradictory results. I appreciated the perspective Dr. Bitterman relayed to me on this, which is - come on, this always happens, we do Phase 1 trials on a drug, it looks promising, and then we do Phase 3 trials and it fails, this is how medicine works. He’s right and this is the correct attitude towards drug development.

But in other fields - the one that comes to mind now is declining sperm count, which I’m trying to write an article on - there aren’t the equivalent of Phase 3 trials. Just a hodgepodge of smaller or bigger studies, probably about as good as the ones that find ivermectin improves viral clearance, producing a hodgepodge of noisy results. Then some statistician draws a line through the noise and tells us the line is pointing up and that means we should be worried. Should we? What about silexan for anxiety? There are five studies - better than the worst ivermectin studies, but nowhere close to Phase 3 - and they find positive results. Silexan would revolutionize the treatment of anxiety and help avoid medications with much worse side effects. Do I start recommending it as a first-line treatment?

My first post was very limited progress towards this goal; I felt able to make the case ivermectin wasn’t useful, but not to explain why so many studies found the opposite. I feel slightly more confident now…

…that I have reasonable explanations for what went on in about 26 of the 29 studies. Two of the remainders - Aref and Elalfy - give off gestalt vibes of untrustworthiness. The last one, Bukhari, remains mysterious. It’s a viral clearance study, and this is an area without lots of publication bias. But it got results p = 0.001, and you would need a thousand studies in file drawers to produce one like that. Weird.

b. Alexandros has grappled with this same mystery, and taken a different route.

Alexandros has wrestled with these same problems, but solved them differently. He’s looked for (and found, as per his judgment) flaws in the few big trials, which lets him keep most of the rest. I haven’t read his whole corpus on this, but am generally not impressed. He does find some issues, but all trials will have some minor errors or ambiguities. And the big trials will have more documentation you can look through to find things to nitpick or be confused by.

Alexandros has previously stressed that he doesn’t mean to express certainty that ivermectin works. He calls his style of reasoning Omura’s Wager, by reference to Pascal’s Wager. If you use ivermectin, and it doesn’t work, then you’ve wasted your time and maybe gotten a few minor side effects. If you don’t use ivermectin, and it does work, then you’ve missed out on a potentially life-saving medication. Therefore (he concludes) given even a little remaining uncertainty about whether ivermectin works, you should use it. This isn’t how mainstream medicine thinks about this in any other context, and if true it’s much more interesting than a debate around one particular repurposed dewormer. I try to respond in Pascalian Medicine.

But since then, there’s been more evidence that ivermectin at the doses used in COVID studies might be harmful. Both the I-TECH study and Dr. Bitterman’s analysis found more severe side effects in ivermectin groups compared to placebo. Not only does this challenge ivermectin in particular, but using it as a test case calls Omura’s Wager into question more generally.
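To make the shape of that argument concrete, here is a minimal toy sketch of the wager's expected-value logic. Every number in it is made up for illustration - none of them are estimates from any study, from Alexandros, or from me - and the only point is that the wager goes through if you assume the downside is zero, and can flip once side effects carry real expected harm.

```python
# Toy expected-value sketch of the Omura's Wager argument.
# All numbers are illustrative placeholders, not estimates from any study.

def expected_value(p_works: float, benefit: float, expected_harm: float) -> float:
    """Expected value of taking the drug, in arbitrary utility units."""
    return p_works * benefit - expected_harm

p_works = 0.05   # hypothetical residual probability that the drug works at all
benefit = 100.0  # hypothetical value of a life-saving effect, if it does work

# The wager's premise: taking the drug costs essentially nothing.
print(expected_value(p_works, benefit, expected_harm=0.0))   # 5.0  -> take it

# Once side effects carry nontrivial expected harm, the sign can flip.
print(expected_value(p_works, benefit, expected_harm=10.0))  # -5.0 -> don't
```

The substantive disagreement is entirely about what numbers belong in those slots, which is exactly what the new side-effect evidence bears on.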
Acknowledging that the Wager debate is interesting, I also asked Alexandros to, gun-to-his-head, tell me how likely he thinks it is that ivermectin works. To his credit, he gave a clear response:
I said 5% above, so it seems like we still haven’t converged. Sad!

c. I made several mistakes in the first version of this post:

First, I made a major mistake on one of the studies I looked at (Biber).

Second, I was directionally correct on others, but too quick to accuse people of fraud on data that was merely suggestive, and to mock them for it. Nobody can be right all the time, but I try not to be simultaneously wrong and smug. I failed at that here and need to do better at Principle of Charity, even when looking at twenty-nine boring and often terrible studies one after the other until my brain turns to mush.

Third, I messed up the informal “meta-analysis”. I said in the post it was probably wrong and I was just churning something out to get a vague idea, but it was still embarrassing and (weakly) misleading.

Fourth, I probably overemphasized the importance of worms, and underemphasized the importance of publication bias. This left me confused about some non-mortality-related results, and overall more confused about the literature and less willing to dismiss ivermectin than I should have been.

Realistically, this project was outside my expertise and competence level. But I’m not exactly apologizing for publishing it. I think the discourse on this was terrible. The mainstream media just repeated that Elgazzar was a fraud, again and again, without mentioning the dozens of other studies that found positive effects of ivermectin. Academia knew what it was doing, but mostly failed to communicate its reasoning to the public, beyond giving short snippets for interviews harping yet again on how Elgazzar was a fraud. A few data detectives and epidemiologists with Twitter accounts took potshots at specific claims, but almost never in a way that explained the bigger picture (Dr. Bitterman was the main exception).

In contrast, the pro-ivermectin side did an amazing PR job. IVMMeta was tireless, fantastically designed, and presented exactly the kind of gestalt picture nobody else bothered to touch. Alexandros dedicated much of his time over two years filling in details of the pro-ivermectin case and gathering a network of amateur epidemiological detectives contributing to the effort. This replicates a pattern I’ve seen again and again: the mainstream shies away from deep analysis of contrarian theories, not wanting to dignify them with tough analysis, while obsessive contrarians produce compelling and comprehensive guides to their side of the argument.

I wanted to gather a gestalt argument for the consensus in one place and help relieve this embarrassing state of affairs. I give myself a C+ for results but an A for effort. I wish other people would do this so I could stay in my lane of sniping at people with bad opinions about antidepressants. I appreciate everyone who helped gather this information, question its assumptions, fortify key points, and beat it into shape, including Alexandros.

1 Alexandros responds: “These are quotes by a local official who is involved in the case. They're not the BMJ's opinion. I believe, but am not sure, that Cadegiani is in a legal dispute with this person. And in any case, they're non-factual and hyperbolic. I can only look at the facts of the matter. The fact that this official is making hyperbolic statements can be interpreted as an argument on either side of the debate.”

2 Alexandros responds: “This is extremely common in studies, and accepted as normal. TOGETHER is missing ‘time since symptom onset’ for 23% of its patients, and age for 7% of its patients.
Each of these are inclusion criteria, so one wonders how these patients were even included in the first place. And yet, nobody bats an eyelash. In principle, I have no objection if one wanted to make "missing data" a factor to filter by, but then every study has to be looked at the same way, not just Babalola. More here.”

3 Alexandros adds: “This is from the buzzfeed piece: ‘Javier Farina, an infectious diseases doctor at Hospital Cuenca Alta and a member of its ethics committee, acknowledged that staff members may have individually participated and noted that it is common in Argentina for employees to work at different hospitals.’ -- I don't think this one is ‘according to Carvallo’. Surely Buzzfeed has to substantiate that there is in fact an issue with an adult medical professional enrolling in a study of their choice.”

4 Alexandros writes: “I checked the original, and I'm not sure what this is referring to. Was I irritated when I was writing this? Sure. But I don't see any case where I make any aspersions about your intelligence or fling any insults whatsoever. If I did write something along those lines, please point it out. I tried quite hard to keep my tone as neutral as I possibly could, but I can see how something may have slipped.”

I appreciate this. I acknowledge he did not literally insult my intelligence in so many words, but I found paragraphs like:
…pretty grating, given that I don’t feel like Alexandros understood or responded to the reasons why I was uncertain about how random this trial was.

5 Alexandros writes: “But patients who come in the weekend can totally differ systematically from patients who come in the weekdays.” First, I think (though this is very speculative) that the trial used a six-day week to avoid Friday, the Muslim equivalent of the weekend. Second, although this is a possible source of bias, it’s a pretty weak one, and I’m still interested in whether the only source of bias is weekend vs. weekday, or whether this is actually nonrandom in some more general sense.

6 Alexandros writes: “Given that these new trials were mostly in late '21 and '22, and the mortality in them was extremely low, I don't believe they would alter the results of my analysis, since studies with few if any events get very low weights in this style meta-analysis. I haven't re-run it, but if it will make a difference, let me know and I'll give it a go. I also think (but won't take it the wrong way if you ignore me on this) it is unfair to introduce new data mid-argument. As you probably know, I have serious objections about these negative American studies (the most serious of which I have not yet published) and believe they are worthy of a separate conversation. My point, i believe, has consistently been that the confidence people are projecting that ivm doesn't work isn't supported by the evidence. either strongyloides was particularly strong as an explanation at the time, or there were other hypotheses we should have considered.” I agree it’s in some sense unfair if I use introduction of new data mid-argument to defend myself, and I don’t mean to do that. I do think it’s always fair to get new data to try to resolve the original question. And if it turns out a theory was right about the original question, this does sort of defend at least the spirit of the theory, if not its implementation.

7 Thanks to Dr. Bitterman for helping clarify this.

8 See here for a separate look at timing as a confounder.