Lately, anti-vaccers have been touting a “new” vaccinated vs. unvaccinated study that purportedly shows that vaccines are associated with all manner of detrimental health conditions. I put the word “new” in quotes because this study was actually accepted for publication once before, but the journal that had accepted it (Frontiers in Public Health) retracted it before it actually came out. Following that retraction, the authors managed to get it published in the Journal of Translational Science as, “Pilot comparative study on the health of vaccinated and unvaccinated 6- to 12-year-old U.S. children.” That journal has now also retracted it, but I somehow doubt that is going to matter to anti-vaccers. This is, after all, the group that continues to praise Andrew Wakefield, despite his work being retracted due to extremely clear evidence that he falsified data, violated ethical rules, and was in it for the money. Therefore, I want to actually go through this study and explain why it is absolutely terrible. As usual, my goal is to provide a worked example of how to critically analyze scientific papers, rather than simply addressing this one study (see my previous posts here, here, here, and here).
Note: I started writing this post several days ago, then became busy with my actual research, and by the time I got back to it, several excellent blogs/sites had already dealt with it, but since I had already started it, I figured I might as well finish it and add my two cents to the discussion.
Biased sampling design
Before I talk about the methods used in the paper itself, I need to talk about some basic concepts in designing studies. The first rule of experimental design is that you need to get representative samples, because failing to get a proper representation of the population will give a biased result. Imagine, for example, that I conducted a poll to see how popular Donald Trump was, and to do that, I sent the poll to multiple Republican organizations and asked them to distribute it among their members. Obviously, I would get a very biased result, because my sampling was not representative of the entire population (i.e., I sampled a group that predominantly likes Trump, rather than sampling the population as a whole). Conversely, if I had sent that poll to liberal groups, I would have gotten a result that was biased in the opposite direction. Do you see the point? If you use a biased sampling method, you will get a biased result. That is why it is so important to randomize your sampling rather than doing what is known as “convenience sampling,” where you get your data from a group that is convenient, rather than a group that is representative. For example, if I wanted to conduct a study on how people felt when eating organic vs. conventional food, it might be convenient for me to stand outside of a Whole Foods and poll people as they exit, but that would obviously be an extremely biased design. Of course they are going to say that eating organic makes them feel healthier; that is why they are there. So whenever you are reading a paper, take a good look at how the authors did their sampling, and make sure that it wasn’t biased (note: when I said that this was rule #1, I was being literal; it is literally the first thing that I was taught in the first stats course I ever took).
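To make that first rule concrete, here is a minimal Python sketch (the population, the two groups, and all of the approval rates are invented purely for illustration) of how a convenience sample drawn from one subgroup skews an estimate:

```python
import random

random.seed(0)

# Invented population: overall approval is 50%, but it is concentrated
# in one subgroup (90% in group A, 10% in group B).
population = ([("A", random.random() < 0.9) for _ in range(5000)]
              + [("B", random.random() < 0.1) for _ in range(5000)])

def approval(sample):
    return sum(approves for _, approves in sample) / len(sample)

# Representative sampling: draw randomly from the whole population.
random_sample = random.sample(population, 1000)

# Convenience sampling: draw only from the easy-to-reach group A.
convenience_sample = random.sample([p for p in population if p[0] == "A"], 1000)

print(f"true approval rate:  {approval(population):.2f}")          # ~0.50
print(f"random sample:       {approval(random_sample):.2f}")       # ~0.50
print(f"convenience sample:  {approval(convenience_sample):.2f}")  # ~0.90
```

The convenience sample is internally consistent and even comfortably large, but it answers a question about group A, not about the population, which is exactly the problem with distributing a vaccine survey through homeschool networks.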
So, how did this paper do its sampling? You probably guessed it; it used convenience sampling.
“the object of our pilot study was not to obtain a representative sample of homeschool children but a convenience sample of unvaccinated children of sufficient size to test for significant differences in outcomes between the groups.”
That’s not how this works. Statistical tests are not magic wands. They rely on strict assumptions, one of which is that your sampling was done correctly. If you put garbage in, you are going to get garbage out. So, saying that you used convenience sampling so that you could run statistics makes no sense, because the usage of convenience sampling prevents you from drawing valid statistical inferences.
To give you some more details about what this study did, it sent a survey to homeschool groups and asked them to distribute it among their members. This survey asked various questions about the children’s vaccination status, health, conditions during pregnancy, etc. This is obviously a huge problem, because, as the paper itself admits, there is a disproportionate number of anti-vaccers among homeschoolers. Indeed, it is not at all difficult to imagine anti-vaccers eagerly sending the survey to their anti-vax friends and telling them to fill it out (I have seen this behavior on similar online surveys). Thus, the paper clearly had a biased sample, so of course it found that vaccines cause problems. When you use an extremely biased sampling design like this, you expect to get a biased result, just as I would expect to find that Trump is popular if I used a biased sample of Republicans. Given this design, it would have been shocking if the paper found anything other than “vaccine injuries.”
We could actually stop right here and be done with this paper, because this biased sampling design completely precludes any meaningful comparisons. We can already see that the results are meaningless, but wait, there’s more.
All the results were self-reported
Not only did this study draw a biased sample, but it also used an extremely unreliable method of getting data from that sample. You see, the survey was entirely, 100% self-reported. Parents never had to send in any medical documents. Rather, they simply reported the vaccines their child received, the health conditions, etc. This is a terrible data collection design, because memories are notoriously easy to influence, and self-reporting like this frequently fails to give reliable results (especially when using a biased sample).
The authors described several ways that they tried to eliminate these biases and mitigate the reliability issues, but they were not at all adequate. First, they said that parents were instructed to consult their actual medical records, rather than relying on memories. There is, however, absolutely no guarantee that parents actually did that. People are lazy and generally bad at following instructions, so it is really hard to believe that all the parents in this study actually looked up their child’s medical records. Similarly, the authors told parents only to report conditions that were actually diagnosed by a physician, but once again, there is no way to know if parents actually did that, and, in fact, it is extremely likely that some parents didn’t. Imagine an anti-vaccine parent who is convinced that vaccines are dangerous and injured their child, but, since they don’t trust doctors, never actually had the condition diagnosed. Do you honestly think that this parent isn’t going to jump at the chance to report their child’s “vaccine injury,” even though they never got an official diagnosis? People lie on medical forms all the time; why should this be any different? Further, even if we assume that no parents deliberately lied, that doesn’t address the issue of “alternative” health practitioners. A parent may have had a condition “diagnosed” by a naturopath, which, in their mind, counts as a medical diagnosis even though it is actually nothing of the kind.
Now, you may say that this is all very speculative and I can’t actually prove that parents didn’t strictly obey the rules set out by the authors, and you are correct, but you’re missing the point. Reliability is one of the requirements for a good scientific study, and I shouldn’t have to assume that parents reported the data accurately. The fact that I am forced to make that assumption makes this study unreliable. In other words, I don’t know that parents relied on memories, lied, used the diagnoses of pseudo-physicians, etc., but I also don’t know that they didn’t do any of those things, and that is the problem. It means that we have no reason to be confident that these data are accurate. In contrast, if actual medical records had been supplied by the parents, then we would know that the reports are accurate. Additionally, it is worth mentioning that it would not take many dishonest and/or lazy parents to bias the study. Their sample sizes were quite small (especially for the children who had certain conditions), so just a handful of parents who provided bad info could seriously skew the results.
To be fair, the authors state that they chose not to collect actual records simply because, after talking to several homeschool groups, they realized that doing so would result in a much smaller sample size and, therefore, prevent them from doing the study. It is probably true that they had no choice but to do it this way, but that doesn’t automatically make the method valid. In other words, given the choice of either not doing the study at all or doing the study with an extremely unreliable method, they should have chosen the former. Unreliable data are often worse than no data at all.
Finally, the authors said, “Dates of vaccinations were not requested in order not to overburden respondents and to reduce the likelihood of inaccurate reporting,” but this seems totally backwards and nonsensical to me. Think about it. They already instructed the parents to look at the medical records rather than relying on memories, so if the parents already have the forms in hand, how is it an “overburden” to ask them to write down the date? Indeed, asking for a date seems like it should make parents more likely to actually consult their records, rather than going from memory. It’s almost like the authors tried to make this study as terrible as possible.
Differing numbers of doctor visits
There is one final flaw in their sampling/data collection that needs to be addressed. Namely, anti-vaccine parents and pro-vaccine parents generally differ in the frequency with which they take their children to the doctor. Indeed, this study even found that these groups differed in how frequently they visited the doctor. This is another big problem, because if one group visits the doctor more often, then you expect them to be diagnosed with more conditions by mere virtue of the fact that they see doctors more regularly (thus maximizing the odds of detecting any problems). This automatically biases the study towards higher numbers of problems in vaccinated children.
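A quick simulation makes this mechanism obvious. Everything here is invented for illustration (the check-up rates, the diagnosis rate, the group sizes): in this toy world, vaccination has zero causal effect on the condition, yet the raw comparison shows a large “vaccine effect” purely because one group sees doctors more often:

```python
import random

random.seed(2)

def simulate_child(vaccinated):
    # Hypothetical confounder: regular check-ups are far more common
    # in the vaccinated group in this toy world (90% vs. 30%).
    checkups = random.random() < (0.9 if vaccinated else 0.3)
    # A diagnosis can ONLY happen if the child sees a doctor regularly;
    # vaccination itself has no effect whatsoever in this simulation.
    diagnosed = checkups and random.random() < 0.2
    return checkups, diagnosed

vax = [simulate_child(True) for _ in range(5000)]
unvax = [simulate_child(False) for _ in range(5000)]

def rate(group):
    return sum(d for _, d in group) / len(group)

print(f"diagnosis rate, vaccinated:   {rate(vax):.3f}")    # ~0.18
print(f"diagnosis rate, unvaccinated: {rate(unvax):.3f}")  # ~0.06

# Stratifying by the confounder makes the "effect" vanish:
for flag in (True, False):
    v = [d for c, d in vax if c == flag]
    u = [d for c, d in unvax if c == flag]
    print(f"regular check-ups={flag}: vaccinated {sum(v) / len(v):.3f} "
          f"vs unvaccinated {sum(u) / len(u):.3f}")
```

Within each stratum the two groups are indistinguishable; the entire difference in the raw rates is an artifact of who gets examined.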
Bad statistics

Finally, we get to the statistics, and they are a disaster. The first thing to note is that they compared vaccinated and unvaccinated children for over 40 conditions. That is potentially problematic, because with that many tests, you expect to get some false positives just by chance. This is what is known as a type I error. I have explained this in detail elsewhere (here, here, and here), so I’ll be brief. Statistical tests give you the probability of getting a difference as large or larger than the one that you observed if there is not actually a difference between the two groups from which your samples were taken. Typically, in biology, we set a threshold at 0.05 and say that if the probability (P value) is less than 0.05, then the result is statistically significant. In other words, for a statistically significant result, there is less than a 5% chance that a difference that large or larger would have arisen by chance if there were actually no difference between the groups. If you think about that for a second, though, it should become obvious that you will sometimes get false positives. Further, the more tests that you run, the higher your chance of getting at least one false positive becomes. Therefore, whenever you are using lots of different tests to address the same question (e.g., “are vaccines dangerous?”), you should adjust your significance threshold (alpha) based on the number of tests you run. However, the authors of this study failed to do this, and given the enormous number of tests they ran, some of their results are probably just statistical flukes. (On a side note, their tables are also very deceptive, because they only showed the results that were significant or nearly significant, which makes it look like vaccination was significantly associated with almost everything, when in reality, only a handful of their 40+ conditions were significant. This is not an honest way to present the results.)
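To put numbers on how fast this error compounds: with a 0.05 threshold and 40 tests (treated as independent here, which is a simplification), the chance of at least one false positive can be computed directly:

```python
alpha = 0.05
n_tests = 40

# Chance of at least one false positive across n independent null tests:
# one minus the chance that all 40 correctly come back non-significant.
fwer = 1 - (1 - alpha) ** n_tests
print(f"P(at least one false positive) = {fwer:.2f}")  # 0.87

# A Bonferroni correction divides alpha by the number of tests, which
# pulls the family-wise error rate back down to roughly 0.05.
bonferroni_alpha = alpha / n_tests
print(f"Bonferroni-adjusted threshold  = {bonferroni_alpha:.5f}")  # 0.00125
```

Bonferroni is the crudest of several standard corrections, but even it would have wiped out most of this paper’s “significant” results.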
Additionally, beyond testing numerous different conditions, they also went on a statistical fishing trip with how they structured the tests. For many conditions, they made comparisons among fully vaccinated, partially vaccinated, and totally unvaccinated children, then also made comparisons between all children who received at least one vaccine and all children who received no vaccines. That is a problem because, again, the more tests you do, the more likely you are to get false positives. You can see why this is a problem when you look at things like ADHD and ASD. When they used all three groups, ADHD was not significant, and ASD was barely significant (though it wouldn’t have been if they had controlled their type I error rate correctly), but when you jump down to the tests with fully and partially vaccinated children lumped together, suddenly the results become more strongly significant (if we ignore the type I error problem). Thus, by doing multiple tests, they were able to get one to be significant. That is not okay. It is what is known as “P hacking.” You can’t just keep manipulating and retesting your data until something significant falls out. The correct way to do this would have been to define the groups ahead of time (a priori) and then only run the comparisons on those pre-defined groups.
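This regroup-and-retest strategy inflates the false-positive rate even in a world where vaccines do nothing at all. Here is a simulation sketch (the group sizes and the 10% base rate are invented, and it uses a simple two-proportion z-test as a stand-in for the paper’s chi-square tests): each individual test holds its 5% error rate, but getting to report “either test was significant” exceeds it:

```python
import math
import random

random.seed(1)

def two_proportion_p(x1, n1, x2, n2):
    """Two-sided z-test p-value for a difference in two proportions."""
    p = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0
    z = abs(x1 / n1 - x2 / n2) / se
    return math.erfc(z / math.sqrt(2))

n_sims, alpha = 5000, 0.05
hits_single = hits_either = 0
for _ in range(n_sims):
    # Null world: the condition occurs in 10% of children regardless
    # of vaccination status.
    full = sum(random.random() < 0.1 for _ in range(200))
    part = sum(random.random() < 0.1 for _ in range(100))
    none = sum(random.random() < 0.1 for _ in range(200))
    # Pre-specified test: fully vaccinated vs. unvaccinated.
    p1 = two_proportion_p(full, 200, none, 200)
    # Second bite at the apple: lump fully + partially vaccinated together.
    p2 = two_proportion_p(full + part, 300, none, 200)
    hits_single += p1 < alpha
    hits_either += (p1 < alpha) or (p2 < alpha)

print(f"false-positive rate, one pre-specified test: {hits_single / n_sims:.3f}")
print(f"false-positive rate, report either test:     {hits_either / n_sims:.3f}")
```

The two tests share most of their data, so the inflation is modest here; with 40+ conditions and several groupings each, it compounds dramatically.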
Before I move on, I also want to point out that lumping the partially vaccinated and fully vaccinated children together makes little sense, because the partially vaccinated group should be all over the place. A child who received only one vaccine, for example, would surely be more similar to the unvaccinated group than to the fully vaccinated group. This is yet another way in which the study was not designed reliably.
The next problem is confounding factors (i.e., things other than the trait you are interested in that differ between your groups and have the potential to influence the results). You see, the chi-square tests that the authors used are quite simplistic, and they have no mechanism for dealing with confounding factors (which is, once again, why it is so important to randomize your samples rather than using convenience sampling). The things being tested are, however, quite complex. For many learning disabilities, for example, it is well known that they are affected (or at least tied to) race, sex, genetics, etc. Therefore, unless the two groups you are comparing are similar with regards to all of those factors, your tests aren’t valid. However, the authors gave no indication of matching their groups by race, sex, medical history, etc. Indeed, the authors even acknowledged that there were differences between the two groups with regards to at least some factors (like the use of medicines other than vaccines). Therefore, we can automatically chuck out all the comparisons that used “unadjusted data.” In other words, all the results in tables 2 and 3 are totally meaningless. All those comparisons between vaccinated and unvaccinated children are utterly worthless because the experiment was confounded and the authors didn’t account for that. So even if they had sampled randomly, used actual medical records, and controlled the type I error rate, those results would still be bogus.
Next, the authors move on to look specifically at “neurodevelopmental disorders (NDD),” but to do this, they combined children with ADHD, ASD, and any learning disability. That is not a valid, reliable way to do this, because those things are quite different from each other. You can’t just lump any learning disability in with ASD and then go fishing for a common cause. They aren’t the same thing, and there is no reason to put them together.
Further, at this point their methods become really unclear. They say that they used, “logistic regression analyses,” but they don’t give any details about how the model was set up. There are a lot of assumptions that need to be met before you can use this method, and they don’t state whether or not those assumptions were met. Similarly, it is very easy to set up these models incorrectly, and they give almost no information about how theirs was constructed. I need to know things like whether they tested for multicollinearity, but that information isn’t given. Further, based on what little description they do give, it seems like they almost certainly over-fit the model by including meaningless categories like religion. Things get even worse from there, because they start talking about adjusting the model based on significant patterns, but they give no explanation of how they made those adjustments. A proper paper should not make you guess about how the statistics were done. In other words, when you read a paper, ask yourself the following question, “if I had their data set, could I use the information in the paper to exactly replicate their statistical tests?” If the answer is, “no,” then you have a problem, and you should be skeptical about the paper. The type of extremely terse description that is given in this paper is totally unacceptable.
Now, at this point, you might protest and argue that I am assuming that they set up the model incorrectly. That is, however, not what I am doing. I’m not saying that they did it wrong; rather, I am saying that I don’t know if they did it right. A good paper should describe the statistics in enough detail that I know exactly what they did, and this paper doesn’t do that. It does not make it possible for me to evaluate their methods. To put that another way, extraordinary claims require extraordinary evidence, and if you want to say that vaccines cause neurological disorders, then you are going to need some extraordinary evidence, and a paper where I have to assume that the authors knew what they were doing simply doesn’t cut it. It’s not good enough. Further, given all of the other problems with this paper, it seems pretty clear that the authors did not know what they were doing, so I am not at all willing to give them the benefit of the doubt when it comes to logistic regression.
In short, this paper is utterly terrible from start to finish. It used an extremely biased sampling design, it used an unreliable data collection method, and it used bogus statistical tests that were poorly explained and failed to control confounding factors and the type I error rate. Indeed, when you look at how this study was designed, it was set up in a biased way that almost guaranteed that it would find “evidence” that vaccines are dangerous. It would have been shocking if such a horribly designed study found anything else. This isn’t science, not even close. It is a junk study that very much deserved to be retracted.
Related posts

- 10 steps for evaluating scientific papers
- 12 bad reasons for rejecting scientific studies
- Does Splenda cause cancer? A lesson in how to critically read scientific papers
- Is the peer-review system broken? A look at the PLoS ONE paper on a hand designed by “the Creator”
- Most scientific studies are wrong, but that doesn’t mean what you think it means
- No, homeopathic remedies can’t “detox” you from exposure to Roundup: Examining Séralini’s latest rat study
- Peer-reviewed literature: What does it take to publish a scientific paper?
- The hierarchy of evidence: Is the study’s design robust?
- Understanding abstracts: Does the study say what you think it says?
- Who reviews scientific papers and how do reviews work?