COVID comorbidities are not analogous to car crashes: Debunking the 6% mortality claim

Recently, the CDC released data on COVID comorbidities, including data showing that 6% of COVID-19 deaths only listed COVID on the death certificate, while the remaining 94% of COVID deaths also listed other conditions. Many have jumped on this as proof that COVID is far less deadly than previously claimed, and they are arguing that most reported COVID deaths are actually just people who died of some other condition while happening to have COVID. In particular, I keep seeing an analogy of someone who has COVID getting hit by a car, then the death being attributed to COVID. This is a very bad analogy (and faulty argument in general) that horribly mischaracterizes these data. So, I want to briefly explain what is actually going on.

First, you need to realize that when a patient dies, doctors list all of the factors that contributed to the death. This often includes multiple conditions, at which point we call them “comorbidities.” In the case of COVID, two main things are happening. First, in some cases, people have a pre-existing condition that interacts with COVID and makes them more likely to die from COVID. Second, COVID leads to conditions that then contribute to the death.

Let’s start with the pre-existing condition situation. We know that people with some health conditions are more prone to die from COVID than people without those conditions, because those conditions make them more vulnerable to COVID. Thus, there is an interaction between COVID and the pre-existing condition, with both contributing to the death. Importantly, however, in most cases, the person would not have died at this particular point in time had it not been for COVID. In other words, something like an existing respiratory problem makes people more sensitive to COVID, resulting in a higher death rate when infected with COVID. That does not mean that COVID wasn’t a key factor in their deaths. It is simply that it was not the only factor.

By way of analogy, imagine that someone with asthma gets trapped in an environment with lots of smog, ultimately resulting in an inability to breathe and subsequent death. What killed them? Well, both the asthma and the smog played a role. The smog was a serious problem because of the asthma, but conversely, they could have kept on living with the asthma had it not been for the smog. If we could have prevented them from being exposed to the smog, they would have lived.

Even so, for many people, COVID is fatal because of interactions with other conditions, but that still means that COVID was fatal. It still means that they would have lived had it not been for COVID.

To give one final analogy, imagine a disease that is far deadlier in men than in women. Imagine that we look at the mortality statistics from that disease and see that 94% of deaths were from men. It would clearly be absurd to say, “they didn’t die from the disease, it was being a male that killed them.” That would obviously be nuts. It would be apparent to everyone that there was an interaction between the disease and sex that causes men to be more sensitive to it. Even so, there are interactions between many pre-existing conditions and COVID that make people with those conditions more sensitive to COVID and more likely to die from it.

On the flip side, many of the reported comorbidities are actually caused by COVID. Look at the data from the CDC. The single most common comorbidity category* (68,004) was influenza/pneumonia. These diseases are often secondary infections that happen as a result of viral infections. Similarly, respiratory failure was present in 54,803 cases. Again, this is something that we know COVID causes. So many of these comorbidities are actually caused by COVID!

*Technically, the most common category was “other” which includes a very wide range of conditions that were grouped together because each was too uncommon to merit its own category. Thus the influenza/pneumonia category was the most common category for discrete diseases, rather than the large hodgepodge of conditions.

By way of analogy, the argument being made by science deniers is no different from someone bleeding out from a gunshot wound, then someone else saying, “bullets aren’t dangerous, because she died from blood loss, not the bullet.” That’s obviously a dumb argument. She only lost the blood because of the bullet. Even so, many people are only dying from conditions like respiratory failure or heart failure because of COVID19.

It is also worth noting that, as is often the case, this argument is straight out of the anti-vaccine playbook. For diseases like measles, secondary infections with diseases like pneumonia often contribute to children’s deaths. Thus, anti-vaccers incorrectly argue that measles isn’t deadly because the pneumonia is what killed them. Just like COVID and my gunshot example, however, they only developed pneumonia because of measles.

So now, with all of that in place, let’s circle back to the analogy of someone getting hit by a car. I like analogies a lot. I have frequently argued that they are valuable for testing whether consistent reasoning is being applied. However, as I have explained before, for the analogies to be useful, they must follow the same logical structure as the original argument. That is very clearly not the case here. Someone who happens to have COVID getting hit by a car is a very, very different thing from either someone with a pre-existing condition that predisposes them to complications from COVID dying from an interaction between the condition and COVID or COVID itself causing a secondary condition.

Do you see the difference? The vast majority of comorbidities listed are directly related to COVID either as a factor that exacerbates the situation or as a result of COVID. In contrast, the car accident has nothing to do with COVID. They are not analogous, and anyone who would use such a clearly terrible argument obviously does not know what they are talking about.

Having said all of that, there are almost certainly some cases in this database where COVID truly wasn’t the cause. There are probably some cases where someone who had COVID just happened to have a heart attack that would have happened without the COVID, or where someone who had COVID was in an accident, but when you start looking closely at the data, those are clearly a very tiny minority, and the vast majority of comorbidities relate to COVID. Indeed, beyond these data and all the data looking at how COVID attacks the body, we also know that there have been far more deaths this year in the US than there were during the same time period last year (Weinberger et al. 2020). Indeed, there are more excess deaths than the total number of reported COVID deaths. Understanding exactly what that means is very complicated because there are many contributing factors. We may be underestimating COVID deaths, but also, there may be increased deaths due to factors like people not seeking medical help for conditions for which they normally would seek help. Conversely, things like a decrease in car accidents could pull the number the other direction. However, several pieces of evidence (such as a spike in excess deaths in places that had large outbreaks with many reported COVID deaths; e.g., New York City) indicate that COVID is a key factor in the number of excess deaths seen this year, and it is very unlikely that we are grossly overestimating the COVID mortalities.

As others have pointed out, the correct way to look at this 6% figure is not that only 6% of reported COVID deaths were actually from COVID. Rather, it means that of all the people who died from COVID, 6% did not have any other reported conditions. In other words, these data show that some people are more vulnerable to COVID than others due to existing health conditions (which we already knew) and COVID often results in secondary problems which contribute to patients’ demise (again, which we already knew). Stop trying to twist science to fit your personal agenda and look rationally at the facts. Think critically and don’t blindly believe something just because you saw it on Facebook or Twitter.


What does “statistically significant” mean?

Lately, social media has been flooded with people sharing studies about various aspects of COVID. This is potentially great. I’m all for people being more engaged with science. Unfortunately, many people lack a good foundation for understanding science, and a common point of confusion is the meaning of “statistically significant.” I’ve written about this at length several times before (e.g., here and here), so for this post, I’m just trying to give a brief overview to hopefully clear up some confusion. In short, “statistically significant” means that there is a low probability that a result as great or greater than the observed result could arise if there is actually no effect of the thing being tested. Statistical tests are designed to show you how likely it is that the sample in a study is representative of the entire population from which the sample was taken. I’ll elaborate on what that means below (don’t worry, I’m not going to do any complex math).

Let’s imagine a randomized controlled drug trial where we take 100 patients, randomly split them into two groups (50 people each), give one group a placebo, give the other group the drug, then record how many of them develop a particular disease over the next month. In the control group, 20% of patients (10 individuals) develop the disease, whereas in the treatment group only 10% (5 patients) developed it. Does the drug work?

This is where the confusion comes in. Many people would look at those results and say, “obviously it helped, because 10% is lower than 20%”. When we do a statistical test on it (in this case a chi-square test), however, we find that it is not statistically significant (P = 0.263), from which we should conclude that this study failed to find evidence that the drug prevents the disease. You may be wondering how that is possible. How can we say that taking the drug doesn’t result in an improvement when there was clearly a difference between our groups? How can 10% not be different from 20%?

To understand this, you need to understand the difference between a population and sample and the reason that we do these tests. This hypothetical experiment did find a difference between the groups for the individuals in the study. In other words, the treatment and control groups were different in this sample, but that’s not very useful. What we really want to know is whether or not this result can be generalized. We really want to know whether, in general, for the entire population, taking the drug will reduce your odds of getting the disease.

To elaborate on that, in statistics, we are interested in the population mean (or percentage). This may be a literal population of people (as in my example) but it applies more generally, and is simply the distribution of data from which a sample was taken. The only way to actually know the population mean (or percentage) is to test the entire population, but that is clearly not possible. So instead, we take a sample or subset of the population, and test it, then apply that result to the population. So, in our example, those 100 people are our sample, and the percentages we observed (10% and 20%) are our sample percentages.

I know this is starting to get complicated, but we are almost there, so bear with me. Now that we have sample percentages we want to know how confident we can be that they accurately represent the population percentages. This is where statistics come in. We need to know how likely it is that we could get a result like ours or greater if there is no effect of the drug, and that’s precisely what statistical tests do. They take a data set and look at things like the mean (or proportion), the sample size, and the variation in the data, and they determine how likely it is that a result as great or greater than the one that was observed could have arisen if there is no effect of treatment. In other words, they assume that the treatment (drug in this case) does not do anything (i.e., they assume that all results are from chance), then they see how likely it is that a result as great or greater than the observed result could be observed given the assumption that all results are due to chance. Sample size becomes important here, because the larger the sample size, the more confident we can be that a sample result reflects the true population value.

So, in our case, we got P = 0.263. What that means is that if the drug doesn’t do anything, there is still a 26.3% chance of getting a result as great or greater than ours. In other words, even if the drug doesn’t work, there is a really good chance that we’d get the type of difference we observed (10% and 20%). Thus, we cannot be confident that our results were not from chance variation, and we cannot confidently apply those percentages to the entire population.

Having said that, let’s see what happens if we increase the sample size. Imagine we have 1,000 people, but still get 10% for the treatment group and 20% for the control group. Now we get a highly significant result of P = 0.00001. In other words, if the drug doesn’t do anything, there is only a 0.001% chance of getting a difference as great or greater than the one we observed. Why? Well, quite simply, the larger the sample the more representative it is of the population, and the less likely we are to get spurious results. From this, we’d conclude that the drug probably does have an effect.
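To make the two P values above concrete, here is a minimal Python sketch of a chi-square test on a 2×2 table. It assumes a Yates continuity correction, which the post does not mention but which appears to reproduce the quoted values; the function name chi_square_2x2 and the table layout are illustrative choices, not the code that was actually used.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square test for the 2x2 table [[a, b], [c, d]].

    Rows are groups (control, treatment); columns are outcomes
    (developed the disease, did not). Returns (chi2, p_value).
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates continuity correction (an assumption, not stated in the post).
        diff = max(0.0, diff - n / 2)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the chi-square tail probability
    # equals erfc(sqrt(chi2 / 2)).
    return chi2, erfc(sqrt(chi2 / 2))


# 100 patients: 10 of 50 controls and 5 of 50 treated developed the disease.
chi2_small, p_small = chi_square_2x2(10, 40, 5, 45)

# 1,000 patients with the same 20% and 10% rates (500 per group).
chi2_large, p_large = chi_square_2x2(100, 400, 50, 450)

print(f"n = 100:   chi2 = {chi2_small:.3f}, P = {p_small:.3f}")
print(f"n = 1,000: chi2 = {chi2_large:.3f}, P = {p_large:.6f}")
```

The percentages are identical in both calls; only the sample size differs, and that alone moves the result from non-significant to highly significant.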

[Image: a jar of red and blue marbles labeled “population” and five randomly selected marbles labeled “sample”]

Another useful way to think about this is to imagine a jar full of red and blue marbles (this is your population). You want to know if there are more of one color than the other, so you reach in and randomly grab several (this is your sample). Suppose you get more blue than red, can you conclude that there are more blue marbles than red marbles in the jar (population)? This clearly depends on the size of your sample. The larger it is, the more confident you can be that your sample represents the population.

To try to illustrate all of this, imagine flipping a coin. You want to know if a coin is biased, so you flip it 10 times and get 4 heads and 6 tails. That is your sample: 40% heads. Now, is the coin biased? In other words, if you flipped the coin 10,000 times, would you expect, based on your sample, that you’d get roughly 40% heads? How confident are you that your sample result applies to the population? You probably aren’t very confident. We all intuitively know that it is entirely possible for even a totally fair coin to give 4 heads and 6 tails in a mere 10 flips. No one would scream, “but in your test the coin was biased!” We all realize that the sample may not be representative of the population.
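The intuition about 10 flips can be checked directly. The sketch below is a minimal exact binomial calculation (an assumption on my part, since no test is named for the 10-flip case); the function name prob_as_extreme is hypothetical. It adds up the probability of every outcome that is at least as far from an even split as the one observed.

```python
from math import comb


def prob_as_extreme(flips, heads, p_heads=0.5):
    """Probability that a coin with P(heads) = p_heads gives a head count
    at least as far from the expected count as the one observed."""
    expected = flips * p_heads
    observed_gap = abs(heads - expected)
    total = 0.0
    for k in range(flips + 1):
        # Small tolerance guards against floating-point rounding in the gap.
        if abs(k - expected) >= observed_gap - 1e-9:
            total += comb(flips, k) * p_heads ** k * (1 - p_heads) ** (flips - k)
    return total


p_fair = prob_as_extreme(10, 4)
print(f"P(split at least as uneven as 4/6 in 10 fair flips) = {p_fair:.3f}")
```

A fair coin gives a split at least this uneven in roughly three out of four 10-flip experiments, which is why 4 heads and 6 tails says almost nothing about bias.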

Now, however, imagine that you flip it 100 times and get 40 heads and 60 tails. This is your new sample. Now how confident are you? Probably more confident than before, but also probably not that confident. Again, we all realize that there is chance variation. Indeed, if we ran the actual stats on this, we’d get P = 0.2008. In other words, this test says, “assuming that the coin is not biased, there is a 20.08% chance of getting a difference as great or greater than a 40%/60% split.” But what if we did 1,000 flips and got 400 heads and 600 tails? Now, we’d probably think that the coin truly was biased. At that point, we’d expect that our sample probably does apply to the population and that continuing to flip the coin will continue to yield percentages of roughly 40% and 60%. If we actually run the stats on that, our conclusion would be justified. The P value is less than 0.00001, meaning that if the coin is not biased, there is less than a 0.001% chance of getting a result as great or greater than ours. This would be good evidence that the coin itself (the population) is likely biased, and our results are unlikely to be from chance variation in our sample.

That is, in a nutshell, what statistical tests (at least frequentist statistical tests) are doing, and we only consider something to be “statistically significant” when the probability that a result like it (or greater) could arise (given the assumption that there is no effect of treatment) is below some pre-defined threshold. In most fields, that threshold is P = 0.05. In other words, we only conclude that the sample results apply to the population if there is less than a 5% chance of getting a result as great or greater than the one we observed if the thing being tested actually has no effect.

Note: An important topic not covered here is confidence intervals. These show you the range of possible population values for a given sample value. So, for example, if you had a mean of 20 and a 95% confidence interval of 10-30, that would mean that you can be 95% sure that the population mean is between 10 and 30.

Note: This post was revised to change statements that a P value showed the probability that a result as great or greater than the one that was observed could arise by chance to statements that the P value showed the probability that a result as great or greater than the one that was observed could arise if there is no effect of the treatment.


Increased testing does not explain the increase in US COVID cases

The US is experiencing another sharp increase in COVID19 cases. This is a simple fact, but as always seems to be the case in today’s world, this fact is being treated as an opinion. Countless people (including prominent politicians and even the president) are claiming that cases are not actually increasing, and the apparent increase is simply the result of increased testing. This claim is dangerous and untrue, but it also offers a good opportunity to teach some lessons in data analysis. Obviously, an increase in testing will result in an increase in the number of cases that are documented, that much is true, but that doesn’t necessarily mean that the entirety of the increase is from increased testing. So how can we tell whether the true number of cases is increasing? There are multiple ways to examine this, and I’m going to walk through several of them and try to explain the stats in a non-technical way so that everyone can really grasp these concepts.

To begin with, I’m not actually going to talk about coronavirus. That topic has, unfortunately, become such a political battleground (even though it should be entirely scientific) that it is difficult to get people to think clearly and without bias about it. So instead, let’s start by talking about Willy Wonka’s chocolate factory. Like most chocolate factories, they sometimes get insects in their chocolate bars and they test subsets of them to see how often this occurs. This situation is analogous to testing for a disease, and the math is the same, so let’s use it as an example to understand the math, then we’ll apply that understanding to coronavirus.

For the sake of example, let’s say that Wonka produces 10,000 chocolate bars a day, and examines 2,000 of them for the presence of insects (these are the tests). Further, as you might have guessed, his chocolate factory has rather lax hygiene standards, so out of those 10,000 bars, 1,000 actually have insects. How many do we expect to have insects (i.e., be positive cases) in the sample of 2,000 tests? This is easy to calculate. 1,000 is 10% of 10,000, so we expect 10% of the tests to be positive. Thus, out of 2,000 tests, we expect to get 200 bars with insects (i.e., documented cases; note that I am acting as if testing is random to make the math easy for all to follow; this is a simplification, but doesn’t actually change the point; see note at the end).

Now, suppose that Wonka increases the testing and gets higher numbers of positives (more cases). What does that mean? It could simply mean that the number of bars with insects is unchanged, but more are found due to more testing. However, it is also possible that testing and the true number of bars with insects are both increasing. How can we tell which is occurring?

Figure 1: Changes in the percent of tests that are positive under different scenarios. For each line, testing increases by 10% of its starting value each day, but the number of actual cases (not observed cases) varies, and the lines show the percent of tests that were positive. Blue lines show a decrease in actual cases over time, the grey line shows no change in actual cases, and the red lines show an increase in actual cases. As you can see, anytime that the total number of cases increases, the percent of tests that are positive will increase, whereas if the total number of cases is unchanged or decreases, the percent of positives will either remain stable or decrease, even if testing increases.

The answer lies in the percentage of tests that are positive. If the actual number of bars with insects is unchanged, and the increase in positives is simply due to increased testing, then the percent of tests that are positive will remain constant even though the total number of positive tests goes up (Figure 1). Think about the math from earlier. 10% of bars have insects. So, we expect roughly 10% of tests to be positive, regardless of how many tests we do (though the percentage will be more accurate with a larger sample size). So, if we do 2,000 tests, we expect 200 bars with insects (10% positive). If we do 4,000 tests, we expect 400 bars with insects (10% positive). If we do 6,000 tests, we expect 600 bars with insects (10% positive), etc. The total number of detected bars with insects (cases) increases as testing increases, but the percentage of those tests that are positive remains the same. As another example, imagine that you have a bag with 500 blue marbles and 500 red marbles. You reach into the bag and grab a handful. You expect to get roughly 50% of each color regardless of how many you grab (though you expect the value to be closer to 50% [more accurate] as sample size increases). It’s the same with testing.

So, if the increase is entirely from testing, the percent of tests that are positive should be unchanged, but what happens if the number of bars with insects is actually decreasing, while testing is increasing? Well, the total number of positive test results may either go up or down (depending on the sizes of the decrease in insects and increase in testing), but the percentage of tests that are positive will always go down (Figure 1). Going back to the example, we expect 10% of tests to be positive when 1,000 out of 10,000 bars actually have insects and 2,000 tests are conducted. Now, suppose that the number of bars with insects is cut in half (500) and testing is tripled (6,000). Now, we expect only 5% of tests to be positive, but 5% of 6,000 is 300. So, while the total number of observed positive cases increased, the percent of tests that were positive decreased. This tells us that the actual number of bars with insects is decreasing, despite the increase in testing.

Conversely, if more bars actually have insects, we expect a higher percentage of tests to be positive, even if the level of testing increases. Imagine, for example, that the number of bars with insects increases to 2,000 out of 10,000, while the number of tests also doubles (4,000). Now, we expect 20% of tests to be positive, resulting in 800 cases. See how that works?
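The arithmetic behind these three scenarios can be sketched in a few lines of Python. This assumes random testing, as the example does; the function name expected_positives and the scenario labels are hypothetical, while the numbers are the ones used above.

```python
def expected_positives(bars_with_insects, total_bars, tests):
    """Expected positive tests when bars are tested at random.

    Returns (percent_positive, expected_count).
    """
    percent_positive = 100 * bars_with_insects / total_bars
    expected_count = tests * bars_with_insects / total_bars
    return percent_positive, expected_count


# (bars with insects, total bars, tests conducted)
scenarios = {
    "baseline": (1_000, 10_000, 2_000),
    "fewer insects, more tests": (500, 10_000, 6_000),
    "more insects, more tests": (2_000, 10_000, 4_000),
}

for label, (infested, total, tests) in scenarios.items():
    pct, count = expected_positives(infested, total, tests)
    print(f"{label}: {pct:.0f}% positive, about {count:.0f} positives")
```

In the second scenario the count of positives rises from 200 to 300 even though the percent positive falls from 10% to 5%, which is why the percentage, not the raw count, is the quantity to watch.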

I have illustrated all of these patterns in Figure 1, showing the hypothetical situation I have been describing with changes in testing and, sometimes, changes in the actual number of bars with insects over a 20-day period. Each line shows the percent of tests that were positive. The grey line shows the situation where testing increases but the actual number of bars with insects (cases) does not, the blue lines show increased testing with a decrease in the actual number of cases, and red lines show increased testing coupled with an increase in the actual number of cases. As you can hopefully see, the only way to get a decreasing percentage of positive tests is if the actual number of cases (not simply the number of documented cases) decreases, and any time that the actual number of cases increases, the percent of tests that are positive will also increase. This percentage of positive tests is key for understanding what is actually happening.

Figure 2: Percent of coronavirus tests that were positive for June. The first panel shows the data for the whole country, and the second shows two states with large outbreaks (Florida and Arizona). They are presented in separate panels simply so that the change for the whole country is not obscured by the much larger change for individual states. Data were downloaded from the Covid Tracking Project late on 28-June-20.

Now, with all of that in mind, let’s look at coronavirus in the US. If the situation is truly improving and the actual number of cases is truly decreasing and the apparent recent increase in cases is just a result of increased testing, as many argue, then we should see that the percent of tests that are positive has continued to decrease. That is not, however, what we see. It was decreasing for a while, but if we look at June (when things have been opening back up and when the spike in cases occurred) we see a statistically significant (P < 0.0001) increase in the percentage of tests that are positive (Figure 2). In other words, the increase in tests simply cannot explain the entirety of the increase in cases. It probably is a contributing factor, but the actual number of coronavirus cases in the US is going up rapidly. That is a fact. To be clear, exactly what is happening varies by state, and some states are experiencing decreases in the rates of positive tests, but many others are experiencing sharp increases, particularly in states like Florida and Arizona (Figure 2). They are very much experiencing viral outbreaks (Johns Hopkins has some very nice data and graphs for state data that I recommend looking at).

There is another really useful way to examine this, which is to look at the percent change for number of tests and number of observed cases (positive tests). Sticking with the chocolate bar example and using the data presented in Figure 1, we find that when testing increased by 100 tests each day, but the actual number of cases remained constant, the number of tests increased by 145% over time and the number of positive tests per day (cases) increased by 145%. This is what we expect if the actual number of bars with insects is constant, but the testing increases: the percent difference should be the same for both the number of tests and the number of observed cases (positive tests). When testing increased by 100 tests a day and the actual number of bars with insects increased by 1% of the original level each day, however, the percent difference in tests was still 145%, but the number of positive tests (cases) increased by 216%, and when actual cases increased by 5% of the original level each day, the number of positive tests increased by 500%! Do you see how that works? If the increase is entirely from increased testing (while the actual number of cases remains the same), then the increase in tests and the increase in observed cases will match. In contrast, if actual cases are also increasing, then the increase in positive tests will outpace the increase in testing.
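The comparison of percent changes can be sketched as follows. The numbers are hypothetical and reuse the 2,000 and 4,000 test counts and the 10% and 20% positive rates from the earlier examples, not the Figure 1 series (which is not reproduced here); the function names are illustrative.

```python
def percent_change(old, new):
    """Percent change from old to new."""
    return 100 * (new - old) / old


def positives(tests, percent_positive):
    """Expected positive tests for a given percent positive."""
    return tests * percent_positive / 100


tests_before, tests_after = 2_000, 4_000  # testing doubles
cases_before = positives(tests_before, 10)

# Scenario A: actual prevalence stays at 10%.
cases_flat = positives(tests_after, 10)
# Scenario B: actual prevalence rises to 20%.
cases_rising = positives(tests_after, 20)

change_tests = percent_change(tests_before, tests_after)
change_flat = percent_change(cases_before, cases_flat)
change_rising = percent_change(cases_before, cases_rising)

print(f"tests: +{change_tests:.0f}%")
print(f"cases, constant prevalence: +{change_flat:.0f}%")
print(f"cases, rising prevalence:   +{change_rising:.0f}%")
```

When prevalence is constant, cases and tests grow by the same percentage; when prevalence rises, the growth in cases outpaces the growth in tests.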

So, what do we find for coronavirus in the US? Well, if we compare the last 7 days of May (7-day average) to the past 7 days of June (with the 28th being the most recent date based on when I downloaded the data), we find that the number of tests increased by 40.5%, while observed cases increased by 83.0%! In other words, the increase in cases substantially outpaces the increase in testing, clearly indicating that we are actually experiencing a real increase in coronavirus cases, not simply an increase in known cases due to increased testing. The situation is even more dire when you start looking at states where the largest outbreaks are occurring. In Arizona, for example, again comparing the last 7 days of May to the past 7 days of June, we find testing increased by 116.9%, but daily new cases increased by 498.2%. Florida is a similar story. Testing has increased by 88.3%, but daily new cases have increased by an astounding 726.7%! This is undeniably an outbreak.

Indeed, you can get a sense for these general trends just by looking at a comparison of testing rates and numbers of new cases over time (Figure 3). As you can see, at first, testing lagged well behind cases as we experienced the initial outbreak. Then, cases started declining, even though the number of tests continued a steady increase. It is only in the past few weeks (i.e., since social distancing restrictions, closures, etc. have been lifted) that we see a spike in cases. Further, the recent spike in cases does not correspond to a spike in testing. Testing has been increasing at a steady rate, whereas cases suddenly shifted from a steady decrease to an exponential increase. In other words, the number of observed cases does not track well with the number of tests. If the current increase in cases was really a result of increased testing, then new cases should have been tracking with testing all along. They should have continued to increase after March, because testing increased. That’s not at all what we see, however. Again, testing simply can’t explain the trends. That doesn’t mean that there is no impact of testing, obviously there is, but it is clearly not the key thing driving trends.

Figure 3: Coronavirus testing and cases for the USA. As you can see, cases are a poor match for testing, indicating that testing alone does not explain the recent increase in cases. The x-axis labels show the start of each month. Data were downloaded from the Covid Tracking Project late on 28-June-20.

Yet more evidence comes from hospitalization rates. The “it’s just more testing” argument relies on the notion of many asymptomatic people (or at least people with very mild cases) that have only been detected recently due to increased testing. If that was the case, then hospitalization rates should be remaining level or going down (if the virus is truly going away), yet many states are experiencing increased hospitalization rates, with the Texas Medical Center (an enormous complex) hitting 100% capacity for its ICU. That simply cannot be explained as a result of increased testing.

Fortunately, deaths have not started spiking yet. There are several reasons for this. One is that, this time, more young people are getting the disease. Another is simply that death rates inevitably lag behind infection rates, and it is very likely that death rates will increase in the coming weeks (though many experts are hopeful that we will be able to avoid the type of enormous spike we saw a few months ago).

In short, an actual examination of the data clearly and unequivocally shows that the current increase in coronavirus cases in the US cannot be explained simply as a result of increased testing. The percent of tests that are positive is increasing, which is a clear indication that the actual number of cases is increasing. Further, in states like Arizona and Florida, the numbers are truly shocking, with the increases in new cases massively outpacing the increases in testing. We are clearly still in the middle of a deadly outbreak, and it is getting worse. This isn’t a liberal conspiracy to undermine Donald Trump; it is a fact, and facts don’t change based on your political party.

Note: Please refrain from political comments. This post is about science and evidence and comments should likewise be about science and evidence (see Comment Rules).

Note: someone might object that my examples assume random testing, while testing is actually somewhat targeted, and people who are symptomatic or are known to have been in contact with someone who is infected are more likely to be tested. This fact is true, but actually doesn’t substantially change anything I’ve said. It does affect the exact percentages but doesn’t change my point about the trends. It is still true that the only way to get an increasing percentage of positive tests while the testing rate is increasing is for the actual number of total cases to be increasing (technically, this could also happen if we learned to do a much better job at targeting our tests, but there is no indication of this that I have seen; certainly not enough to cause the numbers we are seeing, and it still would not explain the increases in hospitalization rates).
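The random-testing assumption mentioned in this note can be made concrete with a small sketch. It is only an illustration: the population size, test counts, and infection counts are hypothetical numbers chosen for easy arithmetic, not real data, and the function names are mine. It assumes tests are given to random people, so the expected share of positive tests equals the share of the population that is infected.

```python
# Illustrative sketch only: every number below is hypothetical, not real data.
# Assumption: tests are given to random people in the population.


def expected_positives(tests: int, infected: int, population: int) -> float:
    """Expected number of positive results under random testing."""
    return tests * infected / population


def positivity(positives: float, tests: int) -> float:
    """Share of administered tests that came back positive."""
    return positives / tests


POPULATION = 1_000_000  # hypothetical population size

# Scenario A: testing doubles while the number of infected people stays the same.
a_before = positivity(expected_positives(10_000, 20_000, POPULATION), 10_000)
a_after = positivity(expected_positives(20_000, 20_000, POPULATION), 20_000)

# Scenario B: testing doubles and the number of infected people also doubles.
b_after = positivity(expected_positives(20_000, 40_000, POPULATION), 20_000)

print(f"A (more tests only):        {a_before:.1%} -> {a_after:.1%}")
print(f"B (more tests, more cases): {a_before:.1%} -> {b_after:.1%}")
```

In scenario A, the number of positive tests doubles, but the percent positive stays at 2.0%. Only in scenario B, where the actual number of infections goes up, does the percent positive rise (to 4.0%). Targeted testing would change the exact percentages, as noted above, but not the direction of that comparison.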

Data source: The data I presented here were downloaded from the Covid Tracking Project late on 28-June-20.


Science is a path to knowledge

There are a lot of misconceptions about what science actually is, and, as a result, there are a lot of incorrect conclusions about the reliability and utility of science. I frequently encounter people who expect science to give absolute answers. They act as though science is a method for proving what is true with 100% certainty. As a result, they view cases where science led to an incorrect conclusion as evidence that science itself is flawed. You can clearly see this in arguments that a current scientific result doesn’t need to be accepted because “science has been wrong before” or “there used to be a scientific consensus that the earth was flat” (there wasn’t, but that’s another topic), etc. Similarly, there is a false view that a scientific conclusion is either 100% right or 100% wrong. In reality, science is a path to knowledge. It is a way of testing ideas and slowly building a body of knowledge based on the results of those tests. Sometimes, the path takes wrong turns, but unlike every other path to knowledge that has ever been invented, science is systematic and self-correcting and steers itself back in the correct direction, resulting in a gradual accumulation of knowledge.

Before I go any further, I want to acknowledge that this description of science as a “path to knowledge” is not original to me; it was coined by my friend and fellow skeptic, The Credible Hulk. So, go check out their blog and Facebook page for more great science content.

I really love this description of science as a “path to knowledge” because it beautifully encapsulates what science is and why it works. You see, science does not give absolute results. In other words, it does not “prove” anything with utter certainty. Rather, science is all about probabilities. As I often like to say, science simply shows us what is most likely to be true given the current evidence. That probability can, however, always change with future evidence. Any scientific result can be overturned as new evidence comes to light.

The tentative nature of a scientific result is one of its great strengths, but it can lead to confusion. People often make the incorrect leap from, “science does not give definitive answers” to “science is uncertain and therefore I don’t have to accept a given result.” This is a flawed way of understanding science. Remember, it is a way of telling us what is most likely true given the current evidence. Therefore, its results should be accepted until such time as future evidence arises to discredit those results. Sticking with our path analogy, a lack of 100% certainty that a path is going the right direction would not justify abandoning the path altogether and wandering aimlessly through the forest. Further, a lack of 100% certainty does not mean that we cannot be highly confident in a result. There are some things that have been so thoroughly tested so many times in so many ways that it is extraordinarily unlikely that they are wrong. In other words, some paths are marked well enough that you can be really confident in them.

On the other end of the spectrum, people ignore the tentative nature of scientific conclusions and act as though science should give definitive answers, leading to the flawed arguments about science having been wrong in the past. These arguments are problematic in a number of important ways. First, they treat the inherently self-correcting nature of science as if it is a bad thing, when in fact, it is another great strength of science. Really think about this. If you are going to argue that, “I don’t have to accept a scientific result because scientists used to think the sun moved around the earth,” my question would be, “why do we no longer think that the sun moves around the earth?” The answer is very clearly that other scientists continued conducting tests and discredited the previous view. Science corrected itself. This is not a weakness, but rather a strength. No other path to knowledge does this. No other system of understanding repeatedly and systematically tests its conclusions and updates its information by rejecting debunked results and accepting new results.

Further, because of the way that science advances, the argument that “science has been wrong before” is inherently self-defeating. Sticking with the orbit of the earth for a minute, we only know that the earth orbits the sun because science debunked the notion that the sun orbits the earth, so you can’t use that as an argument that science doesn’t work, because the argument inherently includes the premise that science works! In other words, if this argument gives us carte blanche to disregard scientific results, then why should we accept the result that the earth moves around the sun? That result was produced by science, and this argument claims that we don’t have to accept scientific results, so why should we accept the result that the earth moves around the sun? We only know that science was wrong before because of science. Again, this self-correction is one of the best things about science.

Additionally, it is important to realize that scientific results are often incomplete more than actually wrong, and there are degrees of wrongness. The progression of physics is a great example of this that I use frequently. Newton made enormous strides in physics. He moved us far along the path, but we later found out that he was slightly off course. Einstein showed that Newton’s work was incomplete and his conclusions did not apply universally. However, that didn’t mean that we threw Newton out the window and went all the way back to the trail marker Newton started at. Newton moved us closer to the truth, and Newtonian physics is still taught and applied all around the world, but he was incomplete, and Einstein took Newton’s results and shifted us back on track. Think of it like this: we needed to go north, and Newton took us slightly northwest. He still moved us much closer to our goal, but we needed Einstein to reorient us and get us back on track.

This gradual accumulation of knowledge is another key aspect of science. Yes, science sometimes makes mistakes, but because it corrects those mistakes, we gradually get closer and closer to the truth. People who thought the sun revolved around the earth were less wrong than people who thought the sun was a god. Galileo was less wrong than the people who thought the sun moved around the earth. Newton was less wrong than Galileo. Einstein was less wrong than Newton, etc. At each step, we got closer and closer to the truth. This is also another reason why it is so absurd to blindly disregard modern scientific results on the basis that science has been wrong before. Science is a gradual accumulation of knowledge, and although there certainly are things about which we are wrong today, we are less wrong than previous generations, and we know this because we tested the views of previous generations and built on that knowledge.

To give another example, there are certainly things about which modern medicine is wrong. That is inevitable due to the tentative and probabilistic nature of science, but modern medicine is less wrong than medicine was 20 years ago, and medicine 20 years ago was less wrong than medicine 40 years ago, and medicine 40 years ago was less wrong than medicine 60 years ago, etc. Further, I can demonstrate this extremely easily. Imagine you need a major medical intervention and you can be treated using the technology and knowledge from any of the following time points: 200 years ago, 100 years ago, 50 years ago, 25 years ago, or current. Rank your choices from least to most preferred. I’m willing to bet your ranking ran in chronological order, from oldest to newest, with your top preference being treatment via our current knowledge, and there is a very good reason why that is the correct way to rank things. Namely, science works! It’s not perfect, but it is a path that moves generally in the right direction, and we all intuitively realize that science has helped us progress and, thanks to science, we know more than any generation before us knew.

Further, we can extend my medical analogy to just about any field of science. Imagine that you are on a game show run by omnipotent aliens with a perfect knowledge of the universe. They ask you a chemistry question, and you have a lifeline that will let you call a random chemist from the current year, or from 25 years ago, or from 50 years ago, etc. Whom do you call? Obviously, you call the chemist from the current year. Again, we all intuitively accept that science works and gradually builds knowledge. Even those who like to argue that “science has been wrong before” must admit that, thanks to science, we know more now than at any other point in our history. Science has a proven track record of moving us in the right direction.

Finally, if you are not convinced by anything I’ve said thus far, then my question for you is simply, “what’s the alternative?” Really think about this. What other path to knowledge can compete with science? As I’ve explained before, science is responsible for our modern society. All of the technological and medical marvels around you are the result of gradually testing ideas and accumulating knowledge. Look at all the previously fatal diseases that we can now cure or even prevent, look at the decreases in mortality rates, etc. All of that is because of science. So why should we go back to unsystematic guesswork? We tried other systems (like relying on anecdotes) for millennia, and they didn’t work. It was science that brought us out of the dark ages, and it is science that will allow us to continue our advancement as a species. Again, that doesn’t make science perfect or infallible. It simply shows us what is most likely true given the current evidence, but by constantly testing, by constantly self-correcting, by constantly updating, it gradually moves us closer and closer to the truth. It’s not perfect, and it certainly isn’t a straight path, but it’s the best path to knowledge that we have.

Note: To anyone who is about to reply with a snarky remark about doctors/scientists saying that smoking is safe, please read this post. The reality is that there was never a scientific consensus that smoking was safe and, in fact, science had shown that it caused cancer all the way back in the 1930s. Indeed, actual studies consistently showed that it was dangerous. Tobacco companies simply did a good job of creating the illusion that science was on their side; meanwhile, actual science was continuing along the correct path.


The problem with “just asking questions”

Asking questions is generally a good thing. Indeed, questions are the very foundation of science. People become scientists because they are curious and like to ask questions, and science itself is simply a systematic method for asking and answering questions. Unfortunately, the positive perception of questions often leads to people using questions as a disguise for wilful ignorance, and the phrase, “just asking questions” has been used to justify all manner of insane and illogical beliefs. The people who use this phrase are generally not actually asking questions. Rather, they are phrasing a belief as a question in an intellectually dishonest attempt to maintain the appearance of intelligence.

There are two major problems that I am going to discuss. The first is simply that not all questions are good. I fundamentally disagree with the notion that there is no such thing as a stupid question. Good questions stem naturally from known facts and evidence. In other words, they have a basis in reality. Bad questions, however, are not based on facts or evidence and instead rely on wild conjecture. Indeed, in science, hypotheses do not spring out of nowhere. Rather, they are based on the existing evidence.

Let me give an example. In my field (herpetology) there has been a fair amount of debate and discussion about the purpose of basking behavior in turtles (i.e., why do aquatic turtles come out of the water and bask on rocks and logs?). There have been many hypotheses/questions that people have looked at. For example, is it for thermoregulation (temperature)? Does it help immune functions? Does it remove parasites? Etc. All of these are good questions. They are perfectly rational things to wonder about based on our existing knowledge of biology.

Now, however, imagine a scientist asked, “Are they basking to avoid aliens that live in the water?” That would be a bad question, because it’s not based on any known facts. There is no reason to think that aliens are involved, and we’d need good evidence of the presence of aliens before it would be rational to even consider the possibility that they are involved. If a scientist asked that question at a conference, they would be laughed out of the room, and they absolutely could not justify it by saying, “I’m just asking questions. Aren’t you scientists supposed to be open-minded?” Yes, scientists should be open-minded, but being open-minded means being willing to accept new ideas when presented with evidence for them. It does not mean being willing to accept or even consider the possibility of aliens influencing turtle behavior despite a lack of evidence that aliens are living in our aquatic ecosystems. Do you see the point? You can’t just say something insane that has no evidence to support it and justify it as, “just a question.” There needs to be some reasoning behind the question. There needs to be some actual evidence to make the question worth pursuing in the first place.

If we apply that to current events, questions like, “where did coronavirus come from?” are fine. That’s a totally reasonable thing to ask. Even asking “is coronavirus man-made?” was not entirely unreasonable at first (see below), because there is a very real possibility of people bio-engineering viruses. However, a question like, “did Bill Gates invent coronavirus so that he could microchip everyone?” is not a good question. That is a stupid question, because there is utterly no evidence to suggest that Gates either engineered the virus or is trying to microchip people. The question, “Did Bill Murray engineer coronavirus because he enjoyed being in Zombieland and wanted to try an apocalypse in real life?” is just as valid, by which I mean, just as stupid. The fact that something is phrased as a question does not make it rational.

The second major problem with people “just asking questions” is that those questions are rarely good-faith questions being asked out of honest curiosity. Rather, they are often statements of belief that are being disguised as questions. Many (if not most) of the people asking things like, “did Bill Gates make coronavirus?” don’t actually want the answer. Rather, they are confident that they know the answer, and that’s a problem.

Asking questions is only a good idea if you are willing to accept the answers to those questions. In other words, asking a question like, “is coronavirus man-made?” is fine if it is being asked out of a genuine sense of curiosity and desire for knowledge. There is nothing wrong with asking that question if you are then willing to look at the evidence and accept the answer provided by that evidence (in this case, the answer is a clear, “no, it was not man-made”). The problem is that many people asking the question won’t accept that answer. They refuse to accept the evidence, but also don’t want to admit that they are denying evidence. So, instead, they claim to be “just asking questions.”

To be clear, I don’t think most people are deliberately using the phrase “just asking questions” because they know that they are denying evidence and don’t want to look foolish. Rather, this is simply one of many cognitive traps that people fall into. Most of the people who go around justifying nonsense by saying that they are “just asking questions” probably truly think that they are being rational and are simply asking good questions. So, the point of this post is really to act as a warning. Be conscious of your views and biases, and if you find yourself “just asking questions,” stop and ask yourself, “why am I asking this? Is there actual evidence to suggest that this is a good question?” Then, if you think that it is a good question, actually look at the evidence. If you aren’t willing to look at the evidence, then you are stating a belief, not a question. Once you’ve been shown the facts, it is no longer rational to keep asking the same question. Once you’ve been given the answer, your choices are either to accept it or deny it. You cannot claim to be rationally asking questions if you’ve already been given the answer to your questions and simply refuse to accept it.

Finally, it is worth explicitly stating that when I say to look at the evidence, I mean actual evidence from reputable sources. YouTube videos, conspiracy websites, outlets on either extreme of the political spectrum, someone you know on Facebook, a cherry-picked expert, etc. do not count. To quote Will Turner, “that’s not good enough.” In science, your evidence needs to come from the peer-reviewed literature, and you need to look at the entire body of literature, rather than cherry-picking, and for topics like politics and current events, you should get your information from multiple reputable news outlets. Don’t accept the first source you come across. Rather, cross-reference it using multiple other sources and see if they all say the same thing (the Media Bias Chart is a very useful tool for seeing if the sources you are using are neutral and reliable).

My point with all of this is simple. You should ask questions. You should think critically and evaluate what you are told, but your questions need to be based on known facts, and they need to be good-faith questions that are asked out of an honest curiosity. You must be willing to answer them by actually looking at evidence from reputable sources and accepting facts.
