Information Literacy Friday: Statistics is Hard

“Undeniable math”, “Statistical impossibilities”: these phrases do not mean what the non-mathematician-authored articles you may have seen them in think they mean.

One of the bigger problems I see in reporting, even from otherwise-excellent reporters in generally-reliable sources, is bad application or outright misunderstanding of math, particularly statistics. Add some bias on the part of the author—particularly where people or politics are involved—or a tight deadline, and it only gets worse. 

So, the information literacy lesson is: if you see someone making claims with math, and you personally can’t do all the necessary math to verify those claims, you should find someone trustworthy who can, or take the claims with a healthy dose of skepticism. There are several risks here:

  1. they might not understand the math, themselves
  2. they might be able to understand it, but didn’t fact-check in this case
  3. they might /think/ they understand it, but be wrong
  4. they might explain it badly (even if they understand it)
  5. they might understand the raw math, but be misapplying it to this situation

…and there are probably other problems that can show up. And note that none of these problems assumes the writer is trying to deceive you. But badly-understood or badly-presented math can certainly play a part when someone does want to mislead you. 

This doesn’t mean that you can never accept math as proof if you don’t have a master’s degree in statistical analysis. It just means you need to be cautious, double-check and cross-check claims like these, and seek out people who are experts—which the article authors generally aren’t, unless you’re reading a math or science journal.

This is particularly important when applying math to reality. People like to say that “numbers don’t lie” or that math is unbiased. The raw math may be, but how it’s applied? That can absolutely be biased. Intentionally or unintentionally. 

A good test when reading something, if you can’t verify or understand the math yourself, is: does the rest of the article hold up without the math? What about if the math is wrong? Would it still be a convincing and well-supported argument even if every bit of the math in it turns out to be irrelevant to the topic? If the article makes its points in other ways, that’s a good start. But if the claims fall apart if the math is all wrong, then it’s important to make sure the math is right. 

BTW, in a perfect world, you’d find an expert on a topic before you need them. That’s why I’m confident linking to a couple of videos by Matt Parker: not only can he explain the math at a level that I can understand, but I know him from non-controversial topics and he has been demonstrating his mathematical expertise for years. Most of us, though, aren’t reference librarians or journalists with a specific subject beat, so we won’t “just happen to” know someone with expertise and a proven track record on a random topic before that topic is in the news. But someone with broadly applicable expertise, like statistics or other math, is worth being aware of.

Examples

So, some real-world examples. All of these are from actual reporting, in some cases widespread reporting. I mostly haven’t provided links because I don’t want to drive additional traffic to bad reporting. While some of my analogies are hypothetical, none of the reporting I’m citing is—no strawman arguments here.

[Not] Applying Bayes’ Theorem

I saw an article a few months back about Covid-19 infection rates in the US. The article asserted that the tests being used for Covid-19 had accuracy rates of 96% or better [this was not true at the time, nor did the source they cited actually say that], and that therefore the error rate of the tests couldn’t account for the alleged “overcount” [still unsubstantiated] of Covid-19 cases—that it had to be either due to something else, or due to deliberate deceit. At first glance, that looks obvious: a 96%-accurate test should give results that are off by no more than 4%, right?

Wrong. If 10% of the tested population actually has Covid-19, a consistent 96% accuracy rate would result in a roughly 30% overreporting of the infection rate. Because the math for what happens when you test thousands (or millions) of people doesn’t work the way that most people think it does—it’s highly dependent on what portion of the population actually has whatever you’re testing for. (And if 80% of the population is actually infected, then that same test is likely to produce a slight undercount.) What looks like a very narrow error margin at the individual level can lead to a huge error margin for the whole population, when you don’t know the true prevalence of a disease. Meaning that there is no fraud or deceit needed to account for the reported numbers being off from the true (and unknown) numbers by significant amounts—it’s simply that the math can’t be more precise than that. 
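
Here’s a minimal sketch of that arithmetic in Python, assuming (as the article implicitly did) a single accuracy figure that covers both false positives and false negatives; the 10% and 80% prevalence values are just the illustrative cases from the paragraph above:

```python
# Apparent infection rate from an imperfect test, for a given true prevalence.
# Assumes one "accuracy" figure covers both error directions, as the article implied.

def apparent_positive_rate(prevalence: float, accuracy: float) -> float:
    true_positives = prevalence * accuracy                # infected, correctly flagged
    false_positives = (1 - prevalence) * (1 - accuracy)   # healthy, wrongly flagged
    return true_positives + false_positives

for prevalence in (0.10, 0.80):
    apparent = apparent_positive_rate(prevalence, accuracy=0.96)
    error = (apparent - prevalence) / prevalence
    print(f"true prevalence {prevalence:.0%}: test reports {apparent:.1%} "
          f"({error:+.0%} relative to reality)")

# true prevalence 10%: test reports 13.2% (+32% relative to reality)
# true prevalence 80%: test reports 77.6% (-3% relative to reality)
```

If you want the general version of this calculation, the terms to search for are “base rate fallacy” and “Bayes’ theorem”.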

In this case, I think the author simply didn’t understand the math, and didn’t even know there was math they should have been applying but weren’t. So in one sense this was a good-faith error, but at the same time, it is absolutely a breach of journalistic standards not to make sure you’re reporting accurately (probably a case of motivated reasoning: the article in question made 11 claims: this math error, 3 misunderstandings of medicine, 5 with no evidence, 1 off-topic, and 1 that might be true but at the time lacked evidence). But whether due to ignorance or deception, the result is the same: what the author claims is a strong argument against something actually turns out, if you do the math, to be a modest argument in favor of that exact thing—and lots of readers (maybe including you?) wouldn’t know this. 

False Negatives and False Positives

That same article also gives an example of how not understanding the context for math can lead to misleading statements: medical tests have both a “sensitivity”—how likely the test is to correctly flag someone who actually has the disease—and a “specificity”—how likely it is to correctly clear someone who doesn’t. In many cases, those numbers are very different—a test might rarely give a false positive but fairly often miss real cases (a PCR test), or vice versa (an antibody test). The author of that article clearly didn’t know this, and so didn’t even realize that they had misread one of their sources. In this case, their analysis was far enough off, and wasn’t using real data anyway, that it was a moot point. But if they’d had accurate data and misused it in the same way, the article would likely still have misrepresented whether the data was suspicious or not.
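
To make the distinction concrete, here’s a small sketch with made-up numbers (not measurements of any real test) showing that sensitivity and specificity are fixed properties of the test, while the chance that any individual result is real also depends on how common the disease is:

```python
# Sensitivity/specificity are properties of a test; the chance that an
# individual positive (or negative) result is real also depends on prevalence.
# Numbers below are illustrative only, not measurements of any real test.

def predictive_values(sensitivity, specificity, prevalence):
    tp = prevalence * sensitivity              # sick, test positive
    fn = prevalence * (1 - sensitivity)        # sick, test negative
    tn = (1 - prevalence) * specificity        # healthy, test negative
    fp = (1 - prevalence) * (1 - specificity)  # healthy, test positive
    ppv = tp / (tp + fp)   # chance a positive result is real
    npv = tn / (tn + fn)   # chance a negative result is real
    return ppv, npv

for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(sensitivity=0.90, specificity=0.96,
                                 prevalence=prevalence)
    print(f"prevalence {prevalence:.0%}: positive result real {ppv:.0%}, "
          f"negative result real {npv:.0%}")
```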

But that’s specific to that situation. The broader takeaway, as a consumer of information, is: do you know whether or not the author is applying the math correctly? Have they explained it clearly and, if it’s non-obvious or complex or controversial, explained why that’s how the math works or why it’s the right math to apply to this situation? Have they cited a source (online, it really should be a link)? 

You Keep Using that Word…

  • Pouring 100 coins out onto a table and having them all land heads-up: wow, is that improbable! But it’s not impossible.
  • Pouring 100 coins out onto a table and having 150 of them land heads-up: that is a statistical impossibility.

And if someone glued 100 pennies to a table heads-up, the “odds” of you finding 100 heads when you look at those pennies have nothing to do with random distributions.
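
And for scale, the “improbable” case in the first bullet above is easy to quantify, assuming fair coins and independent flips:

```python
# Probability that 100 fair, independent coins all land heads-up.
p_all_heads = 0.5 ** 100
print(f"{p_all_heads:.2e}")   # ~7.89e-31: absurdly unlikely, but not zero
# Getting 150 heads from 100 coins, on the other hand, has probability exactly 0.
```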

I’ve been seeing some reporting on the 2020 US Presidential election results that describes the electoral equivalent of finding glued-down pennies, and claims it’s weird that those pennies show the side they’ve always shown. Votes in elections aren’t random, and aren’t statistically independent. There’s no reason to expect them to look like random data from nature. 

And there’s nothing “impossible” about a contentious election resulting in higher turnout. While very high turnout might be improbable, it is not impossible.

Benford’s Law Doesn’t Work for Elections

Math theories often apply only in very specific situations. Benford’s Law is one of them. See, one of the conditions for Benford’s Law to apply is that the numbers need to span several orders of magnitude. If they don’t, the law either doesn’t apply, or applies differently. Sometimes, a superficial understanding of a math theory is worse than no understanding at all: as it turns out, given the sizes of the precincts in Chicago, the vote totals for Biden are a good match for Benford’s Law, and it is the votes for Trump that look a little odd (though likely Trump’s totals were enough lower that they, also, roughly conform to Benford’s Law). 

Beyond needing to know when and how a math theory applies, this is also an example of not understanding what the theory actually says:

«It is not simply that [Benford’s] Law occasionally judges a fraudulent election fair or a fair election fraudulent. Its ‘success rate’ either way is essentially equivalent to a toss of a coin, thereby rendering it problematical at best as a forensic tool and wholly misleading at worst.»

Which is the real lesson here: Benford’s Law simply doesn’t apply to US election results in the way the political press has claimed, because the precincts don’t vary enough in size, and because the vote totals aren’t, and shouldn’t be expected to be, random. In many precincts, they will be predictably skewed in favor of one candidate. So when you decide to apply Benford’s Law anyway, and do it wrong, you get what looks at first glance like evidence of something shady going on. But what you’re really seeing is evidence that Biden consistently had more votes in Chicago. Which is what everyone expected going into this election.
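
If you want to see the mechanism, here’s a small simulation sketch. The precinct totals below are randomly generated, not real election data; they’re built only to have the relevant property (totals that cluster in a narrow band instead of spanning several orders of magnitude), and they flunk a naive first-digit Benford check with no fraud anywhere in sight:

```python
# Synthetic demonstration: first-digit ("Benford") frequencies for vote totals
# that cluster in a narrow range, as real precinct totals often do.
# The data here is randomly generated, NOT real election data.
import math
import random
from collections import Counter

random.seed(0)

# Benford's expected first-digit frequencies: log10(1 + 1/d)
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# Fake "precinct totals": most precincts land between ~300 and ~900 votes
# for the leading candidate, so the totals never span orders of magnitude.
totals = [int(random.gauss(600, 150)) for _ in range(2000)]
totals = [t for t in totals if t > 0]

first_digits = Counter(int(str(t)[0]) for t in totals)
n = len(totals)

print("digit  observed  benford")
for d in range(1, 10):
    print(f"{d}      {first_digits[d] / n:7.3f}  {benford[d]:7.3f}")
# Digits 4-8 dominate and 1 is rare: a glaring "violation" of Benford's Law,
# produced by nothing more sinister than precincts of similar size.
```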

How Many is “A Lot”?

Magnitudes of numbers, and our old friend “compared to what?”, are another common problem. In Georgia, 95,000 people turned in a ballot that marked only the presidential race and none of the other races. But before we can evaluate what that means, we need to know a couple of things:

  • Out of how many ballots cast? If it were out of 150,000, that would be pretty remarkable. But when it’s out of 2.5 million, that’s about 4%. That’s not nearly so remarkable: given 1000 people who voted in a particular precinct (like the precinct I worked this year), that would mean about 40 of them only showed up to vote for President. Or roughly 3 per hour (the back-of-the-envelope arithmetic is sketched after this list). Doesn’t sound quite so dramatic now, does it?
  • Is this unusual? I have no idea, and the articles I’ve found on this haven’t bothered to tell us. I can find analyses that show that it is far more common for people to vote for the highest office on a ballot (President in a Presidential election year) than for any other office—but also an undervote for President of 0.5-2% seems to be typical. I found some analysis showing that the lowest items on a ballot might be undervoted at a rate as high as 7x the Presidential undervote rate in the same election. But nothing that specifically calls out President-only ballots. Without that data, we have no idea whether 4% is even unusual.
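
Here’s that back-of-the-envelope arithmetic spelled out, using only the figures above plus an assumed 12-hour voting day:

```python
# Back-of-the-envelope scale check for the Georgia "president-only" ballots.
president_only = 95_000
total_ballots = 2_500_000
precinct_voters = 1_000   # a typical busy precinct, per the example above
polls_open_hours = 12     # assumption: roughly a 7am-7pm election day

share = president_only / total_ballots
per_precinct = precinct_voters * share
per_hour = per_precinct / polls_open_hours

print(f"share of all ballots: {share:.1%}")                    # ~3.8%
print(f"per 1,000-voter precinct: {per_precinct:.0f} people")  # ~38
print(f"per hour at that precinct: {per_hour:.1f}")            # ~3.2
```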

None of this is proof that President-only ballots are not an anomaly—just that we don’t yet have any evidence that they are one: the numbers aren’t actually that large, relative to the number of ballots cast. To point out just one possible (non-fraud) explanation: once you adjust for estimated population growth, 26% more people voted for Biden in Georgia in 2020 than voted for Clinton in Georgia in 2016. So if just 1 in 7 of these “new” voters were motivated to vote solely to get rid of Trump, and didn’t care about any other race, that would account for those “mysterious” 95,000 ballots. How likely do you think it is that people who didn’t vote in 2016 but did in 2020 were motivated by opposition to Trump? Could that possibly be a large number of people? Maybe out of every 77 people in Georgia, one dislikes Trump enough to vote against him, even though they don’t usually vote?  

Statistics Don’t Provide Explanations

Context, context, context. Even when not doing fancy analysis of regressions or checking digits for signs of fraud, basic statistical claims often fail to give the reader or listener the context they need. As with the voting numbers in Georgia, we need more information to know whether a large number is, in fact, large, or a small number small. Statistics can describe the numbers, but they can’t tell you why the numbers are what they are.

I was listening to a podcast about bicycle thefts in Minneapolis, and a couple numbers jumped out at me because of the lack of context.

  1. They said that the Minneapolis-St Paul metro area had 4300 thefts from the beginning of 2017 to the middle of 2019, and that this was an anomalously high number. That averages out to about 1700 per year.

So let’s compare to some other large metro areas that are also known for being very bike-friendly. Looking at overlapping time periods, Portland, OR had roughly 3000 bike thefts per year, and the Portland metro area is only about 2/3 the population of the Twin Cities metro area. Denver, CO had roughly the same number of annual bike thefts (1700), but only 75% of the population. So at a very quick glance, it doesn’t look like the Twin Cities have particularly high theft rates for a large metro area where bicycling is popular. They actually look kinda low.
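
A rough per-capita comparison, using only the ratios in the paragraph above (normalizing by the Twin Cities population rather than looking up exact census figures):

```python
# Rough per-capita comparison of annual bike thefts, using the figures above.
# Populations are expressed relative to the Twin Cities metro, so exact
# census counts aren't needed for the comparison.
cities = {
    # name: (annual thefts, population relative to Twin Cities)
    "Twin Cities": (1700, 1.00),
    "Portland":    (3000, 0.66),
    "Denver":      (1700, 0.75),
}

tc_rate = cities["Twin Cities"][0] / cities["Twin Cities"][1]
for name, (thefts, rel_pop) in cities.items():
    rate = thefts / rel_pop
    print(f"{name:12s} relative theft rate: {rate / tc_rate:.2f}x Twin Cities")
# Portland comes out ~2.7x and Denver ~1.3x the Twin Cities rate,
# which is why the "anomalously high" framing looks shaky.
```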

  2. The podcast then mentioned that the “Twin Cities Bike Theft” Facebook group had, at the time of recording, 8300 members. And they said that the large number of people participating was an indication of the high bike-theft rate in the area.

Setting aside that membership doesn’t have to be triggered by bike theft—it could indicate a very tight-knit community, where everyone looks out for everyone else, and bike theft is consequently comparatively low—that number really doesn’t tell you anything unless you have some idea how many cyclists are in the area. As it turns out, MN does a pretty good job of collecting bicycling data. Using that, some quick back-of-the-envelope math comes up with around 800,000 people in the Twin Cities metro who ride their bike at least weekly. Around 160,000 who ride daily. Probably the 900,000 who ride once a month also own bikes. So at least 160,000—and likely more than 1,860,000—people own bicycles in the area.  

And that’s as far as we can get. See, sometimes we don’t have enough data to compute real statistics. Anything more than this is supposition, until we get more data. 

We can make a few educated guesses, however. Comparing the membership of the Facebook group to annual bike thefts, we see that there are about 5x as many members as there are thefts per year, which lends credence to the idea that the large number of members doesn’t just represent victims of bike theft—that the large membership is perhaps an indication of a tight-knit community. But it’s not proof. And it doesn’t disprove the interpretation (which the podcast made) that bicycle theft is a significant problem in the Twin Cities—or is seen as such. 

Or we can compare the membership of the Facebook group to the total cyclist population. The group represents less than 0.5% of bicycle owners in the area. If only daily riders are likely to even become aware of the Facebook group, then the group would be about 5% of them. Now, is either 0.5% or 5% of cyclists caring about bike theft an indication of a high level of bike theft? I don’t know. And neither do the podcast hosts. They’re weaving a narrative, not presenting objective statistical interpretations. In theory, we could compare to other cities, or to other things people worry about. Or we could decide for ourselves whether we think that is a “high” or “low” number. 
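
For what it’s worth, here are those ratios spelled out, using the estimates above rather than precise counts:

```python
# Ratios for the "Twin Cities Bike Theft" Facebook group, using estimates above.
members = 8_300
annual_thefts = 1_700          # ~4300 thefts over 2.5 years
daily_riders = 160_000
estimated_bike_owners = 1_860_000

print(f"members per annual theft: {members / annual_thefts:.1f}")          # ~4.9
print(f"share of all bike owners: {members / estimated_bike_owners:.2%}")  # ~0.45%
print(f"share of daily riders:    {members / daily_riders:.1%}")           # ~5.2%
# Whether 0.45% or 5% counts as "a lot of people worried about theft"
# isn't something these numbers alone can tell you.
```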

Lying About Voter Turnout

Don’t forget the basics: good statistics depend on the data being real. You may have seen claims that more people voted in {Wisconsin|Michigan|some other state} than there are adults. The problem is that the people making those claims are comparing the number of people who voted in 2020 to the population in 2010 or 2000 (depending on which article you read). Unsurprisingly, since there are more people in the state than 10 or 20 years ago, there are also more voters. 

Another “statistic” that has been floating around is claiming that the voter turnout in Wisconsin was improbably high—around 90%. This time, they’re using real numbers, but using them wrong. Voter turnout is calculated as the percentage of eligible voters that voted. That 90% number comes from dividing the number who voted by the number who were registered to vote prior to the election. In fact, if you look at multiple states across multiple elections, 90% of registered voters actually voting is on the high end, but not particularly noteworthy: people who make the effort to register are fairly likely to also vote. The problem comes from these articles then comparing this number to the actual voter turnout numbers from previous elections. So they’re saying that the percentage of registered voters who voted in 2020 is significantly higher than the percentage of eligible voters who voted in previous years. They’re using different denominators. (At least one meme floating around had skipped all this bad math: it was just a bunch of made-up numbers. So watch out for that, too.)
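
A quick illustration of the denominator problem, using deliberately round hypothetical numbers rather than Wisconsin’s actual figures:

```python
# Why "90% turnout" claims can mislead: same numerator, different denominators.
# These are round, hypothetical numbers for illustration, not actual Wisconsin data.
ballots_cast = 3_300_000
registered_voters = 3_700_000
eligible_voters = 4_500_000   # all voting-eligible adults, registered or not

print(f"of registered voters: {ballots_cast / registered_voters:.0%}")  # ~89%
print(f"of eligible voters:   {ballots_cast / eligible_voters:.0%}")    # ~73%
# Comparing the first number against prior elections' eligible-voter turnout
# manufactures an apparent anomaly out of a denominator switch.
```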

Dubious Odds

Watch out for claims that sound like math, but don’t actually give you any math. 

One article about the 2020 election, after making some dubious (and in at least one case, outright false) claims about the “suspicious statistics” of the election results, finished up by saying that the odds of Biden having won fairly are roughly the odds of pitching 4 perfect games in the World Series. But the article didn’t assign any probabilities to the various “improbable” events it alleged. Nor did it show any evidence of having calculated the odds of pitching a perfect game, let alone the odds of pitching 4 perfect games in a row in the World Series. 

It’s like saying “the odds of being electrocuted by a downed powerline are about the same as the odds of being killed by the opening stop sign on a stopped schoolbus”—if I don’t show you the math, there’s no reason you should believe me that those two completely separate things are comparable. And a little digging around turns up conflicting answers on the odds of pitching a perfect game, so the odds of pitching 4 perfect games in the World Series range across 7 orders of magnitude; all of them are ridiculously tiny, with the most likely answer being the middle one, at 0.000000000000012%. Now, maybe the author actually meant to say “the odds of this election being free and fair are less than the odds of me and my coauthor both being declared saints”, but since they didn’t provide any numbers, I’m betting they just picked a random thing that they thought felt suitably impressive, and called it a day.  
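
For comparison, the back-of-the-envelope calculation the article never showed might look something like this; the 23-perfect-games count is the commonly cited MLB total as of 2020, and the total-games figure is only an order-of-magnitude estimate:

```python
# Rough estimate of the odds of four consecutive perfect games.
# 23 is the commonly cited count of MLB perfect games through 2020; the
# total-games figure is a rough historical estimate, so treat the result
# as ballpark only (pun intended).
perfect_games = 23
total_mlb_games = 220_000            # rough order-of-magnitude assumption
p_one = perfect_games / total_mlb_games

p_four_in_a_row = p_one ** 4
print(f"one perfect game: about 1 in {1 / p_one:,.0f}")
print(f"four in a row:    about {p_four_in_a_row:.1e}")
# ~1.2e-16, i.e. about 0.000000000000012%: roughly the middle of the
# conflicting published estimates, and none of it appears in the article.
```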
