The Stat Figure Man Fallacy

There's a common saying in science that 'the plural of anecdote is not statistics.' By this we mean that you can't just take a large number of personal experiences and combine them to find the average of the population.

An apocryphal quote, circulated after Nixon won the presidency, has the liberal film critic Pauline Kael saying, "How could Nixon have won? I don't know a single person who voted for him." The quote isn't what she actually said, but it's popular because it reveals how easily someone can accidentally make statistical inferences from biased data. When designing experiments, scientists expend a lot of effort minimizing bias so the data they collect represent the actual population as closely as possible.
This phenomenon is something most people are at least tangentially aware of, even if they don't think about it directly in those terms.

I want to talk about a phenomenon that's at least as common, but that I think far fewer people are aware of: the average statistic doesn't look anything like most of the people it describes. This is almost the opposite of the heuristic above about the plural of anecdote. Sometimes this is because 'average' doesn't describe a population very well. One assumption you make in most statistical analyses is that the data follow something we call a 'normal distribution'.


In other words, when you look at an average of something, you're assuming the 'average' is a meaningful way to represent that data. For some types of distributions, the average can actually be misleading. We saw this once already when looking at life expectancy. In a society with high infant mortality, the average life expectancy goes down, but not because people aren't living to old age anymore. It goes down because the age at which people die is no longer normally distributed (it's bi-modally distributed), and so it's no longer appropriate to represent that population with an average.
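
To make that concrete, here's a minimal sketch in Python. The ages, proportions, and spreads below are invented for illustration, not real demographic data; the point is just that the mean of a bi-modal distribution lands in a valley where almost nobody actually is.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical high-infant-mortality society: 25% die in infancy,
# 75% die around age 72. (Illustrative numbers only.)
infant_deaths = rng.normal(loc=1, scale=0.5, size=2_500).clip(min=0)
old_age_deaths = rng.normal(loc=72, scale=10, size=7_500)
ages_at_death = np.concatenate([infant_deaths, old_age_deaths])

mean_age = ages_at_death.mean()
near_mean = np.mean(np.abs(ages_at_death - mean_age) < 5)

print(f"average age at death: {mean_age:.1f}")                   # ~54 - describes almost no one
print(f"dying within 5 years of that average: {near_mean:.1%}")  # a small share
```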


The classic example of using an average to incorrectly represent a bi-modal distribution is to say that the average human has one breast and one testicle. That's an obvious case where an average does worse than simply failing to describe a population: it actively misinforms. But many non-obvious examples of this phenomenon happen all the time. Race, gender, age, and other interpersonal comparisons are some of the places this happens. It also happens with cross-national comparisons, health comparisons, and any other field where more than one type of person, trait, object, etc., is lumped together in order to 'improve understanding' of a subject as a whole.

The average sandwich might be a toasted avocado, peanut butter, tuna, ham, and provolone on rye - in the USA. What might the 'average' international sandwich look like? That question might be fun to contemplate, but the answer doesn't tell us anything about real sandwiches. It's actually worse than uninformative, because that 'average sandwich' is a new creation that doesn't represent the population it's meant to describe. If we want to understand sandwiches better, the first thing we need to do is forget about the average. The same rule applies whenever we're looking at a population - or a trait - that isn't described well by an average. We can try other methods of analysis, or we can sub-divide the population into categories that do have meaningful averages. But we have to start by jettisoning the average as a description of the whole population.

So one heuristic we might take from this would be to ask an additional question whenever we're looking at an average: Is this population normally distributed? And while that's a good rule to adopt, it doesn't completely shield us from committing the Stat Figure Man Fallacy, because we have to remember what it means for a population to be normally distributed. It means members of the population usually fall to either side of the average along a predictable curve. People aren't sitting directly at the center of the graph; they're just clustered around it.

What's the difference?

This concept can more easily be understood if we add a few more variables to the mix. For example, the average male is 5' 9" (176 cm) tall and weighs 198 pounds (89.8 kg).  His shoe size is about 10 1/2 (27.3 cm).  If you want to understand the general range within which humans fall, this information might be useful.  Say you want to create a graphic of the average T-Rex and put an average man next to it.  This information might help you scale your picture accurately.

But say instead you're interested in making clothes for an army.  You use those average numbers - averages that all represent normally distributed populations - and create a standard-issue set of military uniforms and boots.  You figure there might be some outliers, but at least a majority of men should be able to use these, right?

Yet if you look at thousands of men, very few of them will be within two inches of the average height and ten pounds of the average weight, and still fit into the shoes. Some taller men will be lighter, and some shorter men will be heavier. Foot size often correlates with height, but not always. Lots of men will be scattered across each distribution, but few will be right in the center of the three-way graph. The majority of enlisted men will complain and be unable to use the clothing you provided. Yet when I describe the average man, most people will imagine that average case, even though in reality it represents few actual people, and fewer still as I add more variables.
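
Here's a rough Monte Carlo sketch of that uniform problem. The means, standard deviations, correlations, and 'close enough to average' tolerances below are assumptions chosen for illustration, not real anthropometric data; the pattern is the point, not the exact percentages.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed averages: height (inches), weight (pounds), shoe size (US).
mean = np.array([69.0, 198.0, 10.5])
sd = np.array([2.8, 35.0, 1.5])

# Assumed positive correlations between the three measurements.
corr = np.array([[1.0, 0.5, 0.7],
                 [0.5, 1.0, 0.4],
                 [0.7, 0.4, 1.0]])
cov = corr * np.outer(sd, sd)

men = rng.multivariate_normal(mean, cov, size=100_000)
height, weight, shoe = men.T

near_height = np.abs(height - mean[0]) <= 2.0   # within two inches
near_weight = np.abs(weight - mean[1]) <= 10.0  # within ten pounds
near_shoe = np.abs(shoe - mean[2]) <= 0.5       # within half a size

print(f"near average height:       {near_height.mean():.1%}")
print(f"near average weight:       {near_weight.mean():.1%}")
print(f"near average shoe size:    {near_shoe.mean():.1%}")
print(f"near average on all three: {(near_height & near_weight & near_shoe).mean():.1%}")
```

In a run like this, roughly half the simulated men land near the average height, but only a small minority land near the average on all three measurements at once - and each extra variable shrinks that group further.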

The place we went wrong wasn't in assuming a normal distribution. I was right about the normal distribution, and I was right about the population averages. But I was wrong when I then tried to extrapolate that average to represent an actual person. That's not how statistics works, and it's an abuse of the numbers to move backward from the collective data to the individual case. Few military planners would make the mistake I describe above, but it still happens all the time in public discourse on other subjects.


For example, the average male is taller and physically stronger than the average female. This is an average, though, and while it may be useful when considering things for which an average is appropriate, it's not useful when we need to account for individuals. This is obvious when we look at how small the shift actually is. The average woman is 5' 4" (162 cm) tall. That extra 5 inches might seem significant, but look again at the graph above. There's a lot of overlap. There are even some women who are much taller than the average man - in other words, women who are taller than half the male population. What might you have 'learned' from the average data that you could apply to the case of a 5' 11" tall woman? Most men are shorter than she is. If you're talking about the average man versus the average woman, you'll be right. But if you say, "any random man is taller than any random woman," you'll be wrong part of the time.
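
To put a number on that overlap, here's a small sketch using Python's statistics.NormalDist. The means (69 in and 64 in) and the shared standard deviation (about 2.8 in) are round illustrative figures, not survey data.

```python
from math import sqrt
from statistics import NormalDist

# Assumed height distributions (round numbers, not survey data).
men = NormalDist(mu=69, sigma=2.8)     # average man, ~5'9"
women = NormalDist(mu=64, sigma=2.8)   # average woman, ~5'4"

# The height gap between a randomly paired man and woman is itself normal.
gap = NormalDist(mu=men.mean - women.mean,
                 sigma=sqrt(men.stdev ** 2 + women.stdev ** 2))

print(f"P(random man taller than random woman): {1 - gap.cdf(0):.0%}")  # ~90%

# And for the 5'11" (71 in) woman from the example:
print(f"men shorter than 71 in:  {men.cdf(71):.0%}")        # most of them
print(f"women taller than 71 in: {1 - women.cdf(71):.1%}")  # few, but they exist
```

Under these assumptions, the "random man is taller" guess is right about nine times out of ten - often enough to feel like knowledge, and wrong often enough to matter.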

You'll still be right more often than you're wrong, but this is part of why the fallacy is so deceptive. You might think you're always right, but really you're only situationally accurate. Your accuracy isn't because you know something; it's subject to chance. And that's only if we look at one variable. Compound the statement with multiple variables, and you'll quickly become wrong more often than you're right. Compound it one more step, by applying it to low-frequency events, and you're not only wrong, you're irrelevant.
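
A toy calculation shows how fast that compounding bites. Suppose, purely hypothetically, that each single-trait guess of the form 'A is better than B at X' is right 75% of the time and the traits are independent:

```python
# Hypothetical: each single-trait guess is right 75% of the time, and the
# traits are treated as independent - a simplification, but it shows the trend.
p_single = 0.75
for n_traits in range(1, 6):
    print(f"right about all {n_traits} traits: {p_single ** n_traits:.0%}")
# By three traits the conjunction is right only ~42% of the time -
# already wrong more often than right.
```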

So you could compile a list of things that women do better than men on average, and after reading it you might think you know something more about the differences between women and men than you did before. You could call this science and say you've increased human understanding a little bit. But if you try to use any of that 'learning' on individuals, you'll quickly fall into error. As you meet people, they'll be all over the normal distribution on each point. You might meet two people and think, "Women are better at [X] than men, so Tina is probably better at that thing than Ted." But there are many other factors at play, especially when we're talking about characteristics that can be influenced by many things at once.

Take a hypothetical alien species that comes to Earth. On average, they're 15% better than humans at growing plants. You might assume a majority of these aliens take jobs in farming or agriculture. You might even look at the frequency of alien farmers and notice that they outnumber human farmers. Now your alien friend comes to you asking for advice about how to plant their garden. You think, "I've planted gardens for ten years now, so I know a few things about gardening, but I'm not like one of these aliens. I don't have anything to contribute." Except you don't know where your alien friend falls on the 'plant-growing prowess' distribution for aliens. And even if your friend does have naturally better gardening instincts, with no actual gardening experience that small innate advantage might be dwarfed by your own advantage in experience.

More concrete examples of this fallacy can be found any time you see the phrase "studies show that on average..." This is probably the most pernicious area where misuse of average findings is destructive. Countless news articles want to tell you that if you eat this or that thing, ride the subway, live more than 157 feet from a cell phone tower, etc., etc., you're more likely to get cancer, get divorced, develop anxiety, get stood up on a date, etc., etc. Even assuming these statements are accurate (which mostly they're not - I'll add that to my to-do list of posts), the useful information you can glean from them is far outweighed by their prejudicial nature. If you stop drinking non-organic tangerine nectar unless it's imported from Colombia because you read a study saying that otherwise you'll increase your chance of getting thyroid cancer, you've fallen for the Stat Figure Man Fallacy.

Finally, there's one place where applying the Stat Figure Man is not a fallacy: the lottery. A few years ago my sister called me up, asking me to buy her a lottery ticket. The jackpot was at a record high, and she lived in a state that doesn't participate in the lottery.

Sis: "I just really feel like I should have my hat in this ring."
Me: "No.  It's a waste of money."
Sis: "I'll pay you for it!"
Me: "No.  It's a waste of my time.  You're not going to win."
Sis: "Somebody is going to win the lottery."
Me: "You will never win the lottery.  I am more certain of the accuracy of this statement than almost any statement I could make.  I am more than 99.99999% certain that you are not going to win the lottery.  When I'm driving and I get on the freeway entrance, I bet my life on the probability I won't get into a car accident and die.  I would sooner bet my life on the probability of you not winning the lottery - even if you play a thousand tickets - than that I won't die in traffic."

Given what you know, what's the difference?
