instruere...inlustrare...delectare Disputations

Wednesday, September 19, 2007

"Science suffers from an excess of significance"

This article suffers from an excess of quotable lines:
"...most published research findings are wrong."

"A new claim about a research finding is more likely to be false than true."

...Dr. Ioannidis and his colleagues analyzed 432 published research claims concerning gender and genes. Upon closer scrutiny, almost none of them held up. Only one was replicated.

"People are messing around with the data to find anything that seems significant...."

"The correction isn't the ultimate truth either."
My take on the field of statistics is that it's an arithmetic discipline for making people feel comfortable. Just think about the term "95% confidence": it's a bait-and-switch trick, which works by giving the result of a convoluted and artificial computation the same name as a vaguely understood emotional or intellectual state favorable for decision-making.
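For the record, the textbook reading of "95% confidence" is a long-run frequency claim, not a statement of belief about any single result: if you repeated the experiment many times and built a 95% interval each time, about 95% of those intervals would cover the true value. Here is a minimal simulation of that reading; all the numbers (true mean, sigma, sample size) are arbitrary choices for illustration:

```python
# Long-run reading of "95% confidence": repeat the experiment many times,
# build a 95% interval each time, and count how often the interval covers
# the true mean. Any single interval either covers it or it doesn't.
import random
import statistics

random.seed(1)

TRUE_MEAN = 10.0   # the quantity we pretend not to know
SIGMA = 2.0        # known population standard deviation (an assumption)
N = 30             # sample size per experiment
Z = 1.96           # standard normal quantile for a two-sided 95% interval
TRIALS = 10_000

covered = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    m = statistics.fmean(sample)
    half_width = Z * SIGMA / N ** 0.5
    if m - half_width <= TRUE_MEAN <= m + half_width:
        covered += 1

coverage = covered / TRIALS
print(f"Empirical coverage: {coverage:.3f}")  # close to 0.95
```

Note that nothing in that computation corresponds to the everyday feeling of being 95% sure, which is rather the point.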

Which isn't to say it's all smoke and mirrors. As a statistician I know likes to say, Las Vegas is filled with monuments to the Central Limit Theorem, and all horse players die broke.
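The casino's side of the Central Limit Theorem is easy to see in a simulation: each individual bet is noisy, but the sum of many independent bets concentrates tightly around its expected value, which the house has arranged in its favor. The numbers below are illustrative roulette-style figures (win probability 18/38 on an even-money bet, so a house edge of roughly 5.3 cents per dollar):

```python
# Sum many independent even-money bets with a built-in house edge and
# watch the house's profit per dollar concentrate near its expectation.
import random

random.seed(3)

P_WIN = 18 / 38   # player's chance on a red/black roulette bet
BETS = 100_000    # made-up daily betting volume

# -1 when the player wins (house pays out), +1 when the player loses.
house_profit = sum(-1 if random.random() < P_WIN else 1 for _ in range(BETS))

edge = house_profit / BETS
print(f"House profit per dollar wagered: {edge:.3f}")  # near 2/38, about 0.053
```

Any one gambler can walk away a winner; over a hundred thousand bets, the house's take is all but certain.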

But what Dr. Ioannidis is pointing out in the article is that, too often, scientists conduct experiments in statistical alchemy: they want to turn a set of data into a statistically significant conclusion. They do that by loading the data into a statistical analysis software tool, then monkeying around until the function they've been told computes significance produces a number less than 0.05 (or, if they're desperate, 0.1).

And guess what? Given enough time and experience with the analysis software, it's usually possible, one way or another, to get a score under 0.05.
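That's not a knock on anyone's luck; it's arithmetic. Test enough pure-noise comparisons and one of them will cross the 0.05 line by chance alone: with 20 independent null comparisons, the chance of at least one "significant" result is 1 - 0.95^20, about 64%. A sketch, with made-up sample sizes and a simple z-test standing in for whatever the software actually runs:

```python
# Multiple comparisons on pure noise: both groups are drawn from the SAME
# distribution, yet an analyst free to try 20 outcomes usually "finds"
# a significant difference somewhere.
import random

random.seed(2)

N = 50            # per-group sample size
COMPARISONS = 20  # outcomes the analyst is free to try
Z_CRIT = 1.96     # two-sided 5% threshold
EXPERIMENTS = 2_000

def z_stat():
    """z statistic comparing two groups drawn from the same N(0, 1)."""
    a = [random.gauss(0, 1) for _ in range(N)]
    b = [random.gauss(0, 1) for _ in range(N)]
    diff = sum(a) / N - sum(b) / N
    return diff / (2 / N) ** 0.5  # difference of means has variance 2/N

hits = 0
for _ in range(EXPERIMENTS):
    if any(abs(z_stat()) > Z_CRIT for _ in range(COMPARISONS)):
        hits += 1

rate = hits / EXPERIMENTS
print(f"Analyses that found 'significance' in noise: {rate:.2f}")
# theory: 1 - 0.95**20, about 0.64
```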

What does the score actually mean? Well, it means something, probably, but to know just what you have to carefully work through every step taken to obtain it. And statistics is known for counter- and contra-intuitive reasoning, so if you're not an expert in statistics (NOTE: 6 semester hours of undergrad statistics doesn't make you an expert, nor does unlimited hours using statistical analysis software), you shouldn't be too confident your confidence interval tells you what you tell yourself it tells you.

But we're all taught the Scientific Method, which in its simplest form is:
  1. Make a hypothesis.
  2. Test the hypothesis.
  3. Reject the hypothesis if it fails the test.
Rejecting a hypothesis you thought up all by your own self is hard enough when the data proves it's false. Rejecting it when the data doesn't prove anything at all is near impossible, almost as difficult as getting more money after admitting you didn't learn much from the money you've already spent.

I'm not knocking science, or even statistics (which, considered as an applied discipline, is full of deucedly clever and undeniably useful stuff). But we have to understand things as they are, not as they are idealized to be, and that includes understanding that the majority of scientists are lousy statisticians.

(Link via Eve Tushnet.)