Why Is Social Psychology the Main Front in the Replication Crisis?


In the comments at the blog of Columbia University statistics professor Andrew Gelman, Dr. Gelman and I discuss why social psychology usually seems to be at the center of the Replication Crisis wars, such as in the controversy over Amy Cuddy’s power posing experiment. I said:

Why has social psychology been the central front in the Replication Crisis?

I think this is partly because social psychology, as social psychologist Jonathan Haidt has documented, is extremely politicized. On the other hand, it is also because social psychologists are scientific enough to care. Other fields are at least as distorted, but they don’t feel as bad about it as the psychologists do.

I lifted this idea from Greg Cochran.

(At the extreme, cultural anthropologists have turned against science in general: at Stanford, for example, the Anthropology Department broke up for a number of years into Cultural Anthropology and Anthropological Sciences.)

Is the social psychology glass therefore half empty or half full? I’d say it’s to the credit of social psychologists that they feel guilty enough to host these debates rather than to just ignore them.

Dr. Gelman responded:

Steve:

What you say is similar to what I said here, where I argued that psychology has several features that contribute to the crisis:

– Psychology is a relatively open and uncompetitive field (compared for example to biology). Many researchers will share their data.

– Psychology is low budget (compared to biomedicine). So, again, not so much incentive to hoard data or lab procedures. There’s no “Robert Gallo” in psychology who would steal someone’s virus sample in order to get a Nobel Prize.

– The financial rewards are lower within psychology, hence the incentive is not to set up your own company using secret technology but rather to get your idea known far and wide so you can get speaking tours, book contracts, etc. Sure, most research psychologists don’t attempt this, but to the extent there are financial rewards, that’s where they are.

– In psychology, data are generally not proprietary (as in business) or protected (as in medicine). So there’s a norm of sharing. In bio, if you want someone’s data, you have to beg. In psychology, they have to give you a reason not to share.

– In psychology, experiments are easy to replicate (unlike econ or poli sci, where you can’t just run a bunch more recessions or elections) and cheap to replicate (unlike medicine which involves doctors and patients). So replication is a live option, indeed it gets people suggesting that preregistered replication be a requirement in some cases.

– Finally, hypotheses in psychology, especially social psychology, are often vague, and data are noisy. Indeed, there often seems to be a tradition of casual measurement, the idea perhaps being that it doesn’t matter exactly what you measure because if you get statistical significance, you’ve discovered something. This is different from econ where it seems there’s more of a tradition of large datasets, careful measurements, and theory-based hypotheses. Anyway, psychology studies often (not always, but often) feature weak theory + weak measurement, which is a recipe for unreplicable findings.

To put it another way, p-hacking is not the cause of the problem; p-hacking is a symptom. Researchers don’t want to p-hack; they’d prefer to confirm their original hypotheses. They p-hack only because they have to.

Hey—that’s a blog post right there. I guess I’ll post it; there’s room in May.
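The "weak theory + weak measurement" point can be illustrated with a small simulation. What follows is only a minimal sketch under assumed numbers, none of which come from Dr. Gelman's comment: a true effect of 0.1 standard deviations, 20 subjects per group, and a rough cutoff of 1.96 on a z-like statistic standing in for "statistical significance." It does not model p-hacking itself, only what happens when noisy small studies are filtered on significance and then rerun once. The function names are hypothetical.

```python
import random
import statistics


def run_study(n, true_effect, rng):
    """Simulate one two-group study with noisy measurements.

    Returns the estimated effect (difference in means) and a z-like
    statistic (difference divided by its estimated standard error).
    """
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    treated = [rng.gauss(true_effect, 1.0) for _ in range(n)]
    diff = statistics.mean(treated) - statistics.mean(control)
    se = (statistics.variance(control) / n + statistics.variance(treated) / n) ** 0.5
    return diff, diff / se


def simulate(n=20, true_effect=0.1, studies=5000, cutoff=1.96, seed=0):
    """Filter studies on 'significance', then attempt one exact replication of each.

    All default values are illustrative assumptions, not estimates from
    any real literature.
    """
    rng = random.Random(seed)
    published_effects = []
    replicated = 0
    for _ in range(studies):
        diff, z = run_study(n, true_effect, rng)
        if z > cutoff:  # "significant" in the predicted direction
            published_effects.append(diff)
            _, z_rep = run_study(n, true_effect, rng)  # same design, fresh sample
            if z_rep > cutoff:
                replicated += 1
    return {
        "published": len(published_effects),
        "replicated": replicated,
        "mean_published_effect": statistics.mean(published_effects),
        "true_effect": true_effect,
    }


if __name__ == "__main__":
    print(simulate())
```

Under these assumptions, the studies that clear the cutoff tend to report effects several times larger than the true one, and most of them fail on an identical rerun, even though nobody in the simulation did anything dishonest.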

By the way, I’ve long been interested in the theoretical possibility that some failures of famous old experiments to replicate in experiments years later are not due to the original experimental results having been just plain wrong but due to history having moved on and people behaving differently in the present than in the past.

But I don’t have any good examples from social psychology.

But the power of historical change is evident in psychometrics in the Flynn Effect. Unlike much else in psychology, IQ testing has proven hugely replicable … except over time, much to the surprise of just about everybody.

A lot of effort was devoted to making IQ tests consistent over space, across different languages and in different countries. But the tests proved inconsistent over time: raw scores went up fairly consistently, decade after decade and all over the world. This was not expected, and it remains pretty interesting.
