I had no intention of revisiting the debate over the use of brain imaging in social neuroscience, which I blogged about last month.
But that post brought such a tsunami of anger, dismay, invective and
outrage that I felt an obligation to go back and dig more deeply into
whether the charges in a paper by Ed Vul of MIT, Hal Pashler of UC San Diego and colleagues, now in press at Perspectives on Psychological Science, were as meritless as many of the scientists I heard from claimed.
The basic criticism of Vul et al. rests on statistics, so I sought
out eminent statisticians who have no horse in this race. I had room to
cite only two of them in my magazine column this week,
but there were no dissenting voices: some studies that use fMRI to
correlate patterns of brain activity with some measure of emotion,
thought or other psychological trait/behavior of interest to social
neuroscience are indeed problematic. For a good discussion, I recommend
a post
by Andrew Gelman, professor of statistics and of political science and
director of the Applied Statistics Center at Columbia University.
But the targets of the criticism also make legitimate points.
They're right that calling anyone’s science “voodoo” (in the title of
the Vul et al. paper) is not very nice or conducive to constructive
dialogue. And as they say, social neuroscience is not the only field
that uses functional neuroimaging in a way that has problematic stats.
Neither of these points gets to the heart of the criticism, however. Even a response prepared by Matthew Lieberman
of UCLA and colleagues, while viewed as the best of the bunch by the
statisticians and neuroscientists who were kind enough to read it for
me, doesn’t answer all the concerns Vul et al. raised. As Lieberman
told me, “What we’re fundamentally interested in is whether there are
these relationships [between a pattern of brain activity and a
psychological measure] at all. The initial test [looking for patterns
that correlate with these measures] tells you there are regions of the
brain worth interrogating.” Once scientists do that initial pass, he
argues, those who know what they’re doing apply the proper controls and
methods of statistical analysis to make sure their subsequent scans are
independent of the first. The problem, say other scientists with
extensive experience in neuroimaging (and in reading neuroimaging
papers), is that “what he describes as good statistical practice
doesn’t occur in a lot of these papers,” as one researcher (who doesn’t
want to antagonize colleagues more than he already has) told me.
Failure to do the stats properly is the main problem identified by
Vul et al. Alas, some experienced practitioners of neuroimaging concede
that their field is indeed beset by the “circularity” the imaging
critics identified, as Nikolaus Kriegeskorte of
NIMH told me. “In extreme cases, the effect [in which a pattern of
brain activity is correlated with a behavior, feeling, etc.] doesn’t
exist at all, and what you are reporting is just noise. Because we have
so much data and selection is inevitable, neuroscience is faced with
the challenge of avoiding the bias that can come with data selection.”
That problem is not unique to imaging, I hasten to add: EEGs and
invasive recording have it, too. “It is not a new problem, and there
are techniques to avoid it,” Kriegeskorte said.
“Things can go wrong, but how wrong?” he continued. “Our sense is, a
whole range of things can happen, from a slight distortion [in the
strength of correlations] to entirely spurious results. Some papers do
not deal with it well, and are based on incorrect statistics. Whether
the central conclusions are wrong cannot be determined without redoing
or at least reanalyzing the experiment. Vul et al. have the central
point right, but they were unnecessarily inflammatory and their
estimate of how much [reported correlations have been inflated] might
be too high. But reported correlations are almost certainly higher than
they should be.”
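For readers who want to see how "circularity" can conjure a correlation out of nothing, here is a toy simulation. It is only a sketch under my own assumptions, not the analysis from Vul et al. or from any paper they criticize: the function names, the 20 subjects, the 2,000 "voxels" and the 0.5 cutoff are all made up for illustration, and every number in it is pure random noise. It picks the voxels that correlate best with a fake behavioral score, then measures the correlation two ways: once on the same data used to pick them (the circular way), and once on subjects held out from the picking step (the independent way).

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def mean(values):
    """Average of a non-empty list; fails loudly if nothing was selected."""
    if not values:
        raise ValueError("no voxels passed the threshold")
    return sum(values) / len(values)


def simulate(n_subjects=20, n_voxels=2000, threshold=0.5, seed=0):
    """Return (circular, independent) average correlations on pure noise.

    All parameters are illustrative assumptions, not values from any study.
    """
    rng = random.Random(seed)
    # One made-up behavioral score per subject, unrelated to any voxel.
    behavior = [rng.gauss(0, 1) for _ in range(n_subjects)]
    # One made-up activity value per subject for every voxel, also noise.
    voxels = [[rng.gauss(0, 1) for _ in range(n_subjects)]
              for _ in range(n_voxels)]

    # Circular: select voxels and report their correlation on the same data.
    all_r = [pearson(v, behavior) for v in voxels]
    circular = mean([r for r in all_r if r > threshold])

    # Independent: select on the first half of the subjects,
    # then measure the correlation on the held-out second half.
    half = n_subjects // 2
    selected = [v for v in voxels
                if pearson(v[:half], behavior[:half]) > threshold]
    independent = mean([pearson(v[half:], behavior[half:])
                        for v in selected])
    return circular, independent


if __name__ == "__main__":
    circular, independent = simulate()
    print(f"circular estimate:    {circular:.2f}")
    print(f"independent estimate: {independent:.2f}")
```

The circular estimate has to come out above the cutoff, because the only voxels averaged are the ones that already cleared it, even though no real relationship exists. The held-out estimate has no such head start and should hover near zero.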
The reason that matters is that brain imaging is increasingly being
used not for pure discovery and hypothesis testing, as UCLA’s Lieberman
rightly explains, but for real-world uses with potentially worrisome
implications, as I explain in my column this week.
So how can laymen, not to mention science journalists, separate good
studies from questionable ones? Not easily. Even when we play by the
rules and report only studies that have been peer-reviewed and
published, it turns out, we can't be assured that the study found what
it claims to: some of the most problematic studies ID'd by Vul et al.
are in eminent journals. But speaking for myself, when I write about
neuroimaging studies in the future I will ask a lot more, and harder,
questions about the method of analysis than I have in the past.