Final dissertation

I realized I never uploaded the final version of my dissertation, ‘Weakening memories by half-remembering them’.

Brain orchestras and fMRI analyses

[With help from David Weiss]

I spent much of my PhD working on algorithms for making sense of gigabytes of brain data from fMRI scanners, especially on a fairly new approach called Multi-variate Pattern Analysis (MVPA). I want to show you how the MVPA approach is useful for tackling certain kinds of questions.

Think of the brain as a kind of orchestra. You have lots of separate instruments playing at the same time, and you can subdivide them in lots of different ways, e.g.

You can subdivide the orchestra into parts by location – the 1st violins, the brass, the percussion etc.
Or you could organize them by what they’re doing. Say the 2nd violins, the oboes and the trumpets have the melody, while the clarinets and the tubas have the harmony. [The harps are doing their own thing and the bassoonist is drunk.]

Likewise, there are all kinds of things going on at once in the brain.

You can subdivide the brain by location – frontal, temporal, parietal, occipital lobes.
Or you could organize the sub-parts by what they’re doing – vision, language, executive control, motor etc.

Let’s go back to thinking about how the multivariate approach differs in the kinds of questions it can address.

Standard univariate analysis is useful if you want to tell which instruments are involved in one case rather than another, e.g.

violins are more active in Beethoven than Mozart, but for trumpets it’s the other way around

one part of the brain is more active when looking at houses than faces, but for another part it’s the other way around

In contrast, a multivariate analysis might be useful if you want to know:

is this Mozart or Beethoven?
is this the brain of someone looking at faces or houses?
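To make the contrast concrete, here is a minimal sketch on simulated data (the trial counts, voxel counts and noise levels are all made up, and the nearest-centroid classifier is just a simple stand-in for whatever classifier one would actually use). The univariate step asks a question about each voxel separately; the multivariate step asks a question about the whole pattern at once.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_voxels = 200, 50

# Hypothetical data: each condition has its own weak spatial pattern across voxels.
pattern_faces = rng.normal(0, 0.5, n_voxels)
pattern_houses = rng.normal(0, 0.5, n_voxels)
labels = np.repeat([0, 1], n_trials // 2)  # 0 = faces, 1 = houses
rng.shuffle(labels)
patterns = np.where(labels[:, None] == 0, pattern_faces, pattern_houses)
data = patterns + rng.normal(0, 1.0, (n_trials, n_voxels))

train, test = np.arange(0, 100), np.arange(100, 200)

# Univariate question: which voxels are more active for faces than for houses?
def mean_difference(x, y):
    return x[y == 0].mean(axis=0) - x[y == 1].mean(axis=0)

voxel_diff = mean_difference(data[train], labels[train])  # one number per voxel

# Multivariate question: given a whole pattern, is this faces or houses?
def nearest_centroid_predict(x_train, y_train, x_test):
    centroids = np.stack([x_train[y_train == c].mean(axis=0) for c in (0, 1)])
    dists = ((x_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

predicted = nearest_centroid_predict(data[train], labels[train], data[test])
accuracy = (predicted == labels[test]).mean()
```

The univariate result is a map (one difference per voxel), whereas the multivariate result is a guess about the mental state on each held-out trial.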

Now, let’s introduce one more concept: dimensionality reduction is an attempt to boil down many instruments (or brain regions) into a few key themes/groups:

Take the famous da-da-da-dum of Beethoven’s Fifth, where the entire orchestra is one voice – one could more or less describe the entire orchestra’s activity in terms of just one theme/process. In contrast, for Bach or something more complex and interwoven, it might be very hard to summarize what’s going on with fewer than 10 themes.

Likewise, maybe it’s straightforward to summarize the brain’s activity with just one or two processes when you’re doing a very simple task like looking at faces vs houses, but if you’re doing something more complicated (like watching a movie) then multiple processes are interacting in complex ways.

David Weiss’s PACA algorithm boils down the brain’s activity over time into just a few themes. Once you’ve summarized the 50,000 readouts we get from fMRI every few seconds into 50, it’s much more feasible to try and compare different cognitive processes – just as it’s much easier to compare Mozart and Beethoven by looking at the scores of a few key instruments than looking at the full orchestral scores.

PACA was inspired by a bunch of existing dimensionality reduction algorithms that could equally be applied to problems like voice, face or handwriting recognition.

But its magic involves adding a few constraints that are particularly relevant to the brain. Here’s one example of a constraint: it doesn’t allow its estimate of a theme’s presence at a given moment to go below zero. Think of it like this – when was the last time you heard an anti-violin? Or had an anti-thought? In other words, PACA breaks the manifold streams of activity in the brain down to just a few that are all present to a greater or lesser degree at each moment.
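This is not PACA itself, whose details aren't given here. As a rough illustration of what a non-negativity constraint looks like in practice, here is a sketch of a different, well-known method, non-negative matrix factorization with multiplicative updates, run on simulated data. The function name `nmf`, the matrix sizes and the number of themes are all assumptions for the example.

```python
import numpy as np

def nmf(V, n_themes, n_iter=500, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (time x voxels) as W @ H.

    W (time x themes) says how strongly each theme is present at each moment;
    H (themes x voxels) says which voxels take part in each theme.
    Multiplicative updates keep every entry of W and H at or above zero.
    """
    rng = np.random.default_rng(seed)
    n_time, n_voxels = V.shape
    W = rng.random((n_time, n_themes)) + eps
    H = rng.random((n_themes, n_voxels)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Simulated "brain": 120 time points, 500 voxels, built from 3 hidden themes.
rng = np.random.default_rng(1)
true_presence = rng.random((120, 3))
true_maps = rng.random((3, 500))
V = true_presence @ true_maps

W, H = nmf(V, n_themes=3)
relative_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

Each row of `W` is the estimate of how present each theme is at that moment, and it can never dip below zero: no anti-violins.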

P.S. If you hated this, you might also hate How to beat an fMRI lie detector.

Excretation over

I just handed in my dissertation draft – Weakening memories by half-remembering them. This was the sunrise that birthed it (inspired by neurotomfoolery).

Mewling infant.

How to beat an fMRI lie detector

In a not-so-distant dystopia, you might be placed in a brain scanner to test whether you’re telling the truth. Here’s how to cheat.

The polygraph

First, you’ll need some background on old-school lie-detection technology. [This is a simplified story – see polygraphs for a richer account.] Polygraphs are seismographs for the nervous system. They measure physiological responses such as heart rate, blood pressure, sweatiness through skin conductance, and breathing. When you’re anxious, angry, randy, in pain, or otherwise emotionally aroused, these measures spike automatically. The effort and stress of lying also cause them to spike.

Of course, if you’re trapped in a windowless room on trial for murder, these measures will probably be pretty high to begin with. So you’ll first be asked a few control questions to assess your baseline levels when lying and telling the truth, against which your physiological response to the important questions will be compared.

So, if you want to beat a polygraph, you need either to keep your physiological responses stable when you lie (which is difficult), or to artificially elevate your baseline response when telling the truth. The age-old technique is to place a thumb-tack in your shoe and press on it painfully with your toe when telling the truth, spiking your physiological responses and providing a misleading control, so that your lies don’t look elevated by comparison.

Functional magnetic resonance imaging

Now, on to fMRI. Simplifying again, the fMRI brain scanner takes a reading of the level of metabolic activity at thousands of locations around your brain every couple of seconds. Activity in a number of brain areas tends to be elevated when we lie, perhaps because we have to work harder to invent and keep track of the extra information involved in a lie, and override the default responses in the rest of the brain. Under laboratory conditions, accuracy at distinguishing truth from lie approaches 100%.

The modern machine learning algorithms used to make sense of the richer neural data are more sophisticated than those used in a polygraph. And they’re measuring your brain activity (albeit indirectly), so it might feel as though there’s no way to deceive them. But ultimately, they work in an analogous way to the polygraph, by comparing your neural response to the important questions with your neural response to the baseline questions. That means that they can be gamed in an analogous way – as you’re being asked the baseline questions, wiggle your head, take a deep breath, do some simple arithmetic or tell a lie in your head. Each of these will elevate the neural response artificially. By disrupting the baseline response, you disrupt the comparison.
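Here is a toy sketch of why disrupting the baseline disrupts the comparison. It is not how any real detector works: the single "response" number per question, the z-score rule, the cutoff of 2 and all the values are invented for illustration.

```python
import numpy as np

def looks_like_a_lie(truth_baseline, critical_response, cutoff=2.0):
    """Flag the critical response if it sits far above the truthful baseline."""
    baseline = np.asarray(truth_baseline, dtype=float)
    z_score = (critical_response - baseline.mean()) / baseline.std(ddof=1)
    return bool(z_score > cutoff)

# Hypothetical response to the important question, answered with a lie.
critical_lie = 2.0

# Truthful baseline answers given honestly, with no countermeasures.
honest_baseline = [1.0, 1.2, 0.9, 1.1]

# The same truthful answers, but with the response artificially elevated
# (thumb-tack, deep breath, mental arithmetic) while answering.
gamed_baseline = [1.9, 2.2, 1.8, 2.1]

flagged_without_countermeasure = looks_like_a_lie(honest_baseline, critical_lie)
flagged_with_countermeasure = looks_like_a_lie(gamed_baseline, critical_lie)
```

With the honest baseline the lie stands out clearly and is flagged; with the inflated baseline the same response looks unremarkable and is not.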

Possible flaws in this argument

This simplified account of how to cheat an fMRI lie detector has some issues.

Firstly, it rests on the idea that we’ll still use some kind of comparison between baseline and important questions. In the case of most recent fMRI analyses, this is certainly true. Although they use modern machine learning classification algorithms to compare against baseline, they still seem subject to the same problems as the simpler statistical tests used in polygraphs.

Above, I suggested taking a deep breath, doing simple arithmetic or telling a lie in your head during the baseline questions. Taking a deep breath increases the BOLD response measured by fMRI throughout your brain. The idea behind doing arithmetic or telling a lie in your head is to engage the brain areas whose activity changes when lying: those involved in internal mental conflict detection (between areas of the brain that are pulling in different directions), executive control (over the rest of your brain), and working memory. As far as I know, all of the studies on lie detection seem to use naive participants, and no one has yet tested the efficacy of these counter-measures.

I have also assumed that the analysis would be run ‘within subject’. In other words, the machine learning classifier algorithms would be making a comparison between baseline and important questions for the *same person*. However, there have been attempts to train the algorithms on a corpus of data from multiple participants beforehand, and then apply them to a new brain. This approach is considerably and inherently less accurate (less than 90% as opposed to nearly 100%) since everyone’s brain is different, and since brain activity will probably vary for different kinds of lies. Indeed, there appears to be variability in the areas that have been identified by different experiments.

There are alternative experimental paradigms to the basic questioning approach described here. For instance, one might show someone the scene of a crime, and look to see whether their brain registers familiarity. I haven’t looked into this approach. But fundamentally, this familiarity assessment is much more limited in the kinds of questions that can be asked, and furthermore, you only get one chance to assess someone’s familiarity (after which the stimulus is, by definition, familiar). That single response simply might not be enough data to go on.

All of the studies so far have employed ‘willing’ participants. In other words, the participants kept their heads still, told the truth when they were asked to, and lied when they were asked to. An uncooperative participant might move around more (blurring the image), show generally elevated levels of arousal that could skew their data, be in worse mental or physical condition, and come from a different population than the predominantly white, young, relaxed, intelligent and willing undergraduate participants. We don’t know how these factors change things, and it’s difficult to see how we might collect reliable experimental data to better understand them.

I haven’t considered alternative imaging methodologies here (such as EEG or infrared imaging). Mostly though, fMRI appears to be leading the field in terms of accuracy and effort spent, and all of these arguments should apply to EEG and other methods equally.

Why am I writing this?

There are a number of fMRI-based lie detection startups attracting government funding and attempting to charge for their services. I don’t begrudge them their entrepreneurial ambition, but I am dismayed by their hyperbolic avowals of success.

In truth, this is a new, mostly unproven technology that seems to work fairly well in laboratory conditions. But it’s subject to the same sensitivity/specificity tradeoffs that plague medical tests and traditional lie detection technologies. The allure of an ostensibly direct window into the mind with the shiny veneer of scientific infallibility is a beguiling combination.
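For anyone unfamiliar with the sensitivity/specificity tradeoff, here is a small sketch with invented detector scores (higher meaning "more lie-like"). Wherever you set the threshold, catching more lies means accusing more truth-tellers, and vice versa.

```python
import numpy as np

# Hypothetical detector scores; these numbers are made up for illustration.
truthful_scores = np.array([0.1, 0.2, 0.3, 0.4, 0.6])
lying_scores = np.array([0.35, 0.5, 0.7, 0.8, 0.9])

def sensitivity_and_specificity(threshold):
    """Anything scoring above the threshold is called a lie."""
    sensitivity = np.mean(lying_scores > threshold)      # lies correctly caught
    specificity = np.mean(truthful_scores <= threshold)  # truths correctly cleared
    return float(sensitivity), float(specificity)

low_threshold = sensitivity_and_specificity(0.25)   # catches every lie, accuses 3 of 5 truth-tellers
high_threshold = sensitivity_and_specificity(0.65)  # clears every truth-teller, misses 2 of 5 lies
```

Because the two sets of scores overlap, no threshold gets both numbers to 100%.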

Eventually, the limitations of this technology will be realized. I’d prefer to see this techno-myth punctured and caution exercised now, rather than after costly mistakes have been made. Cheeringly, the courts appear to take the same view (at least so far).

My credentials

I’m finishing my PhD in the psychology and neuroscience of human forgetting at Princeton. I’ve worked on the application of machine learning methods to fMRI for the last few years, was part of the prize-winning team in the Pittsburgh fMRI mind-reading competition, and led the development of a popular software toolbox for applying these algorithms for scientific analysis. However, I have no expertise in the neuroscience of cognitive control, lie detection or law.

So I apologize if I’m wrong or out of date anywhere here. If so, I’d be glad to see this pointed out and to amend things.

The Pittsburgh EBC competition

Try and picture the scene: you’re in a narrow tube in almost complete darkness, there’s a loud thumping noise surrounding you and you’re watching episodes of the 90s sitcom, ‘Home Improvement’, with Tim The Tool Man Taylor and his family. There’s a panic button in case you feel claustrophobic, but it’s all over in less than an hour. It sounds a little surreal, but that’s what it would have been like to be a subject whose functional magnetic resonance imaging (fMRI) brain data was used in last year’s Pittsburgh Brain Analysis Competition.

After you’ve watched three episodes, kindly folk in glasses and white coats would take you out of the scanner bore, give you a glass of water and then over the next few days, they’d ask you to watch those same three episodes again over and over. On the second viewing, they’d ask you ‘How amused are you?’ every couple of seconds. On the third viewing, they’d keep wanting to know how aroused you are on a moment-by-moment basis. Then, ‘Can you see anyone’s face on the screen?’, ‘Is there music playing?’, ‘Are people speaking?’ and so on, until you’ve watched every moment of every episode thirteen times, each time being asked something different about your experience.

Our job, as a team entering the competition, was to try and understand the mapping between your brain data and the subjective experiences you reported. For two of the episodes, we were given your brain data along with the thirteen numbers for every corresponding moment that described your arousal, amusement, whether there were faces on the screen, music playing, people speaking etc. Our team, comprising psychologists, neuroscientists, physicists and engineers, put together a pipeline of algorithms and techniques to whittle down your brain to just the areas we needed and smooth away as much of the noise and complexity as possible. Think of these first two episodes as the ‘training’ data. Then, we were given only the brain data for the third episode, the ‘test’ episode, from which we had to predict the reported experience ratings.

Our predictions were then correlated with the subjects’ actual reports, and we were given a score. We ended up coming second in the whole competition, and we’re hoping for the top spot in 2007. Much of this effort has had a direct payoff for our day-to-day research. We now routinely incorporate a lot of these machine learning techniques when trying to understand the representations used by different neural systems, and how they relate to behavior.
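The overall shape of the task (learn a mapping on the training episodes, predict on the test episode, score by correlation) can be sketched on simulated data. This is not our actual pipeline: ridge regression is used here only as a simple stand-in, and the sizes, noise level and penalty are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, n_voxels = 300, 150, 40

# Simulated world: one rating (say, amusement) is a noisy weighted sum of voxels.
true_weights = rng.normal(0, 1, n_voxels)

def simulate(n_timepoints):
    brain = rng.normal(0, 1, (n_timepoints, n_voxels))
    rating = brain @ true_weights + rng.normal(0, 3.0, n_timepoints)
    return brain, rating

train_brain, train_rating = simulate(n_train)  # 'training' episodes: brain + ratings
test_brain, test_rating = simulate(n_test)     # 'test' episode: ratings held back

def fit_ridge(x, y, penalty=10.0):
    """Closed-form ridge regression: solve (X'X + penalty * I) w = X'y."""
    n_features = x.shape[1]
    return np.linalg.solve(x.T @ x + penalty * np.eye(n_features), x.T @ y)

weights = fit_ridge(train_brain, train_rating)
predicted_rating = test_brain @ weights

# Score: correlation between predicted and actual ratings on the test episode.
score = np.corrcoef(predicted_rating, test_rating)[0, 1]
```

The held-back test ratings are only ever used for scoring, never for fitting, which is what keeps the score honest.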

Members of the team: David Blei, Eugene Brevdo, Ronald Bryan, Melissa Carroll, Denis Chigirev, Greg Detre, Andrew Engell, Shannon Hughes, Christopher Moore, Ehren Newman, Ken Norman, Vaidehi Natu, Susan Robison, Greg Stephens, Matt Weber, and David Weiss