Sanity checks as data sidekicks

Abe Gong asked for good examples of ‘data sidekicks’.

I still haven’t got the hang of distilling complex thoughts into 140 characters, and so I was worried my reply might have been compressed into cryptic nonsense.

Here’s what I was trying to say:

Let’s say you’re trying to do a difficult classification on a dataset that has had a lot of preprocessing/transformation, like fMRI brain data. There are a million reasons why things could be going wrong.

(sorry, Tolstoy)

Things could be failing for meaningful reasons, e.g.:

  • the brain doesn’t work the way you think, so you’re analysing the wrong brain regions, or the information is represented differently from how you’ve modelled it
  • there’s signal there, but it’s represented at a finer-grained resolution than you can measure

But the most likely explanation is that you screwed up your preprocessing (mis-imported the data, mis-aligned the labels, mixed up the X-Y-Z dimensions, etc.).

So try an easy classification first, as a sanity check. If you can’t even classify someone staring at a blank screen vs a screen with something on it, the cause is probably a preprocessing mistake like the ones above rather than anything interesting about the brain. Visual input is pretty much the strongest and most widespread signal in the brain: your whole posterior cortex lights up in response to high-salience images (like faces and places).
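To make the idea concrete, here’s a rough sketch of what such a sanity check might look like. Everything in it is made up for illustration: the data is synthetic noise with a strong effect added to one class (standing in for blank screen vs picture), the nearest-centroid classifier is just the simplest thing that fits in a few lines, and names like `make_trials` and `sanity_check` and the 0.9 threshold are hypothetical, not from any real fMRI pipeline. Shuffling the labels is a crude stand-in for mis-aligning them during preprocessing.

```python
import random

def make_trials(n_trials, n_voxels, effect, rng):
    """Synthetic stand-in for preprocessed data (an assumption, not real fMRI).
    Label 1 ("picture") trials get `effect` added to every voxel."""
    X, y = [], []
    for i in range(n_trials):
        label = i % 2  # alternate blank (0) and picture (1)
        X.append([rng.gauss(effect * label, 1.0) for _ in range(n_voxels)])
        y.append(label)
    return X, y

def centroid(rows):
    """Mean pattern across a list of trials."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def holdout_accuracy(X, y):
    """Fit a nearest-centroid classifier on the first half of the trials
    and score it on the second half."""
    half = len(X) // 2
    cents = {
        c: centroid([x for x, t in zip(X[:half], y[:half]) if t == c])
        for c in (0, 1)
    }
    hits = sum(
        min(cents, key=lambda c: sq_dist(x, cents[c])) == t
        for x, t in zip(X[half:], y[half:])
    )
    return hits / (len(X) - half)

def sanity_check(X, y, threshold=0.9):
    """Pass only if the easy contrast is classified well above chance."""
    acc = holdout_accuracy(X, y)
    return acc >= threshold, acc

rng = random.Random(0)
X, y = make_trials(n_trials=200, n_voxels=20, effect=1.0, rng=rng)

# Pipeline intact: the easy contrast should be trivially classifiable.
ok, acc = sanity_check(X, y)

# Simulate a preprocessing bug: labels no longer line up with the trials.
y_broken = y[:]
rng.shuffle(y_broken)
broken_ok, broken_acc = sanity_check(X, y_broken)

print(f"intact labels:   pass={ok}, accuracy={acc:.2f}")
print(f"shuffled labels: pass={broken_ok}, accuracy={broken_acc:.2f}")
```

The point isn’t the classifier. If the easy contrast fails, fix the pipeline before you read anything into the hard one.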

In the time I spent writing this, Abe had already figured out what I meant 🙂