The dangers of data dredging
4 Oct 2018 by Evoluted New Media
As an academic high-flyer comes crashing down, we really should have seen this coming says Russ Swan...
It's not uncommon for a scientific paper to be retracted. It can happen for a variety of reasons, many of which are due to innocent human blunders. Graphs and images misplaced or mislabelled, arithmetical mistakes (we've all hit a wrong key at some point), individuals or institutions misidentified.
What is uncommon is for a number of publications to be retracted simultaneously, and for the retractions to identify a single specific author. This is precisely what happened last month when JAMA (which used to be known as the Journal of the American Medical Association but now seems determined to follow the inexplicable modern trend of disguising its identity and purpose) issued a terse 'media advisory' (don’t get me started on the use of adjectives as nouns).
"JAMA... have retracted six articles that included Brian Wansink, PhD, of Cornell University, Ithaca, New York, as author" it stated. These few words became the killer blow to a once-glittering academic and media career and have highlighted, once again, the hubris that can attach itself to the celebrity scientist.
Each of these six peer-reviewed papers, withdrawn from journals with impact factors as high as 47.66, had two or three authors – but clearly it was Professor Wansink who was in the firing line. Who is he?
p-hacker
I didn't at first recognise his name (despite its anagram potential) but did recall one of his more famous findings: that people will take smaller helpings of food when given smaller plates. This revelation went viral, or possibly bacterial, a couple of years ago when research from Wansink's 'Food and Brand Lab' at Cornell was published. It became a lead article in the inaugural issue of the Journal of the Association for Consumer Research (editor: Brian Wansink).
In a world troubled by a ballooning problem of obesity, Wansink's lab seemed to offer some promising and easy fixes. Eating with forks, using plain crockery, and installing mirrors in the kitchen could all help nudge consumers into better dietary habits. These studies into the behavioural science of eating made him a media darling and America's go-to-guy for commentary on almost anything to do with nutrition.
It wasn't all medals and media opportunities, though. Despite the endowed chair at one of the world's top universities, a library full of high-impact publications, and the ultimate accolade of an Ig Nobel prize, questions had long been raised about his methods and scientific integrity.
Alarm bells rang when Wansink published a blog post in 2016 in which he boasted about instructing a student to excavate data from a past experiment with a null result, to find some sort of conclusion. "This cost us a lot of time and our own money to collect. There's got to be something here we can salvage" he quoted himself as saying. A later addendum sought to clarify that this was not so much p-hacking as a 'deep data dive', or "figuring out why our results don't look as perfect as we want".
Yeah, that sounds like p-hacking.
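For anyone wondering why this matters, here is a rough back-of-envelope sketch in Python. It is not anything Wansink's lab ran: the function name is made up for illustration, and it assumes every test is independent and run on data containing no real effect. It shows how quickly chance alone serves up a 'finding' when a null data set is sliced enough ways.

```python
def chance_of_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `num_tests` tests on pure-noise data
    comes out 'significant' at threshold `alpha`.

    Assumption (for illustration only): the tests are independent, so each
    one has an `alpha` chance of a false alarm regardless of the others.
    """
    if num_tests < 0:
        raise ValueError("num_tests must be non-negative")
    # P(no false alarms at all) = (1 - alpha) ** num_tests
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    # How the odds of a spurious 'result' grow as the dredging goes deeper.
    for n in (1, 5, 20, 60):
        print(f"{n:>2} tests: {chance_of_false_positive(n):.0%}")
```

On those assumptions, twenty different cuts of a null data set give roughly a 64% chance that at least one of them clears the usual p < 0.05 bar by luck alone.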
Pun soup
In fact, Wansink appears to have been quite open about his data manipulation practices. He spoke of 'data torturing' and methods to 'reinvent statistics'. He went so far as to set targets for data values.
Given his work in the area of health and nutrition, foodie puns proliferated as the scandal was unwrapped. Data sets were being cherry-picked, and his habit of squeezing as many publications as possible out of some research was criticised as salami slicing. Adding corrections and addenda to publications, which only incriminated him further, meant he was out of the frying pan, into the fire.
The six papers in the latest reprimand mean that Professor Wansink has the dubious accolade of having 13 papers retracted, one of them twice. Talk about having your cake and eating it.
Four earlier publications, which became known as the pizza papers, were found by a PhD student to contain at least 150 errors. Statistical analysis showed that much of the data was either fraudulent or subject to gross errors.
One commentator suggests the data had not so much been massaged as kneaded. I can’t help thinking it should have been left to rise, or 'prove' – but proving proved impossible because, oops, the original data sets had been lost in a bout of tidiness.
Eventually the mounting evidence became too much for Cornell to swallow, and shortly after the JAMA announcement it found him guilty of scientific misconduct. The former high-flyer has been forced to resign or, if you please, to eat humble pie.
Russ Swan