P-hacking, self-plagiarism concerns plague news-friendly nutrition lab

Tara Haelle

About Tara Haelle

Tara Haelle (@TaraHaelle) is AHCJ's medical studies core topic leader, guiding journalists through the jargon-filled shorthand of science and research and enabling them to translate the evidence into accurate information.


Some of the most difficult research to make sense of comes from nutrition science. It is difficult, expensive and labor-intensive to conduct randomized controlled trials in nutrition, in part because they require randomizing what people eat and then ensuring they eat what they’re supposed to – no more and no less.

Even when such trials are finished (often at in-patient labs), the populations are usually small and somewhat homogeneous, reducing the generalizability and overall clinical utility of the results. The trials also don’t run long, since few people can spend months in a facility eating a prescribed diet, and short-term trial results about diet rarely offer much insight into long-term outcomes.

Therefore, the vast majority of nutrition science comes from observational studies, which seek associations between exposures and outcomes but can’t show causation. They often sound sexy and are the bane of health and nutrition journalism. Whether it’s blueberries preventing cancer or more news about the impact of Mediterranean diets, there’s no dearth of “advice” reported from epidemiological nutrition studies.

Despite this, the research of Brian Wansink at Cornell University had been given a fair amount of respect in the field, until now. Wansink’s research frequently makes headlines because it’s clever, practical and unexpectedly intuitive. As a source, Wansink is funny, charismatic, playful and quotable. (Disclosure: I have interviewed Wansink several times and used his research and interviews in parts of the book I co-authored with Emily Willingham, “The Informed Parent.”)

His work often looks at how environment — including companions, music, lighting, location, food placement, dishware and utensils — can influence how much food people consume.

For example, one Wansink study looked at what happens when chocolate milk is removed from school lunch lines, simply moved around, or offered alongside a greater proportion of plain milk. Other studies focused on the color of M&Ms, the size of spoons or ice cream bowls, the size of servings on box labels, the amount people eat when watching a movie character eat, what type of food is placed first in a buffet and the effect of pre-slicing fruit. These are all headline-friendly topics, and the Cornell Food and Brand Lab has been so prolific that its output seemed too good to be true.

Unfortunately, that may be the case. A blog post from Wansink in late November raised enough eyebrows among colleagues and beyond that several biostatisticians have dug deeper into his work. What they found isn’t encouraging. The blog post itself appears to describe classic p-hacking, a topic that Christie Aschwanden covered in a story from her award-winning series on understanding medical studies that we previously wrote about.

P-hacking is data mining in the negative sense: crunching the numbers from a study’s data every which way until you find something interesting — and then coming up with a hypothesis that fits. It’s the opposite of what the scientific method should be, but the publish-or-perish nature of academia can tend to encourage it.
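To see why this practice inflates false positives, consider a minimal simulation (my own sketch, not drawn from any of the studies discussed here). Under the null hypothesis — no real effect — every p-value is uniformly distributed between 0 and 1, so a researcher who tests many independent outcomes and reports whichever one clears p < 0.05 will “find something” far more often than 5% of the time:

```python
import random

random.seed(0)  # fixed seed so the simulation is reproducible

def simulate_fishing(n_outcomes=20, alpha=0.05, n_studies=10_000):
    """Fraction of effect-free studies in which at least one of
    n_outcomes comparisons looks 'significant' purely by chance.
    Under the null, each p-value is uniform on [0, 1]."""
    hits = 0
    for _ in range(n_studies):
        pvals = [random.random() for _ in range(n_outcomes)]
        if min(pvals) < alpha:  # a p-hacker reports the best one
            hits += 1
    return hits / n_studies

rate = simulate_fishing()
print(f"Chance of a spurious 'finding': {rate:.0%}")
# Analytic answer: 1 - 0.95**20, roughly 64 percent
```

With 20 outcomes and no real effect anywhere, roughly two out of three such studies will still produce a publishable-looking p-value — which is why corrections for multiple comparisons, and pre-registered hypotheses, matter.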

Worse than p-hacking — itself pretty concerning — are the many reported values, found across multiple studies from the lab, that now appear to be mathematically impossible given the studies’ stated sample sizes. Researchers Tim van der Zee, Jordan Anaya and Nicholas J. L. Brown published an analysis of these in the journal PeerJ. Anaya has since found problems with seven additional papers from the lab.

Wansink made two addenda to his post to address the criticism and discussion about the questioned research. He assured readers that the papers in question were being independently evaluated by an external biostatistician. Errata will be submitted to the journals where the papers appeared, if and as necessary, he said. But his early unwillingness to share the data with others for closer inspection (that appears to have changed per his most recent statement), as well as the sheer number and breadth of errors, do not bode well.

Articles in New York Magazine and Slate provide in-depth overviews of the controversy and Wansink’s response. Retraction Watch spoke with Wansink about the concerns in February.

Although those events happened in early February, an equally concerning discovery came to light this month. Wansink appears to have self-plagiarized his work throughout multiple papers, as Brown outlined in a new blog post. Self-plagiarism, also called text recycling, can be considered academic misconduct. Worse, two different studies – a 2001 study with 153 participants and a 2003 study with 654 participants – had identical results, to the decimal point, for 39 of 45 outcomes. You don’t have to be a mathematician to contemplate the odds of coincidence here.

A New York Magazine follow-up story and this Guardian piece dive into the details of the new revelations. In short, things are going from bad to worse for the lab.

Even before the most recent text recycling revelation, off-the-record sources suggested to me that the situation was more troubling than it appeared and was likely to get worse.

The Wansink controversy offers a remarkable opportunity for journalists to learn more about p-hacking, text recycling, biostatistical errors, academic misconduct and academic integrity in a single case study about a prolific, popular, media-friendly lab at a top-notch university. All the articles linked here — two from New York Magazine, one from Slate and one from The Guardian — plus Wansink’s original post are highly recommended for journalists, especially those who cover nutrition. They are primers for understanding the limitations of nutrition studies.

The fallout from this situation is unclear. Retractions aren’t out of the question if the data are riddled with errors and become irreconcilable with the original conclusions. While it might result in nothing more than several errata published in journals, that seems increasingly unlikely. Either way it’s a story for journalists to monitor, if only to reinforce the weaknesses they should look for in nutrition studies.
