New tip sheet offers detailed guidance for analyzing studies using big data

In another blog post, we’ve covered what to be cautious about when scrutinizing an observational study that draws on a massive database or dataset. And we’ve introduced a new entry in the Data section of the Medical Studies Core Topic that describes the characteristics and considerations of several large datasets researchers frequently use for such studies.

But sometimes you want to get really granular in deconstructing a study. Perhaps it’s an especially surprising study, or one you know will make a big splash — potentially controversial, or just the kind of finding that raises eyebrows and garners a lot of news coverage. Or perhaps you need to dig into a stack of studies for a book you’re researching or writing, or you’re scrutinizing several studies for a long-term investigative project. In any of these cases, you may want to dig deeper into how the study was done and ask detailed, even hard, questions of the authors and outside sources.

If so, there’s a new tip sheet for you: “10 in-depth considerations for observational studies using big data.” The tip sheet draws from a 10-item checklist in the open-access editorial “A Checklist to Elevate the Science of Surgical Database Research,” published in JAMA Surgery and written by the journal’s editor, deputy editor and a surgeon.

The original list of tips was intended to help researchers cover all their bases when using large datasets in a study. The list covers formulating a clear research question and hypothesis, accounting for potential confounders, assessing the completeness of the available data, determining inclusion and exclusion criteria, conducting a thorough review of existing evidence, and other considerations in carrying out an epidemiological study using big data.

This new tip sheet, however, flips each of those list items around to give journalists a sense of how they can examine whether and how the researchers addressed each of those considerations. How did the researchers deal with missing data in their dataset? Why did they choose the inclusion and/or exclusion criteria they chose? Did they adequately cover the existing evidence base in their introduction, or does it seem they could be leaving something out? Did they consider all the major possible confounders, or are there remaining ones that could introduce bias?

Keep in mind that this tip sheet is pretty in-depth. It’s not something you would use to assess every study you write about (if only we had such time!). But if you use it often enough on the studies that especially seem to merit it, you will likely internalize many of these considerations to the point that you recognize gaps or red flags in a study without even trying.
