A post hoc analysis is an analysis of data for a purpose other than the one for which the data were originally collected. In other words, the data have already been collected to answer one question, and a new analysis is conducted after the fact to examine a different one. Often it involves analyzing data from a clinical trial in ways that do not relate to the trial’s original primary endpoints. The researchers themselves may use their own data to look for patterns or to ask a different question than the one they had in mind while collecting the data, or a separate set of researchers may use another group’s data to look for patterns related to a particular research question.
Deeper dive
For example, if a group of researchers conducts a clinical trial to establish the efficacy of a drug for a specific cancer type, the primary objectives will relate to the drug’s efficacy and safety in treating that cancer. However, another group could take the same data set and analyze whether the drug might be effective for a different condition, provided enough people in the study population had that condition and the original researchers collected data on it. Alternatively, the group that ran the original trial could re-analyze its own data against a different primary objective than the one the trial was designed to test. In this latter case, a reporter should ask the researchers why they are re-analyzing their data and why they are using it to look at a different objective than the one they originally set out to investigate.
Any post hoc analysis has the potential to introduce different forms of bias. The biggest concern is that repeated post hoc analyses can become p-hacking: running analysis after analysis on the same data until one of them yields a statistically significant result. Researchers need a good reason to conduct a post hoc analysis, especially if they conduct more than one. If they do run multiple analyses, they should also apply a statistical correction for multiple comparisons, because each additional analysis of the same data increases the chance that a statistically significant finding will turn up by chance alone. Subgroup analyses (and often sensitivity analyses) are very frequently post hoc analyses.
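To make the multiple-comparisons concern concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a description of how any particular study handled the issue: the function names are hypothetical, the p-values are invented for the example, and the first calculation assumes the analyses are independent of one another, which is rarely exactly true when they all draw on the same data set. It shows how quickly the chance of at least one false positive grows with the number of analyses, and two common corrections (Bonferroni and Holm) that researchers might apply.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive across `num_tests` analyses.

    Assumes the analyses are independent and that no real effect exists.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests, alpha=0.05):
    """Per-analysis significance threshold under the Bonferroni correction."""
    return alpha / num_tests


def holm_rejections(p_values, alpha=0.05):
    """Holm step-down correction.

    Returns one boolean per p-value, in the original order:
    True means the finding stays significant after correction.
    """
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    significant = [False] * m
    for rank, idx in enumerate(order):
        # The threshold loosens as smaller p-values pass the test.
        if p_values[idx] <= alpha / (m - rank):
            significant[idx] = True
        else:
            break  # once one fails, all larger p-values fail too
    return significant


# Hypothetical p-values from five post hoc analyses of one data set.
example_p_values = [0.03, 0.004, 0.20, 0.04, 0.60]

fwer_5 = familywise_error_rate(5)    # about 0.23
fwer_20 = familywise_error_rate(20)  # about 0.64
threshold = bonferroni_threshold(len(example_p_values))
bonferroni_result = [p <= threshold for p in example_p_values]
holm_result = holm_rejections(example_p_values)

print(f"Chance of a false positive in 5 analyses: {fwer_5:.0%}")
print(f"Chance of a false positive in 20 analyses: {fwer_20:.0%}")
print("Significant after Bonferroni:", bonferroni_result)
print("Significant after Holm:", holm_result)
```

With these made-up numbers, three of the five analyses fall below the usual 0.05 cutoff when each is viewed in isolation, but only the smallest p-value (0.004) remains significant after either correction. That gap is the one a reporter is probing when asking whether researchers adjusted for multiple analyses.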