The federal government is expected Wednesday to release data on the services provided by – and money paid to – 880,000 health professionals who take care of patients in the Medicare Part B program. For 35 years, this data has been off limits to the public – and now it will be publicly available for use by journalists, researchers and others.
While the data offers a huge array of stories, which could take weeks or months to report out, it also has some pitfalls. Here are six things to be aware of before you dig in:
Have a strategy for storing and opening the data. This data set is big: about 10 million rows, from what I hear. Because of that, you won’t be able to analyze it in Microsoft Excel (a worksheet tops out at 1,048,576 rows), and you might not be able to open it in Microsoft Access. You’ll want to load it onto a database server and analyze it with a more powerful tool, such as a SQL database or SPSS. This could well serve as a barrier to entry for smaller news organizations. You may want to partner with an academic institution or another news outlet to analyze the data.
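For a newsroom without a database server, one low-cost way to get started is to stream the file into SQLite, which ships with Python. What follows is a minimal sketch, not a recipe for the actual release: it assumes the data arrives as a comma-delimited text file with a header row, and the function names, the `payments` table name and any column names you pass in are hypothetical placeholders.

```python
import csv
import sqlite3
from itertools import islice


def load_csv_to_sqlite(csv_path, db_path, table="payments", batch_size=50_000):
    """Stream a large delimited file into a SQLite table in batches.

    Only `batch_size` rows are held in memory at a time.
    Returns the number of data rows inserted.
    Assumes every row has the same number of fields as the header.
    """
    conn = sqlite3.connect(db_path)
    total = 0
    try:
        with open(csv_path, newline="", encoding="utf-8") as f:
            reader = csv.reader(f)
            header = next(reader)  # first line holds the column names
            cols = ", ".join(f'"{name}"' for name in header)
            marks = ", ".join("?" for _ in header)
            # Columns are created untyped; values are stored as text.
            conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({cols})')
            insert = f'INSERT INTO "{table}" VALUES ({marks})'
            while True:
                batch = list(islice(reader, batch_size))
                if not batch:
                    break
                conn.executemany(insert, batch)
                total += len(batch)
        conn.commit()
    finally:
        conn.close()
    return total


def top_providers(db_path, id_col, amount_col, table="payments", limit=10):
    """Return (id, total) pairs for the largest summed amounts.

    `id_col` and `amount_col` are whatever the real header calls them;
    the amounts are cast to numbers because they were loaded as text.
    """
    conn = sqlite3.connect(db_path)
    try:
        query = (
            f'SELECT "{id_col}", SUM(CAST("{amount_col}" AS REAL)) AS total '
            f'FROM "{table}" GROUP BY "{id_col}" ORDER BY total DESC LIMIT ?'
        )
        return conn.execute(query, (limit,)).fetchall()
    finally:
        conn.close()
```

Loading in batches keeps memory use flat no matter how many rows the file has, and the resulting database is a single file you can hand to a partner newsroom or open in any SQLite client. If the file turns out to be tab-delimited, passing `delimiter="\t"` to `csv.reader` should be the only change needed.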