Data journalism for health care reporters
By Robin Berghaus/AHCJ-Texas Health Journalism Fellow
Health data is often stored on private servers, requiring journalists to become sleuths to acquire it. MaryJo Webster wants people to know how to overcome access barriers and, when journalists do get their hands on data, how to sort and interpret it.
The Minnesota Star Tribune data editor led a hands-on workshop at HJ26, so journalists with varied expertise could dive in. (She also provided links to data sets, tip sheets and resources for accessing missing federal data.)
During the workshop, Webster showed journalists how to develop a data mindset. Below are her top tips.
How to begin:
- Write down all your questions to guide you toward what you need to know.
- Find the most detailed data available, ideally at the person level, so you can ask questions such as: Who is getting sick? What is being done?
- Before analyzing the data, contact the person who provided it. You need to know what it includes and what’s missing.
Know data’s limitations:
- Data is good at answering: Who? What? When? and Where? — not Why? For example, if Medicaid enrollment numbers dropped, you must investigate why. Are people dying? Or are people being removed from the rolls?
- Small groups could be excluded. LGBTQ community data, for example, may not be tracked.
Explore data analysis resources:
- Webster’s Data Journalism Academy offers free video tutorials, with demos for Google Sheets and more, on how to sort and filter data, apply formulas and functions, and create pivot tables. Webster also created Data Mindset for Editors, a free course that helps editors conceptualize and oversee the work data journalists do.
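For reporters who prefer a script to a spreadsheet, the same three operations (filter, sort, pivot) can be sketched in plain Python. This is only an illustration, not material from Webster's tutorials: the records, the field names (`county`, `age_group`, `cases`) and the helper functions are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records; every field name and value here is made up.
records = [
    {"county": "A", "age_group": "0-17", "cases": 4},
    {"county": "A", "age_group": "65+", "cases": 9},
    {"county": "B", "age_group": "0-17", "cases": 2},
    {"county": "B", "age_group": "65+", "cases": 7},
    {"county": "A", "age_group": "65+", "cases": 3},
]


def filter_rows(rows, **criteria):
    """Keep only rows whose fields equal every given criterion."""
    return [r for r in rows if all(r[k] == v for k, v in criteria.items())]


def sort_rows(rows, key, descending=False):
    """Return a new list of rows ordered by one field."""
    return sorted(rows, key=lambda r: r[key], reverse=descending)


def pivot_sum(rows, index, column, value):
    """Sum `value` for each (index, column) pair, like a basic pivot table."""
    table = defaultdict(lambda: defaultdict(int))
    for r in rows:
        table[r[index]][r[column]] += r[value]
    # Convert to plain dicts so the result prints and compares cleanly.
    return {row_key: dict(cols) for row_key, cols in table.items()}


county_a = filter_rows(records, county="A")
largest_first = sort_rows(records, "cases", descending=True)
by_county_and_age = pivot_sum(records, "county", "age_group", "cases")
```

Here `by_county_and_age` plays the role of a pivot table with counties as rows and age groups as columns; a spreadsheet does the same thing, just with a point-and-click interface.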
What if data isn’t tracked? Find and collect it:
- Start with a random sample that’s not massive. Webster pointed to “Denied Justice,” a series she produced with Star Tribune reporters, which sought to identify the percentage of sexual assault cases that do not end in conviction.
Because that data wasn’t tracked, they began by analyzing a random sample of all sexual assault cases reported to police across the 20 largest agencies in Minnesota. They read the investigative case files, tracked them through the court system, and built a database, which showed that only 8% of those cases resulted in a conviction.
- No data? That’s a data story, too. As federal agencies began deleting data from government websites following President Trump’s orders, a group of government employees salvaged some data. Journalists can point to historical data to describe what the picture was in the past, and why it’s problematic that certain data is no longer tracked.
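The sampling idea behind a project like “Denied Justice” can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the Star Tribune's actual method: the case IDs, the sample size, the seed and the function names are all hypothetical, and the outcomes below are invented solely to show the arithmetic.

```python
import random


def draw_sample(case_ids, sample_size, seed=2024):
    """Draw a reproducible simple random sample of case IDs.

    A fixed seed (an arbitrary, hypothetical value here) lets a
    colleague re-run the script and get the same sample.
    """
    rng = random.Random(seed)
    return rng.sample(case_ids, sample_size)


def conviction_rate(outcomes):
    """Percentage of sampled cases that ended in a conviction.

    `outcomes` is a list of booleans, one per reviewed case file:
    True if the case resulted in a conviction, False otherwise.
    """
    if not outcomes:
        raise ValueError("No cases were reviewed.")
    return 100 * sum(outcomes) / len(outcomes)


# Hypothetical list of 1,000 case IDs compiled from records requests.
all_case_ids = [f"case-{n:04d}" for n in range(1, 1001)]
sample = draw_sample(all_case_ids, sample_size=25)

# Invented outcomes after reading each sampled file: 2 convictions out of 25.
reviewed = [True] * 2 + [False] * 23
rate = conviction_rate(reviewed)
```

The point of the sketch is the workflow: pick cases at random rather than by convenience, record one outcome per case, and only then compute the percentage.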
Webster concluded the workshop by encouraging journalists to dig in.
“Everyone always asks me, ‘Where should I go to get data?’ There is no one place,” she said. “Think of the internet as a clue that data exists. It’s about doing homework and putting a news reporter hat on and asking questions.”
Robin Berghaus is a freelance writer and producer based in Austin, Texas.









