Finding stories, avoiding pitfalls in new health data #ahcj15

With a trove of new health data now public, from Medicare payments to records tracking relationships between providers and drug companies, making sense of it all to find stories can be overwhelming. At Health Journalism 2015, panelists shared their experiences with health data and gave attendees tips for avoiding potential pitfalls.

Panelists:

  • Ronald Campbell, reporter, California HealthCare Foundation Center for Health Reporting

  • Cheryl Phillips, Hearst professional in residence, Stanford University

  • Eric Sagara, senior data reporter, Reveal/Center for Investigative Reporting

  • Fred Trotter, health care data journalist, The DocGraph Journal 

  • Moderator: Jennifer LaFleur, The Center for Investigative Reporting

Links and data:

R, an open-source statistics software package, is invaluable for covering health. Ron Campbell’s tipsheet, prepared for the NICAR conference in March, offers a step-by-step guide for using R and the ggplot2 package to visualize data. The data used in the exercise is Medicare hospital fines.

American Community Survey: The best data by far about the uninsured is published every September or October in the Census Bureau’s American Community Survey for the preceding year. Download comparable tables from 2013 and (once the data is released) 2014 from FactFinder to see how the Affordable Care Act changed the number of uninsured in your state, county and city. The 1-year data is available for cities and counties of 65,000 or greater population, though Campbell recommends using it only for areas where the population exceeds 100,000. Among the more than 100 tables on health insurance, here are four of the best:

  • S2701 – Health insurance coverage status. Compares insured and uninsured persons in several categories.

  • S2702 – Selected characteristics of the uninsured. Age, sex, race, nativity, education, income and more.

  • B27001 – Health insurance coverage status by sex and age.

  • B27007 – Medicaid/means-tested public coverage by sex by age. For states participating in the Medicaid expansion, a comparison of the 2013 and 2014 tables should be eye-opening.
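
A minimal sketch of that 2013-to-2014 comparison, in Python. The area names and counts below are made up for illustration; real figures would come from the downloaded ACS tables (S2701, for example), whose column layout is not shown here. The function name and row format are assumptions, and the 100,000 floor follows Campbell's recommendation above.

```python
# Hypothetical figures for illustration only; real values come from the ACS tables.
rows = [
    # (area, year, total population, uninsured persons)
    ("County A", 2013, 850_000, 153_000),
    ("County A", 2014, 860_000, 111_800),
    ("County B", 2013, 80_000, 12_000),
    ("County B", 2014, 81_000, 9_720),
]

MIN_POPULATION = 100_000  # Campbell's suggested floor for 1-year estimates


def uninsured_change(rows, min_population=MIN_POPULATION):
    """Return {area: percentage-point change in the uninsured rate, 2013 to 2014}."""
    rates = {}
    populations = {}
    for area, year, population, uninsured in rows:
        rates[(area, year)] = 100.0 * uninsured / population
        populations[(area, year)] = population

    changes = {}
    for area in sorted({area for area, _ in rates}):
        # Skip areas that are missing one of the two years.
        if (area, 2013) not in rates or (area, 2014) not in rates:
            continue
        # Skip areas whose population does not exceed the floor in both years.
        smallest = min(populations[(area, 2013)], populations[(area, 2014)])
        if smallest <= min_population:
            continue
        changes[area] = round(rates[(area, 2014)] - rates[(area, 2013)], 1)
    return changes


changes = uninsured_change(rows)
print(changes)
```

ACS figures are survey estimates published with margins of error, so a small change between years may not be meaningful on its own.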

Data.Medicare.gov: This site hosts dozens of downloadable databases grouped under the names Hospital Compare, Nursing Home Compare, Physician Compare, Home Health Compare, Dialysis Facility Compare and Supplier Directory.

CMS Data Navigator: If you know what you want but just can’t find it on the vast CMS website, look here.

For California reporters:

California Department of Public Health Open Data Portal: This site hosts more than 50 California databases. Topics include infant mortality and birth weight, school immunization, hospital-acquired conditions and smoking prevalence among high school students.

California Office of Statewide Health Planning and Development: OSHPD is simply incomparable. Here are a few of its wonders:

  • California hospital trends, 2008-2012: Lots of charts plus an enormous Excel workbook with 246 columns (!) of data on every hospital.

  • Inpatient mortality rates

  • Medical Service Study Areas: 542 of them, with age, racial makeup and ratios of population to primary care providers and dentists, plus shapefiles for mapping. There are two files; the detailed version includes a list of all census tracts in each MSSA. Experts use MSSAs to identify medically underserved areas. Lots more here:

Healthcare Workforce Development Division: Maps, GIS and Data

Data exploration and visualization tools

CometDocs (free and premium versions) and Tabula are useful for converting data trapped in a PDF into usable formats. If the PDF is largely rows and columns from start to finish, try CometDocs. If it is largely text with smaller tables embedded within, try Tabula.

OpenRefine will help you standardize data.
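
To give a sense of what standardizing involves, here is a rough Python sketch that groups variant spellings of the same name under a normalized key. It is loosely modeled on the key-collision idea behind this kind of cleanup and is not the tool's actual implementation; the hospital names and the function names `fingerprint` and `cluster` are hypothetical.

```python
import string
from collections import defaultdict


def fingerprint(name):
    """Build a normalized key: lowercase, no punctuation, unique tokens sorted."""
    cleaned = name.strip().lower().translate(
        str.maketrans("", "", string.punctuation)
    )
    return " ".join(sorted(set(cleaned.split())))


def cluster(names):
    """Group names that share a key; keep only groups with more than one variant."""
    groups = defaultdict(list)
    for name in names:
        groups[fingerprint(name)].append(name)
    return {key: members for key, members in groups.items() if len(members) > 1}


# Hypothetical names for illustration only.
names = [
    "St. Mary Medical Center",
    "ST MARY MEDICAL CENTER",
    "Medical Center, St. Mary",
    "General Hospital",
]

clusters = cluster(names)
for key, members in clusters.items():
    print(key, "->", members)
```

A key match is only a candidate for merging. Two different hospitals can share a name, so check each group by hand before combining records.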

For analysis, you won’t go wrong with a spreadsheet. For more complex analysis, try SQL with Sequel Pro or Navicat, or SQLite in the Firefox browser.
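
As a small illustration of the kind of question SQL answers quickly, the sketch below uses Python's built-in sqlite3 module to total fines by state. The table name, columns and dollar amounts are invented for the example and do not reflect any real Medicare file.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fines (hospital TEXT, state TEXT, year INTEGER, amount REAL)"
)

# Hypothetical rows for illustration only.
conn.executemany(
    "INSERT INTO fines VALUES (?, ?, ?, ?)",
    [
        ("Hospital A", "CA", 2013, 50000.0),
        ("Hospital A", "CA", 2014, 25000.0),
        ("Hospital B", "CA", 2014, 10000.0),
        ("Hospital C", "TX", 2014, 40000.0),
    ],
)

# Count the distinct hospitals fined and sum the fines in each state.
query = """
    SELECT state, COUNT(DISTINCT hospital) AS hospitals, SUM(amount) AS total
    FROM fines
    GROUP BY state
    ORDER BY total DESC
"""
totals = conn.execute(query).fetchall()
for state, hospitals, total in totals:
    print(state, hospitals, total)

conn.close()
```

The same GROUP BY query should run largely unchanged in the SQL tools mentioned above once the data is loaded into a table.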

For exploration and visualization, try Tableau Public (free; the premium version is free for IRE members), Google Fusion charts and maps, or Highcharts (JavaScript charting software).

Presentations:

Finding stories presentation.pptx (Ron Campbell)

Using visualizations for exploration and presentation (Cheryl Phillips)

Bulletproofing_and_using_data (Eric Sagara)


AHCJ Staff
