State databases pose challenges but can yield powerful insights



Photo by George Pak via Pexels

Journalists often report on hospital performance using federal data tools such as Care Compare. Less often do news organizations mine another rich source: administrative records that state health departments collect from hospitals. 

These enormous repositories, which are often cloaked from public view, draw from hospital billing systems and include granular details about inpatient stays, emergency visits and outpatient care.

Sometimes referred to as discharge data, they contain patient demographics, diagnoses, procedures and lengths of stay, as well as fields indicating whether a patient went home, was transferred to another facility or died.

Discharge databases have driven influential news investigations:

  • The Las Vegas Sun revealed hundreds of incidents of patient harm in local hospitals, some of which went unreported to regulators.
  • California Watch, a project of the Center for Investigative Reporting, exposed a hospital company’s questionable billing practices, including admitting an unusually large portion of emergency department patients.
  • USA Today calculated rates of childbirth complications at about 1,000 hospitals in 13 states and identified facilities where women were harmed at disproportionate rates.

Dealing with roadblocks

Alison Young. Photo by Lisa Damico Portraits

These databases have drawbacks: They can be difficult, if not impossible, to obtain, and their size and complexity make them challenging to work with.

States vary in how accessible they make their data, said Liz Lucas, senior training director at Investigative Reporters & Editors and an adjunct professor of data journalism at the University of Missouri School of Journalism. Some reporters “have tried to get discharge data from every state and found it extraordinarily difficult,” she said via email.

In most states, journalists must apply for access and sign data-use agreements that are typically used by scientific researchers, said Alison Young, who worked on the USA Today maternal complications investigation. In some states, the data may be deemed exempt from public records laws, she added.

Agencies in only 14 states provided USA Today with useful data, according to the report. Some states imposed exorbitantly high fees or shielded the identities of individual hospitals, said Young, who is the Curtis B. Hurley Chair in Public Affairs Reporting at the University of Missouri School of Journalism. In some states, hospital associations controlled the data and would not release it, she said. 

Data-use agreements required the reporting team to take strict security measures. These included limiting access to specific USA Today staff and ensuring that no one would use the information to identify specific patients, even as leads for interviews, Young said.

Agreements typically blocked the reporting team from sharing the results of analyses when a small number of patients were affected by a specific type of adverse event. USA Today reporters also couldn’t use the data to analyze aspects of care outside of their initial application topic — maternal morbidity and mortality — without amending the agreements.
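
For newsrooms working under similar terms, a small-cell restriction can be applied in code before any figures leave the analysis environment. The following is a minimal sketch in Python, not the method USA Today used. The function name, the hospital names, the counts and the threshold of 11 are all assumptions for illustration; the actual cutoff, and whether true zeros may be published, depends on each state's agreement.

```python
from typing import Dict, Optional

def suppress_small_cells(
    counts: Dict[str, int], min_cell: int = 11
) -> Dict[str, Optional[int]]:
    """Replace any count below min_cell with None so it is not published.

    min_cell=11 is a hypothetical default; use the threshold written
    into the data-use agreement. Note that this sketch also suppresses
    zeros, which some agreements may allow to be reported.
    """
    return {name: (n if n >= min_cell else None) for name, n in counts.items()}

# Made-up counts of one type of adverse event, by hospital.
events_by_hospital = {"Hospital A": 42, "Hospital B": 3, "Hospital C": 11}
published = suppress_small_cells(events_by_hospital)
```

In this made-up example, Hospital B's count falls below the threshold and is withheld, while the other two counts pass through unchanged.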

The news organization’s data team ran checks to make sure that datasets were complete and consistent, and the reporting team tapped outside experts for advice on how to analyze the data.
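
The article does not describe the specific checks the data team ran. As a rough illustration of what completeness and consistency checks can look like, here is a minimal Python sketch. The field names (hospital_id, year, length_of_stay, disposition) and the sample records are hypothetical; real discharge files use state-specific layouts described in their data dictionaries.

```python
from collections import Counter

# Hypothetical field names; check the state's data dictionary.
REQUIRED_FIELDS = ("hospital_id", "year", "length_of_stay", "disposition")

def check_records(records):
    """Return (row index, problem) pairs for rows that fail basic checks."""
    problems = []
    for i, rec in enumerate(records):
        # Completeness: every required field should be present and non-empty.
        missing = [f for f in REQUIRED_FIELDS if rec.get(f) in (None, "")]
        if missing:
            problems.append((i, "missing: " + ", ".join(missing)))
        # Consistency: a length of stay cannot be negative.
        los = rec.get("length_of_stay")
        if isinstance(los, int) and los < 0:
            problems.append((i, "negative length_of_stay"))
    return problems

def records_per_hospital_year(records):
    """Count rows per (hospital, year) to help spot gaps or sudden drops."""
    return Counter((rec.get("hospital_id"), rec.get("year")) for rec in records)

# Made-up sample rows.
records = [
    {"hospital_id": "H1", "year": 2016, "length_of_stay": 2, "disposition": "home"},
    {"hospital_id": "H1", "year": 2017, "length_of_stay": -1, "disposition": "home"},
    {"hospital_id": "H2", "year": 2016, "length_of_stay": 3, "disposition": ""},
]
problems = check_records(records)
volumes = records_per_hospital_year(records)
```

A hospital whose row count drops sharply from one year to the next, or disappears entirely, is worth a question to the state agency before any analysis begins.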

“I don’t think we had any idea when we started out on that project how challenging it would be,” Young said. Despite the hurdles, she said, the records “allowed us to do things that we could not have done with any other dataset.” 

“This really is a way of watchdogging and benchmarking what is going on inside of hospitals and the kind of care they’re providing, or signals that they’re not providing care,” she added. 

Tips for getting and using data

  • Find out which state agency maintains the data. Some agencies, such as California’s Department of Health Care Access and Information, post datasets online for anyone to download. Others provide data upon request.
  • Be clear about what you’re seeking. Lucas recommended stating that you’re not looking for personally identifiable information, or PII. “Sometimes it’s worth asking for aggregated data rather than patient-level data because you’re more likely to get it, though patient-level data gives you more possibilities for analysis,” she said.
  • Borrow analytical methods from researchers in the field. The USA Today team adopted procedure codes and other indicators used by the CDC to identify severe maternal morbidity. Phillip Reese, a data reporting specialist and associate professor of journalism at California State University-Sacramento, said he duplicated the methodology of a peer-reviewed study and downloaded state emergency room encounter data to document an increase in treatment for dog bites. His reporting appeared in KFF Health News.
  • Consider the Agency for Healthcare Research and Quality’s Healthcare Cost and Utilization Project (HCUP) tool, which aggregates state datasets and reports on national and regional trends. One caveat is that the tool does not identify hospitals. HCUP charges for databases but has helped journalists with specific data requests, Lucas said.
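
The tip about borrowing researchers' methods often comes down to two steps: flag the records that carry any code from a published indicator list, then compute a rate for each hospital. The Python sketch below shows that general pattern only. The codes are placeholders, not the CDC's actual severe maternal morbidity codes, and the field names and sample records are hypothetical.

```python
from collections import defaultdict

# Placeholder codes for illustration only; a real analysis would use
# the code list published with the methodology being replicated.
INDICATOR_CODES = {"CODE_A", "CODE_B"}

def complication_rates(records, indicator_codes=INDICATOR_CODES, per=10_000):
    """Return flagged stays per `per` stays for each hospital."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for rec in records:
        hospital = rec["hospital_id"]
        totals[hospital] += 1
        # A stay is flagged if any of its codes appears in the indicator list.
        if indicator_codes & set(rec["codes"]):
            flagged[hospital] += 1
    return {h: flagged[h] / totals[h] * per for h in totals}

# Made-up sample stays.
stays = [
    {"hospital_id": "H1", "codes": ["CODE_A", "OTHER"]},
    {"hospital_id": "H1", "codes": ["OTHER"]},
    {"hospital_id": "H1", "codes": []},
    {"hospital_id": "H1", "codes": ["OTHER"]},
    {"hospital_id": "H2", "codes": ["OTHER"]},
    {"hospital_id": "H2", "codes": []},
]
rates = complication_rates(stays)
```

Rates built on a handful of stays are unstable, so hospitals with few cases deserve extra caution, and the published methodology may call for exclusions or adjustments that this sketch leaves out.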

The next big thing: all-payer claims databases 

A growing number of states are launching all-payer claims databases (APCDs), centralized repositories that draw information from payers rather than hospitals.

These powerful tools resemble privately held claims databases and allow state policymakers to better pinpoint cost drivers and assess the quality of care within their borders.

They’re even larger and more complicated than hospital discharge databases, which means they would be difficult for journalists to navigate, said Kevin McAvey, a managing director in health consulting at the law firm Manatt, Phelps & Phillips, who advocates for APCDs. “It requires a fair amount of leg work, unless you know exactly what you are looking for,” he said.

Some states, such as Minnesota and Wisconsin, have health data organizations that maintain public use files and query tools and publish reports based on APCD data. Nevertheless, McAvey suggested that news organizations join consumer advocates and researchers who are pushing state governments for even more transparency.

States should use their analytic capabilities to put key information at journalists’ fingertips, McAvey said, adding, “It shouldn’t be incumbent upon reporters to become data analysts.”

Mary Chris Jaklevic

Mary Chris Jaklevic is AHCJ’s health beat leader for patient safety and a former AHCJ board member.